General Text Parser (GTP)
Features (outputs):
Produces a term-by-document (sparse) matrix stored in CCS format (matrix.hb)
Produces binary output files of vector encodings for k-dimensional space
Produces an ASCII summary file for each GTP execution (parameter settings, timestamps)
CCS (compressed column storage) is the storage scheme used by the Harwell-Boeing (HB) sparse matrix format, so the two names are often used interchangeably.
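
To make the CCS layout concrete, here is a minimal sketch in Python that converts a small dense term-by-document matrix into the three CCS arrays (nonzero values, row indices, column pointers). It is an illustration only, not GTP's own code: the helper name to_ccs, the toy terms, and the counts are all hypothetical. Indices are 0-based here for readability, whereas HB files store them 1-based.

```python
def to_ccs(dense):
    """Convert a dense matrix (list of rows) into CCS arrays.

    Returns (values, row_idx, col_ptr), where column j owns the slice
    values[col_ptr[j]:col_ptr[j + 1]] and row_idx holds the matching rows.
    """
    n_rows = len(dense)
    n_cols = len(dense[0]) if dense else 0
    values, row_idx, col_ptr = [], [], [0]
    for j in range(n_cols):              # walk the matrix column by column
        for i in range(n_rows):
            if dense[i][j] != 0:         # keep nonzero entries only
                values.append(dense[i][j])
                row_idx.append(i)
        col_ptr.append(len(values))      # marks where the next column starts
    return values, row_idx, col_ptr


# Hypothetical toy data: rows are terms, columns are documents.
terms = ["matrix", "parser", "sparse"]
counts = [
    [1, 0, 2],   # "matrix" occurs in documents 0 and 2
    [0, 3, 0],   # "parser" occurs in document 1 only
    [4, 0, 5],   # "sparse" occurs in documents 0 and 2
]

values, row_idx, col_ptr = to_ccs(counts)
print(values)    # [1, 4, 3, 2, 5]
print(row_idx)   # [0, 2, 1, 0, 2]
print(col_ptr)   # [0, 2, 3, 5]
```

Because the entries are grouped by column, all terms of one document sit contiguously in memory, which suits a term-by-document matrix whose columns are documents.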