@Title(value="Term frequency parser")
@Description(value="Parse a file containing term frequencies. The expected format is 'label term1 <freq> term2 <freq> ...'. Terms must not contain the separator character!")
public class TermFrequencyParser extends NumberVectorLabelParser<SparseFloatVector>
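The line format in the description can be illustrated with a small standalone sketch. It does not use ELKI: `TermLineSketch`, `ParsedLine`, and `parseLine` are hypothetical names, whitespace is assumed as the separator, and each term is assumed to be followed by its numeric frequency.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TermLineSketch {
  /** Result of parsing one line: a label plus term frequencies (hypothetical type). */
  static final class ParsedLine {
    final String label;
    final Map<String, Float> frequencies = new LinkedHashMap<>();

    ParsedLine(String label) {
      this.label = label;
    }
  }

  /** Assumption: the first token is the label, the rest are term/frequency pairs. */
  static ParsedLine parseLine(String line) {
    String trimmed = line.trim();
    if (trimmed.isEmpty()) {
      throw new IllegalArgumentException("Empty line");
    }
    String[] tokens = trimmed.split("\\s+");
    // A label plus complete pairs always yields an odd number of tokens.
    if (tokens.length % 2 == 0) {
      throw new IllegalArgumentException("Term without frequency in: " + line);
    }
    ParsedLine result = new ParsedLine(tokens[0]);
    for (int i = 1; i < tokens.length; i += 2) {
      // Float.parseFloat throws NumberFormatException for non-numeric tokens.
      result.frequencies.put(tokens[i], Float.parseFloat(tokens[i + 1]));
    }
    return result;
  }

  public static void main(String[] args) {
    ParsedLine p = parseLine("doc1 apple 3 banana 1.5");
    System.out.println(p.label + " " + p.frequencies);
    // prints: doc1 {apple=3.0, banana=1.5}
  }
}
```

A term such as `ice cream` would be split into two tokens under this scheme, so the term/frequency pairing breaks and the sketch rejects the line. This is why terms must not contain the separator character.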
Modifier and Type | Class and Description |
---|---|
static class | TermFrequencyParser.Parameterizer: Parameterization class. |
Modifier and Type | Field and Description |
---|---|
(package private) HashMap<String,Integer> | keymap: Map |
private static Logging | logger: Class logger |
(package private) int | maxdim: Maximum dimension used |
Fields inherited from class NumberVectorLabelParser: LABEL_INDICES_ID, labelIndices
Fields inherited from class AbstractParser: ATTRIBUTE_CONCATENATION, COLUMN_SEPARATOR_ID, COMMENT, NUMBER_PATTERN, QUOTE_CHAR, QUOTE_ID, quoteChar, WHITESPACE_PATTERN
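The `keymap` and `maxdim` fields suggest a term dictionary that hands out a new dimension the first time a term is seen. The following is a minimal sketch of that idea, not ELKI's code: `TermDictionarySketch` and `dimensionOf` are hypothetical names, and starting the numbering at 1 is an assumption.

```java
import java.util.HashMap;

public class TermDictionarySketch {
  /** Maps each term to its dimension index (assumed role of keymap). */
  final HashMap<String, Integer> keymap = new HashMap<>();

  /** Largest dimension index assigned so far (assumed role of maxdim). */
  int maxdim = 0;

  /** Returns the dimension of a term, assigning the next free one on first sight. */
  int dimensionOf(String term) {
    Integer dim = keymap.get(term);
    if (dim == null) {
      // Assumption: dimensions are numbered consecutively, starting at 1.
      dim = ++maxdim;
      keymap.put(term, dim);
    }
    return dim;
  }

  public static void main(String[] args) {
    TermDictionarySketch dict = new TermDictionarySketch();
    System.out.println(dict.dimensionOf("apple"));  // prints: 1
    System.out.println(dict.dimensionOf("banana")); // prints: 2
    System.out.println(dict.dimensionOf("apple"));  // prints: 1
    System.out.println(dict.maxdim);                // prints: 2
  }
}
```

Because the dictionary is shared across lines, the same term lands in the same dimension for every object parsed from the file.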
Constructor and Description |
---|
TermFrequencyParser(Pattern colSep, char quoteChar, BitSet labelIndices): Constructor. |
Modifier and Type | Method and Description |
---|---|
protected SparseFloatVector | createDBObject(List<Double> attributes): Creates a database object of type V. |
protected Logging | getLogger(): Get the logger for this class. |
protected VectorFieldTypeInformation<SparseFloatVector> | getTypeInformation(int dimensionality): Get a prototype object for the given dimensionality. |
MultipleObjectsBundle | parse(InputStream in): Returns a list of the objects parsed from the specified input stream. |
Pair<SparseFloatVector,LabelList> | parseLineInternal(String line): Internal method for parsing a single line. |
Methods inherited from class NumberVectorLabelParser: parseLine
Methods inherited from class AbstractParser: tokenize, toString
private static final Logging logger
int maxdim

protected SparseFloatVector createDBObject(List<Double> attributes)
Description copied from class: NumberVectorLabelParser
Creates a database object of type V.
createDBObject in class NumberVectorLabelParser<SparseFloatVector>
Parameters: attributes - the attributes of the vector to create.

public Pair<SparseFloatVector,LabelList> parseLineInternal(String line)
Description copied from class: NumberVectorLabelParser
Internal method for parsing a single line.
parseLineInternal in class NumberVectorLabelParser<SparseFloatVector>
Parameters: line - Line to process

public MultipleObjectsBundle parse(InputStream in)
Description copied from interface: Parser
Returns a list of the objects parsed from the specified input stream.
parse in interface Parser
parse in class NumberVectorLabelParser<SparseFloatVector>
Parameters: in - the stream to parse objects from

protected VectorFieldTypeInformation<SparseFloatVector> getTypeInformation(int dimensionality)
Description copied from class: NumberVectorLabelParser
Get a prototype object for the given dimensionality.
getTypeInformation in class NumberVectorLabelParser<SparseFloatVector>
Parameters: dimensionality - Dimensionality

protected Logging getLogger()
Description copied from class: AbstractParser
Get the logger for this class.
getLogger in class AbstractParser
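
To show how `parse(InputStream in)` and `getTypeInformation(int dimensionality)` might fit together, here is a standalone sketch that reads a stream line by line and reports the dimensionality only after all lines have been seen. It is an illustration under stated assumptions, not ELKI's implementation: `StreamParseSketch` and its members are hypothetical, whitespace is assumed as the separator, `#` is assumed as the comment prefix, and plain maps stand in for `SparseFloatVector`.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StreamParseSketch {
  final Map<String, Integer> keymap = new HashMap<>();
  int maxdim = 0;
  final List<String> labels = new ArrayList<>();
  /** One sparse row per object: dimension index to frequency. */
  final List<Map<Integer, Float>> rows = new ArrayList<>();

  /** Reads all lines; maxdim is final only once the stream is exhausted. */
  void parse(InputStream in) {
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(in, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        line = line.trim();
        // Assumption: blank lines and lines starting with '#' are skipped.
        if (line.isEmpty() || line.startsWith("#")) {
          continue;
        }
        String[] tokens = line.split("\\s+");
        Map<Integer, Float> row = new TreeMap<>();
        for (int i = 1; i + 1 < tokens.length; i += 2) {
          Integer dim = keymap.get(tokens[i]);
          if (dim == null) {
            dim = ++maxdim;
            keymap.put(tokens[i], dim);
          }
          row.put(dim, Float.parseFloat(tokens[i + 1]));
        }
        labels.add(tokens[0]);
        rows.add(row);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String data = "# comment\ndoc1 apple 3 banana 1\ndoc2 banana 2 cherry 4\n";
    StreamParseSketch s = new StreamParseSketch();
    s.parse(new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8)));
    System.out.println(s.labels + " " + s.rows + " maxdim=" + s.maxdim);
    // prints: [doc1, doc2] [{1=3.0, 2=1.0}, {2=2.0, 3=4.0}] maxdim=3
  }
}
```

The first row uses only dimensions 1 and 2, yet the data set as a whole has three. A parser of this kind therefore has to defer the type information for the vector field until the entire input has been read.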