V - the type of NumberVector used

@Description(value="This parser expects data in roughly the same format as the NumberVectorLabelParser,\nexcept that it will enumerate all unique strings to always produce numerical values.\nThis way, it can for example handle files that contain lines like \'y,n,y,y,n,y,n\'.")
public class CategorialDataAsNumberVectorParser<V extends NumberVector<?>> extends NumberVectorLabelParser<V>
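The enumeration described above can be illustrated with a minimal, self-contained sketch. It is not the ELKI implementation: the class `CategoryEnumerator`, its methods, and the starting id of 1 are hypothetical names and values chosen for illustration, and a plain `LinkedHashMap` stands in for the Trove map held in the `unique` field. The sketch assumes that a token is first tried as a number and is enumerated only when that fails.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of ELKI): turns tokens into numeric values. */
final class CategoryEnumerator {
  /** Token to assigned id, in order of first appearance. */
  private final Map<String, Integer> unique = new LinkedHashMap<>();

  /** First id to hand out (an assumed value for this sketch). */
  private final int start;

  CategoryEnumerator(int start) {
    this.start = start;
  }

  /** Parse the token as a number if possible, otherwise enumerate the string. */
  double encode(String token) {
    try {
      return Double.parseDouble(token);
    } catch (NumberFormatException e) {
      Integer id = unique.get(token);
      if (id == null) {
        // New string: assign the next free id and remember it.
        id = start + unique.size();
        unique.put(token, id);
      }
      return id.doubleValue();
    }
  }

  /** Split a line on the given separator regex and encode every token. */
  double[] encodeLine(String line, String separatorRegex) {
    String[] tokens = line.split(separatorRegex);
    double[] out = new double[tokens.length];
    for (int i = 0; i < tokens.length; i++) {
      out[i] = encode(tokens[i].trim());
    }
    return out;
  }
}
```

With a starting id of 1, `new CategoryEnumerator(1).encodeLine("y,n,y,y,n,y,n", ",")` yields `1, 2, 1, 1, 2, 1, 2`: each distinct string keeps the id it received on first appearance, while tokens that already parse as numbers pass through unchanged.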
Modifier and Type | Class and Description |
---|---|
static class | CategorialDataAsNumberVectorParser.Parameterizer<V extends NumberVector<?>>: Parameterization class. |
BundleStreamSource.Event
Modifier and Type | Field and Description |
---|---|
private static Logging | LOG: Logging class. |
(package private) Pattern | nanpattern: Pattern for NaN values. |
(package private) gnu.trove.map.hash.TObjectIntHashMap<String> | unique: For String unification. |
(package private) int | ustart: Base for enumerating unique values. |
attributes, columnnames, curlbl, curvec, factory, haslabels, labelcolumns, labelIndices, labels, lineNumber, maxdim, meta, mindim, nextevent
ATTRIBUTE_CONCATENATION, comment, COMMENT_PATTERN, DEFAULT_SEPARATOR, NUMBER_PATTERN, QUOTE_CHARS, tokenizer
Constructor and Description |
---|
CategorialDataAsNumberVectorParser(NumberVector.Factory<V,?> factory): Constructor with defaults. |
CategorialDataAsNumberVectorParser(Pattern colSep, String quoteChars, Pattern comment, BitSet labelIndices, NumberVector.Factory<V,?> factory): Constructor. |
Modifier and Type | Method and Description |
---|---|
protected Logging | getLogger(): Get the logger for this class. |
BundleStreamSource.Event | nextEvent(): Get the next event. |
protected void | parseLineInternal(String line): Internal method for parsing a single line. |
buildMeta, createDBObject, data, getMeta, getTypeInformation, initStream
parse
lengthWithoutLinefeed, toString
private static final Logging LOG
gnu.trove.map.hash.TObjectIntHashMap<String> unique
int ustart
Pattern nanpattern
public CategorialDataAsNumberVectorParser(NumberVector.Factory<V,?> factory)
factory - Vector factory

public CategorialDataAsNumberVectorParser(Pattern colSep, String quoteChars, Pattern comment, BitSet labelIndices, NumberVector.Factory<V,?> factory)
colSep - Column separator
quoteChars - Quote characters
comment - Comment pattern
labelIndices - Column indexes to treat as labels
factory - Vector factory

public BundleStreamSource.Event nextEvent()
Description copied from interface: BundleStreamSource
Specified by: nextEvent in interface BundleStreamSource
Overrides: nextEvent in class NumberVectorLabelParser<V extends NumberVector<?>>
protected void parseLineInternal(String line)
Description copied from class: NumberVectorLabelParser
Overrides: parseLineInternal in class NumberVectorLabelParser<V extends NumberVector<?>>
line - Line to process

protected Logging getLogger()
Description copied from class: AbstractParser
Overrides: getLogger in class NumberVectorLabelParser<V extends NumberVector<?>>