V - the type of NumberVector used

@Description(value="This parser expects data in roughly the same format as the NumberVectorLabelParser,\nexcept that it will enumerate all unique strings to always produce numerical values.\nThis way, it can for example handle files that contain lines like \'y,n,y,y,n,y,n\'.")
public class CategorialDataAsNumberVectorParser<V extends NumberVector> extends NumberVectorLabelParser<V>
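
The enumeration of unique strings described above can be illustrated with a minimal, self-contained sketch. It does not use ELKI at all: the class name `CategoryEnumerationSketch`, the method `enumerate`, the comma delimiter, and the choice of 1-based ids in order of first appearance are all assumptions made for illustration, not the parser's actual implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of ELKI.
class CategoryEnumerationSketch {
  /**
   * Replaces every token of a comma-separated line by a numeric id.
   * Assumption: ids are 1-based and assigned in order of first appearance.
   */
  static double[] enumerate(String line, Map<String, Integer> ids) {
    String[] tokens = line.split(",");
    double[] out = new double[tokens.length];
    for (int i = 0; i < tokens.length; i++) {
      String token = tokens[i].trim();
      Integer id = ids.get(token);
      if (id == null) {
        // First time this string is seen: assign the next free id.
        id = ids.size() + 1;
        ids.put(token, id);
      }
      out[i] = id;
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Integer> ids = new LinkedHashMap<>();
    // prints [1.0, 2.0, 1.0, 1.0, 2.0, 1.0, 2.0]
    System.out.println(Arrays.toString(enumerate("y,n,y,y,n,y,n", ids)));
    // prints {y=1, n=2}
    System.out.println(ids);
  }
}
```

Passing the same map to later calls keeps the ids stable across lines, so a given string always maps to the same number within one file.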
Modifier and Type | Class and Description |
---|---|
static class | CategorialDataAsNumberVectorParser.Parameterizer<V extends NumberVector> - Parameterization class. |

Inherited nested classes: BundleStreamSource.Event

Modifier and Type | Field and Description |
---|---|
private static Logging | LOG - Logging class. |
(package private) Matcher | nanpattern - Pattern for NaN values. |
(package private) gnu.trove.map.hash.TObjectIntHashMap<String> | unique - For String unification. |
(package private) int | ustart - Base for enumerating unique values. |

Inherited fields:
attributes, columnnames, curlbl, curvec, factory, haslabels, labels, maxdim, meta, mindim, nextevent
reader, tokenizer
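
How the fields nanpattern, unique, and ustart might cooperate when a token is not a plain number can be sketched as follows. This is a standalone approximation, not the ELKI source: the class `TokenFallbackSketch`, the method `parseToken`, the use of `?` as the missing-value marker, and the use of `java.util.HashMap` in place of the Trove `TObjectIntHashMap` are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of ELKI.
class TokenFallbackSketch {
  // Assumption: a lone "?" marks a missing value. The Matcher is created once
  // and reused via reset(), which avoids allocating a new Matcher per token.
  private final Matcher nanpattern = Pattern.compile("\\?").matcher("");

  // String unification: each distinct string maps to exactly one id.
  private final Map<String, Integer> unique = new HashMap<>();

  // Base for enumerating unique values; the first new string gets base + 1.
  private int ustart;

  TokenFallbackSketch(int base) {
    this.ustart = base;
  }

  /** Tries a numeric parse first, then the NaN pattern, then enumeration. */
  double parseToken(String token) {
    try {
      return Double.parseDouble(token);
    } catch (NumberFormatException e) {
      // Not a number: fall through to the categorical handling below.
    }
    if (nanpattern.reset(token).matches()) {
      return Double.NaN;
    }
    Integer id = unique.get(token);
    if (id == null) {
      id = ++ustart;
      unique.put(token, id);
    }
    return id;
  }

  public static void main(String[] args) {
    TokenFallbackSketch parser = new TokenFallbackSketch(0);
    for (String token : new String[] { "3.5", "?", "red", "blue", "red" }) {
      System.out.println(token + " -> " + parser.parseToken(token));
    }
  }
}
```

Trying the numeric parse first means that columns which are already numeric pass through unchanged, and only unparseable tokens are enumerated.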

Constructor and Description |
---|
CategorialDataAsNumberVectorParser(CSVReaderFormat format, long[] labelIndices, NumberVector.Factory<V> factory) - Constructor. |
CategorialDataAsNumberVectorParser(NumberVector.Factory<V> factory) - Constructor with defaults. |

Modifier and Type | Method and Description |
---|---|
protected Logging | getLogger() - Get the logger for this class. |
BundleStreamSource.Event | nextEvent() - Get the next event. |
protected boolean | parseLineInternal() - Internal method for parsing a single line. |

Inherited methods:
buildMeta, cleanup, createVector, data, getMeta, getTypeInformation, initStream, isLabelColumn
asMultipleObjectsBundle, assignDBID, hasDBIDs, parse

private static final Logging LOG
gnu.trove.map.hash.TObjectIntHashMap<String> unique
int ustart
Matcher nanpattern

public CategorialDataAsNumberVectorParser(NumberVector.Factory<V> factory)
factory - Vector factory

public CategorialDataAsNumberVectorParser(CSVReaderFormat format, long[] labelIndices, NumberVector.Factory<V> factory)
format - Input format
labelIndices - Column indexes that are numeric.
factory - Vector factory

public BundleStreamSource.Event nextEvent()
Description copied from interface: BundleStreamSource
Specified by: nextEvent in interface BundleStreamSource
Overrides: nextEvent in class NumberVectorLabelParser<V extends NumberVector>

protected boolean parseLineInternal()
Description copied from class: NumberVectorLabelParser
Overrides: parseLineInternal in class NumberVectorLabelParser<V extends NumberVector>
Returns: true when a valid line was read, false on a label row.

protected Logging getLogger()
Description copied from class: AbstractStreamingParser
Overrides: getLogger in class NumberVectorLabelParser<V extends NumberVector>

Copyright © 2015 ELKI Development Team, Lehr- und Forschungseinheit für Datenbanksysteme, Ludwig-Maximilians-Universität München. License information.