
V - the type of NumberVector used

public class NumberVectorLabelParser<V extends NumberVector<?>> extends AbstractStreamingParser
Provides a parser for parsing one point per line, attributes separated by whitespace.
Several labels may be given per point. A label must not be parseable as a double. Lines starting with "#" are ignored.
An index can be specified to mark an entry as a class label. This index counts all entries on the line (numeric values and labels alike), starting at 0.
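
The line format described above can be illustrated with a small, self-contained sketch. It does not use the ELKI classes: `ParsedLine`, `LineFormatSketch` and `parseLine` are hypothetical names, skipping blank lines is an assumption, and `Double.parseDouble` stands in for whatever number pattern the real parser applies.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical holder for one parsed line; not an ELKI class. */
final class ParsedLine {
  final double[] values;
  final List<String> labels;

  ParsedLine(double[] values, List<String> labels) {
    this.values = values;
    this.labels = labels;
  }
}

final class LineFormatSketch {
  /**
   * Splits one line on whitespace. Entries at a forced label index, and
   * entries that do not parse as a double, become labels; the rest become
   * vector attributes. Returns null for comment lines and (by assumption)
   * for blank lines.
   */
  static ParsedLine parseLine(String line, BitSet labelIndices) {
    String trimmed = line.trim();
    if (trimmed.isEmpty() || trimmed.startsWith("#")) {
      return null;
    }
    String[] tokens = trimmed.split("\\s+");
    List<Double> values = new ArrayList<>();
    List<String> labels = new ArrayList<>();
    for (int i = 0; i < tokens.length; i++) {
      // The index counts every entry on the line, starting at 0.
      if (labelIndices.get(i)) {
        labels.add(tokens[i]);
        continue;
      }
      try {
        values.add(Double.parseDouble(tokens[i]));
      } catch (NumberFormatException e) {
        labels.add(tokens[i]);
      }
    }
    double[] arr = new double[values.size()];
    for (int i = 0; i < arr.length; i++) {
      arr[i] = values.get(i);
    }
    return new ParsedLine(arr, labels);
  }
}
```

With an empty `BitSet`, a line such as `5.1 3.5 Iris-setosa` yields two attributes and one label. Setting index 0 in the `BitSet` turns the first entry into a label even when it is numeric.
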
| Modifier and Type | Class and Description | 
|---|---|
| static class  | NumberVectorLabelParser.Parameterizer<V extends NumberVector<?>> - Parameterization class. | 
BundleStreamSource.Event

| Modifier and Type | Field and Description | 
|---|---|
| protected List<String> | columnnames - Column names. | 
| protected LabelList | curlbl - Current labels. | 
| protected V | curvec - Current vector. | 
| protected int | dimensionality - Dimensionality reported. | 
| static int | DIMENSIONALITY_UNKNOWN - Constant used for unknown dimensionality (e.g. empty files). | 
| static int | DIMENSIONALITY_VARIABLE - Constant used for records of variable dimensionality (e.g. time series). | 
| protected NumberVector.Factory<V,?> | factory - Vector factory class. | 
| static OptionID | LABEL_INDICES_ID - A comma-separated list of the indices of labels (may be numeric), counting whitespace-separated entries in a line starting with 0. | 
| protected BitSet | labelcolumns - Bitset to indicate which columns are not numeric. | 
| protected BitSet | labelIndices - Keeps the indices of the attributes to be treated as a string label. | 
| protected int | lineNumber - Current line number. | 
| private static Logging | LOG - Logging class. | 
| protected BundleMeta | meta - Metadata. | 
| (package private) BundleStreamSource.Event | nextevent - Event to report next. | 
| private BufferedReader | reader - Buffer reader. | 
| static OptionID | VECTOR_TYPE_ID - Parameter to specify the type of vectors to produce. | 
ATTRIBUTE_CONCATENATION, COLUMN_SEPARATOR_ID, COMMENT, DEFAULT_SEPARATOR, NUMBER_PATTERN, QUOTE_CHAR, QUOTE_ID, quoteChar

| Constructor and Description | 
|---|
| NumberVectorLabelParser(NumberVector.Factory<V,?> factory) - Constructor with defaults. | 
| NumberVectorLabelParser(Pattern colSep, char quoteChar, BitSet labelIndices, NumberVector.Factory<V,?> factory) - Constructor. | 
| Modifier and Type | Method and Description | 
|---|---|
| protected void | buildMeta() - Update the meta element. | 
| protected <A> V | createDBObject(A attributes, NumberArrayAdapter<?,A> adapter) - Creates a database object of type V. | 
| Object | data(int rnum) - Access a particular object and representation. | 
| protected Logging | getLogger() - Get the logger for this class. | 
| BundleMeta | getMeta() - Get the current meta data. | 
| (package private) SimpleTypeInformation<V> | getTypeInformation(int dimensionality) - Get a prototype object for the given dimensionality. | 
| void | initStream(InputStream in) - Init the streaming parser for the given input stream. | 
| BundleStreamSource.Event | nextEvent() - Get the next event. | 
| protected void | parseLineInternal(String line) - Internal method for parsing a single line. | 
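
The methods initStream, nextEvent, getMeta and data together suggest a pull-style event loop. The following is a minimal sketch of such a loop in plain Java, not the ELKI implementation. The class `EventStreamSketch`, its `drain` helper and the enum constants `META_CHANGED`, `NEXT_OBJECT` and `END_OF_STREAM` are assumptions made for illustration; the listing above only names the type BundleStreamSource.Event. The `pending` field plays a role similar to the nextevent field ("event to report next").

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class EventStreamSketch {
  /** Hypothetical event names, assumed for this sketch. */
  enum Event { META_CHANGED, NEXT_OBJECT, END_OF_STREAM }

  private final BufferedReader reader;
  private Event pending = null; // event to report on the next call, if any
  private int columns = -1;     // number of columns last seen; -1 means none yet
  private String[] current;     // entries of the most recently read line

  EventStreamSketch(BufferedReader reader) {
    this.reader = reader;
  }

  /** Advances the stream by one event. */
  Event nextEvent() {
    if (pending != null) {
      Event e = pending;
      pending = null;
      return e;
    }
    try {
      String line;
      while ((line = reader.readLine()) != null) {
        String t = line.trim();
        if (t.isEmpty() || t.startsWith("#")) {
          continue; // skip blank lines (assumption) and comments
        }
        current = t.split("\\s+");
        if (current.length != columns) {
          // Metadata changed: report that first, then the object itself.
          columns = current.length;
          pending = Event.NEXT_OBJECT;
          return Event.META_CHANGED;
        }
        return Event.NEXT_OBJECT;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return Event.END_OF_STREAM;
  }

  /** Entries of the current line; valid after a NEXT_OBJECT event. */
  String[] current() {
    return current;
  }

  /** Collects all events produced for the given text. */
  static List<Event> drain(String text) {
    EventStreamSketch s = new EventStreamSketch(new BufferedReader(new StringReader(text)));
    List<Event> events = new ArrayList<>();
    Event e;
    do {
      e = s.nextEvent();
      events.add(e);
    } while (e != Event.END_OF_STREAM);
    return events;
  }
}
```

A consumer would typically re-read the metadata on each metadata-change event and fetch the current object on each object event, stopping at the end-of-stream event.
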
parse
tokenize, toString

private static final Logging LOG
public static final OptionID LABEL_INDICES_ID
 Key: -parser.labelIndices
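
As an illustration of how a comma-separated index list could be turned into the BitSet that the constructor takes, here is a minimal plain-Java sketch. `LabelIndicesSketch` and `parseLabelIndices` are hypothetical names; the actual option handling in the Parameterizer may differ.

```java
import java.util.BitSet;

final class LabelIndicesSketch {
  /**
   * Converts a value such as "0,4" into a BitSet with bits 0 and 4 set.
   * Indices count whitespace-separated entries in a line, starting with 0.
   */
  static BitSet parseLabelIndices(String value) {
    BitSet indices = new BitSet();
    for (String part : value.split(",")) {
      String s = part.trim();
      if (s.isEmpty()) {
        continue; // tolerate stray commas (a choice made for this sketch)
      }
      int idx = Integer.parseInt(s);
      if (idx < 0) {
        throw new IllegalArgumentException("Negative label index: " + idx);
      }
      indices.set(idx);
    }
    return indices;
  }
}
```
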
 
public static final OptionID VECTOR_TYPE_ID
 Key: -parser.vector-type
 Default: DoubleVector
 
public static final int DIMENSIONALITY_UNKNOWN
public static final int DIMENSIONALITY_VARIABLE
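
One plausible way to use the two constants is sketched below in plain Java: the reported dimensionality starts as unknown, takes the value of the first vector, and switches to variable once a vector of a different length appears. The class `DimensionalityTracker` is hypothetical, and the constant values -1 and -2 are assumptions, since this page does not state them.

```java
final class DimensionalityTracker {
  static final int UNKNOWN = -1;  // assumed value, stands in for DIMENSIONALITY_UNKNOWN
  static final int VARIABLE = -2; // assumed value, stands in for DIMENSIONALITY_VARIABLE

  private int dimensionality = UNKNOWN;

  /**
   * Records the dimensionality of one more vector.
   *
   * @return true if the reported dimensionality changed, which is when
   *         the metadata would need to be rebuilt
   */
  boolean observe(int dim) {
    if (dimensionality == UNKNOWN) {
      dimensionality = dim;
      return true;
    }
    if (dimensionality != VARIABLE && dimensionality != dim) {
      dimensionality = VARIABLE;
      return true;
    }
    return false;
  }

  int get() {
    return dimensionality;
  }
}
```

An empty input never calls `observe`, so the tracker keeps reporting the unknown value, which matches the "empty files" case named in the field summary.
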
protected BitSet labelIndices
protected NumberVector.Factory<V extends NumberVector<?>,?> factory
private BufferedReader reader
protected int lineNumber
protected int dimensionality
protected BundleMeta meta
protected BitSet labelcolumns
protected V extends NumberVector<?> curvec
protected LabelList curlbl
BundleStreamSource.Event nextevent
public NumberVectorLabelParser(NumberVector.Factory<V,?> factory)
factory - Vector factory

public NumberVectorLabelParser(Pattern colSep, char quoteChar, BitSet labelIndices, NumberVector.Factory<V,?> factory)
colSep - Column separator
quoteChar - Quote character
labelIndices - Indices of the columns to treat as labels (these columns may be numeric)
factory - Vector factory

public void initStream(InputStream in)
StreamingParser
in - the stream to parse objects from

public BundleMeta getMeta()
BundleStreamSource

public BundleStreamSource.Event nextEvent()
BundleStreamSource

protected void buildMeta()

public Object data(int rnum)
BundleStreamSource
rnum - Representation number

protected void parseLineInternal(String line)
line - Line to process

protected <A> V createDBObject(A attributes, NumberArrayAdapter<?,A> adapter)
A - attribute type
attributes - the attributes of the vector to create.
adapter - Array adapter

SimpleTypeInformation<V> getTypeInformation(int dimensionality)
dimensionality - Dimensionality

protected Logging getLogger()
AbstractParser
getLogger in class AbstractParser