java.lang.Object
  de.lmu.ifi.dbs.elki.datasource.parser.AbstractParser
    de.lmu.ifi.dbs.elki.datasource.parser.NumberVectorLabelParser<SparseFloatVector>
      de.lmu.ifi.dbs.elki.datasource.parser.SparseFloatVectorLabelParser
@Title(value="Sparse Float Vector Label Parser")
@Description(value="Parser for the following line format:\nA single line provides a single point. Entries are separated by whitespace. The values will be parsed as floats (resulting in a set of SparseFloatVectors). A line is expected in the following format: The first entry of each line is the number of attributes with coordinate value not zero. Subsequent entries are of the form (index, value), where index is the number of the corresponding dimension, and value is the value of the corresponding attribute. Any pair of two subsequent substrings not containing whitespace is tried to be read as int and float. If this fails for the first of the pair (interpreted as index), it will be appended to a label. (Thus, any label must not be parseable as Integer.) If the float component is not parseable, an exception will be thrown. Empty lines and lines beginning with \"#\" will be ignored. Having the file parsed completely, the maximum occurring dimensionality is set as dimensionality to all created SparseFloatVectors.")
public class SparseFloatVectorLabelParser
Provides a parser for parsing one point per line, attributes separated by whitespace.
Several labels may be given per point. A label must not be parseable as double. Lines starting with "#" will be ignored.
A line is expected in the following format: The first entry of each line is the number of attributes with coordinate value not zero. Subsequent entries are of the form (index, value), where index is the number of the corresponding dimension, and value is the value of the corresponding attribute.
An index can be specified to identify an entry to be treated as a class label. This index counts all entries (numeric values and labels alike), starting at 0.
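The line format described above can be illustrated with a small standalone sketch. It does not use ELKI and is not the implementation of this class; the names `SparseLineSketch`, `Parsed`, `parseLine` and `isIgnored` are hypothetical. It assumes only what the description states: the first entry is the number of non-zero attributes, the remaining entries are index/value pairs, a token that cannot be read as an int is kept as a label, and an unparseable value raises an exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Standalone sketch of the sparse line format; not the ELKI implementation. */
public class SparseLineSketch {

  /** Hypothetical result holder for one parsed line. */
  public static final class Parsed {
    public int declaredCount;                                  // first entry of the line
    public final Map<Integer, Float> values = new TreeMap<>(); // index -> value
    public final List<String> labels = new ArrayList<>();      // tokens that are not int indices
  }

  /** Empty lines and lines beginning with "#" are skipped. */
  public static boolean isIgnored(String line) {
    return line.trim().isEmpty() || line.startsWith("#");
  }

  /** Parses one non-ignored line into sparse values and labels. */
  public static Parsed parseLine(String line) {
    String[] tokens = line.trim().split("\\s+");
    Parsed result = new Parsed();
    // The sketch records the declared count but does not validate against it.
    result.declaredCount = Integer.parseInt(tokens[0]);
    int i = 1;
    while (i < tokens.length) {
      int index;
      try {
        index = Integer.parseInt(tokens[i]);
      } catch (NumberFormatException e) {
        // Not an integer index: treat the token as a label.
        result.labels.add(tokens[i]);
        i++;
        continue;
      }
      if (i + 1 >= tokens.length) {
        throw new IllegalArgumentException("Index " + index + " has no value");
      }
      // Float.parseFloat throws NumberFormatException for a non-numeric value.
      result.values.put(index, Float.parseFloat(tokens[i + 1]));
      i += 2;
    }
    return result;
  }

  public static void main(String[] args) {
    Parsed p = parseLine("2 1 0.5 4 2.0 positive");
    System.out.println(p.declaredCount + " " + p.values + " " + p.labels);
  }
}
```

In this sketch the input `2 1 0.5 4 2.0 positive` yields two index/value pairs (dimensions 1 and 4) and one label. Handling of explicitly configured label indices is left out for brevity.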
Nested Class Summary | |
---|---|
static class | SparseFloatVectorLabelParser.Parameterizer - Parameterization class. |
Field Summary | |
---|---|
private int | dimensionality - Holds the dimensionality of the parsed data, which is the maximum occurring index of any attribute. |
private static Logging | logger - Class logger. |
Fields inherited from class de.lmu.ifi.dbs.elki.datasource.parser.NumberVectorLabelParser |
---|
LABEL_INDICES_ID, labelIndices |
Fields inherited from class de.lmu.ifi.dbs.elki.datasource.parser.AbstractParser |
---|
ATTRIBUTE_CONCATENATION, COLUMN_SEPARATOR_ID, COMMENT, NUMBER_PATTERN, QUOTE_CHAR, QUOTE_ID, quoteChar, WHITESPACE_PATTERN |
Constructor Summary | |
---|---|
SparseFloatVectorLabelParser(Pattern colSep, char quoteChar, BitSet labelIndices) - Constructor. |
Method Summary | |
---|---|
SparseFloatVector | createDBObject(List<Double> attributes) - Creates a database object of type V. |
protected Logging | getLogger() - Get the logger for this class. |
protected VectorFieldTypeInformation<SparseFloatVector> | getTypeInformation(int dimensionality) - Get a prototype object for the given dimensionality. |
MultipleObjectsBundle | parse(InputStream in) - Returns a list of the objects parsed from the specified input stream. |
Pair<SparseFloatVector,LabelList> | parseLineInternal(String line) - Internal method for parsing a single line. |
Methods inherited from class de.lmu.ifi.dbs.elki.datasource.parser.NumberVectorLabelParser |
---|
parseLine |
Methods inherited from class de.lmu.ifi.dbs.elki.datasource.parser.AbstractParser |
---|
tokenize, toString |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
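private static final Logging logger
Class logger.
private int dimensionality
Holds the dimensionality of the parsed data, which is the maximum occurring index of any attribute.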
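As a rough illustration of how a dimensionality defined as "the maximum occurring index of any attribute" could be derived, the following standalone sketch scans a list of sparse index-to-value maps. The class and method names (`DimensionalitySketch`, `maxIndex`) are hypothetical, and the sketch makes no claim about how this class tracks the value internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Standalone sketch: dimensionality as the largest index seen in any vector. */
public class DimensionalitySketch {

  /** Returns the maximum index over all sparse vectors, or 0 if there are none. */
  public static int maxIndex(List<Map<Integer, Float>> vectors) {
    int max = 0;
    for (Map<Integer, Float> vector : vectors) {
      for (int index : vector.keySet()) {
        max = Math.max(max, index);
      }
    }
    return max;
  }

  public static void main(String[] args) {
    Map<Integer, Float> first = new TreeMap<>();
    first.put(1, 0.5f);
    first.put(4, 2.0f);
    Map<Integer, Float> second = new TreeMap<>();
    second.put(7, 1.0f);

    List<Map<Integer, Float>> vectors = new ArrayList<>();
    vectors.add(first);
    vectors.add(second);

    // Every vector would then be given this common dimensionality.
    System.out.println(maxIndex(vectors));
  }
}
```

Because the largest index is only known once all lines have been read, a common dimensionality can be assigned to the vectors only after the whole input has been parsed.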
Constructor Detail |
---|
public SparseFloatVectorLabelParser(Pattern colSep, char quoteChar, BitSet labelIndices)
Parameters:
colSep - column separator pattern
quoteChar - quote character
labelIndices - indices of the entries to treat as labels
Method Detail |
---|
public SparseFloatVector createDBObject(List<Double> attributes)
Description copied from class: NumberVectorLabelParser
Creates a database object of type V.
createDBObject in class NumberVectorLabelParser<SparseFloatVector>
Parameters:
attributes - the attributes of the vector to create.
public Pair<SparseFloatVector,LabelList> parseLineInternal(String line)
Description copied from class: NumberVectorLabelParser
Internal method for parsing a single line.
parseLineInternal in class NumberVectorLabelParser<SparseFloatVector>
Parameters:
line - Line to process
public MultipleObjectsBundle parse(InputStream in)
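Description copied from interface: Parser
Returns a list of the objects parsed from the specified input stream.
parse in interface Parser
parse in class NumberVectorLabelParser<SparseFloatVector>
Parameters:
in - the stream to parse objects from
See Also:
NumberVectorLabelParser.parse(java.io.InputStream)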
protected VectorFieldTypeInformation<SparseFloatVector> getTypeInformation(int dimensionality)
Description copied from class: NumberVectorLabelParser
Get a prototype object for the given dimensionality.
getTypeInformation in class NumberVectorLabelParser<SparseFloatVector>
Parameters:
dimensionality - Dimensionality
protected Logging getLogger()
Description copied from class: AbstractParser
Get the logger for this class.
getLogger in class AbstractParser