Interface | Description |
---|---|
DistanceParser<D extends Distance<D>> | A DistanceParser shall provide a DistanceParsingResult by parsing an InputStream. |
LinebasedParser | A parser that can parse a single line. |
Parser | A Parser shall provide a ParsingResult by parsing an InputStream. |

Class | Description |
---|---|
AbstractParser | Abstract superclass for all parsers providing the option handler for handling options. |
AbstractParser.Parameterizer | Parameterization class. |
ArffParser | Parser to load WEKA .arff files into ELKI. |
ArffParser.Parameterizer | Parameterization class. |
BitVectorLabelParser | Provides a parser for parsing one BitVector per line, bits separated by whitespace. |
BitVectorLabelParser.Parameterizer | Parameterization class. |
DistanceParsingResult<D extends Distance<D>> | Provides a list of database objects and labels associated with these objects and a cache of precomputed distances between the database objects. |
DoubleVectorLabelParser | Provides a parser for parsing one point per line, attributes separated by whitespace. |
DoubleVectorLabelParser.Parameterizer | Parameterization class. |
DoubleVectorLabelTransposingParser | Parser reads points transposed. |
DoubleVectorLabelTransposingParser.Parameterizer | Parameterization class. |
FloatVectorLabelParser | Provides a parser for parsing one point per line, attributes separated by whitespace. |
FloatVectorLabelParser.Parameterizer | Parameterization class. |
NumberDistanceParser<D extends NumberDistance<D,N>,N extends Number> | Provides a parser for parsing one distance value per line. |
NumberDistanceParser.Parameterizer<D extends NumberDistance<D,N>,N extends Number> | Parameterization class. |
NumberVectorLabelParser<V extends NumberVector<?,?>> | Provides a parser for parsing one point per line, attributes separated by whitespace. |
NumberVectorLabelParser.Parameterizer<V extends NumberVector<?,?>> | Parameterization class. |
ParameterizationFunctionLabelParser | Provides a parser for parsing one point per line, attributes separated by whitespace. |
ParameterizationFunctionLabelParser.Parameterizer | Parameterization class. |
SimplePolygonParser | Parser to load polygon data (2D and 3D only) from a simple format. |
SimplePolygonParser.Parameterizer | Parameterization class. |
SparseBitVectorLabelParser | Provides a parser for parsing one sparse BitVector per line, where the indices of the one-bits are separated by whitespace. |
SparseBitVectorLabelParser.Parameterizer | Parameterization class. |
SparseFloatVectorLabelParser | Provides a parser for parsing one point per line, attributes separated by whitespace. |
SparseFloatVectorLabelParser.Parameterizer | Parameterization class. |
TermFrequencyParser | A parser to load term frequency data, which essentially are sparse vectors with text keys. |
TermFrequencyParser.Parameterizer | Parameterization class. |

Parsers for different file formats and data types.

The general use case for any parser is to create objects from an InputStream (e.g., by reading a data file). The objects are packed into a MultipleObjectsBundle, which in turn is used by a DatabaseConnection to fill a Database with the corresponding objects.

By default (i.e., if the user does not specify otherwise), any KDDTask uses the StaticArrayDatabase, which in turn uses a FileBasedDatabaseConnection and a DoubleVectorLabelParser to parse the specified data file, creating a StaticArrayDatabase that contains DoubleVector objects.

Thus, the standard procedure for using a data set from a real-valued vector space is to prepare the data set in a file in the format expected by DoubleVectorLabelParser: one point per line, with attributes separated by whitespace.

For an example file that follows these requirements, see exampledata.txt.
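
To make the "one point per line, attributes separated by whitespace" layout concrete, the following standalone Java sketch reads such lines from an InputStream. It deliberately does not use any ELKI classes: the class name `WhitespaceVectorSketch`, the `Row` holder, and the rule that every token not parseable as a double is kept as a label are assumptions made for illustration only, and the actual DoubleVectorLabelParser may handle labels, comments, and separators differently.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not part of ELKI.
public class WhitespaceVectorSketch {

  /** One parsed line: numeric attributes plus any non-numeric tokens. */
  static final class Row {
    final double[] values;
    final List<String> labels;

    Row(double[] values, List<String> labels) {
      this.values = values;
      this.labels = labels;
    }
  }

  /** Reads one point per line; tokens are separated by whitespace. */
  static List<Row> parse(InputStream in) throws IOException {
    List<Row> rows = new ArrayList<>();
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(in, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        line = line.trim();
        if (line.isEmpty()) {
          continue; // skip blank lines
        }
        List<Double> values = new ArrayList<>();
        List<String> labels = new ArrayList<>();
        for (String token : line.split("\\s+")) {
          try {
            values.add(Double.parseDouble(token));
          } catch (NumberFormatException e) {
            // Assumption: anything that is not a number is a label.
            labels.add(token);
          }
        }
        double[] vector = new double[values.size()];
        for (int i = 0; i < vector.length; i++) {
          vector[i] = values.get(i);
        }
        rows.add(new Row(vector, labels));
      }
    }
    return rows;
  }

  public static void main(String[] args) throws IOException {
    // Made-up sample data in the assumed layout.
    String data = "1.0 2.0 3.5 classA\n4.0 5.0 6.5 classB\n";
    List<Row> rows = parse(
        new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8)));
    for (Row row : rows) {
      System.out.println(Arrays.toString(row.values) + " " + row.labels);
    }
  }
}
```

In ELKI itself the parsed objects would end up in a MultipleObjectsBundle rather than a plain list; the sketch only shows the line-oriented reading step.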