de.lmu.ifi.dbs.elki.datasource.parser
Class AbstractParser

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.datasource.parser.AbstractParser
Direct Known Subclasses:
BitVectorLabelParser, NumberDistanceParser, NumberVectorLabelParser, ParameterizationFunctionLabelParser, SimplePolygonParser, SparseBitVectorLabelParser

public abstract class AbstractParser
extends Object

Abstract superclass for all parsers, providing the shared option handling for the column separator and quote character.


Nested Class Summary
static class AbstractParser.Parameterizer
          Parameterization class.
 
Field Summary
static String ATTRIBUTE_CONCATENATION
          A sign to separate attributes.
private  Pattern colSep
          Stores the column separator pattern.
static OptionID COLUMN_SEPARATOR_ID
          OptionID for the column separator parameter (defaults to whitespace as in WHITESPACE_PATTERN).
static String COMMENT
          The comment character.
static String NUMBER_PATTERN
          A pattern matching most numbers that can be parsed using Double.valueOf. Some examples: 1, 1., 1.2, .2, -.2e-03
static String QUOTE_CHAR
          A quote pattern.
static OptionID QUOTE_ID
          OptionID for the quote character parameter (defaults to a double quotation mark as in QUOTE_CHAR).
protected  char quoteChar
          Stores the quotation character.
static String WHITESPACE_PATTERN
          A pattern defining whitespace.
 
Constructor Summary
AbstractParser(Pattern colSep, char quoteChar)
          Constructor.
 
Method Summary
protected abstract  Logging getLogger()
          Get the logger for this class.
protected  List<String> tokenize(String input)
          Tokenize a string.
 String toString()
          Returns a string representation of the object.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

WHITESPACE_PATTERN

public static final String WHITESPACE_PATTERN
A pattern defining whitespace.

See Also:
Constant Field Values

QUOTE_CHAR

public static final String QUOTE_CHAR
A quote pattern.

See Also:
Constant Field Values

NUMBER_PATTERN

public static final String NUMBER_PATTERN
A pattern matching most numbers that can be parsed using Double.valueOf. Some examples: 1, 1., 1.2, .2, -.2e-03

See Also:
Constant Field Values
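
To make the examples above concrete, here is a minimal sketch of a regular expression that accepts them. The expression and the names NumberPatternSketch, NUMBER and looksNumeric are assumptions made for illustration; this is not the actual value of NUMBER_PATTERN, which is listed under Constant Field Values.

```java
import java.util.regex.Pattern;

class NumberPatternSketch {
  // Hypothetical expression, NOT the real AbstractParser.NUMBER_PATTERN:
  // optional sign, then either digits with an optional fraction or a bare
  // fraction such as ".2", then an optional exponent.
  static final Pattern NUMBER =
      Pattern.compile("[+-]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?");

  // True if the whole string matches the sketch pattern.
  static boolean looksNumeric(String s) {
    return NUMBER.matcher(s).matches();
  }

  public static void main(String[] args) {
    // The examples from the field description.
    for (String s : new String[] {"1", "1.", "1.2", ".2", "-.2e-03"}) {
      System.out.println(s + " matches: " + looksNumeric(s)
          + ", parsed: " + Double.valueOf(s));
    }
  }
}
```

A check like this can be used to decide whether a token should be parsed as a number or kept as a label before calling Double.valueOf.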

COLUMN_SEPARATOR_ID

public static final OptionID COLUMN_SEPARATOR_ID
OptionID for the column separator parameter (defaults to whitespace as in WHITESPACE_PATTERN).


QUOTE_ID

public static final OptionID QUOTE_ID
OptionID for the quote character parameter (defaults to a double quotation mark as in QUOTE_CHAR).


colSep

private Pattern colSep
Stores the column separator pattern.


quoteChar

protected char quoteChar
Stores the quotation character.


COMMENT

public static final String COMMENT
The comment character.

See Also:
Constant Field Values

ATTRIBUTE_CONCATENATION

public static final String ATTRIBUTE_CONCATENATION
A sign to separate attributes.

See Also:
Constant Field Values
Constructor Detail

AbstractParser

public AbstractParser(Pattern colSep,
                      char quoteChar)
Constructor.

Parameters:
colSep - Column separator
quoteChar - Quote character
Method Detail

tokenize

protected List<String> tokenize(String input)
Tokenize a string. Works much like colSep.split() except it honors quotation characters.

Parameters:
input - Input string
Returns:
Tokenized string
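
The behaviour described above can be illustrated with a standalone sketch. It is not the ELKI implementation: the class name QuoteAwareTokenizerSketch and the static three-argument signature are assumptions made so the example runs without the library, and the handling of edge cases (quotes are stripped, empty tokens are kept) is a design choice of this sketch only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteAwareTokenizerSketch {
  // Splits input at matches of colSep, except where the match lies inside
  // a region enclosed by quoteChar. Quote characters are dropped from the
  // tokens. Unlike String.split, trailing empty tokens are kept.
  static List<String> tokenize(String input, Pattern colSep, char quoteChar) {
    List<String> tokens = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    Matcher m = colSep.matcher(input);
    boolean inQuote = false;
    int pos = 0;
    while (pos < input.length()) {
      char c = input.charAt(pos);
      if (c == quoteChar) {
        // Toggle quoted state; the quote itself is not part of the token.
        inQuote = !inQuote;
        pos++;
        continue;
      }
      if (!inQuote) {
        // Test whether a separator starts exactly at the current position.
        m.region(pos, input.length());
        if (m.lookingAt() && m.end() > pos) {
          tokens.add(current.toString());
          current.setLength(0);
          pos = m.end();
          continue;
        }
      }
      current.append(c);
      pos++;
    }
    tokens.add(current.toString());
    return tokens;
  }

  public static void main(String[] args) {
    Pattern whitespace = Pattern.compile("\\s+");
    // The quoted "b c" stays one token despite the embedded space.
    System.out.println(tokenize("a \"b c\" d", whitespace, '"')); // prints [a, b c, d]
  }
}
```

With a plain colSep.split() the same input would yield four tokens, with the quoted label broken in two at the space.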

getLogger

protected abstract Logging getLogger()
Get the logger for this class.

Returns:
Logger.

toString

public String toString()
Returns a string representation of the object.

Overrides:
toString in class Object
Returns:
a string representation of the object.

Release 0.4.0 (2011-09-20_1324)