de.lmu.ifi.dbs.elki.algorithm.clustering.trivial
Class ByLabelClustering

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<Clustering<Model>>
      extended by de.lmu.ifi.dbs.elki.algorithm.clustering.trivial.ByLabelClustering
All Implemented Interfaces:
Algorithm, ClusteringAlgorithm<Clustering<Model>>, InspectionUtilFrequentlyScanned, Parameterizable

@Title(value="Clustering by label")
@Description(value="Cluster points by a (pre-assigned!) label. For comparing results with a reference clustering.")
public class ByLabelClustering
extends AbstractAlgorithm<Clustering<Model>>
implements ClusteringAlgorithm<Clustering<Model>>

Pseudo clustering using labels. This "algorithm" puts elements into the same cluster when they agree in their labels. That is, it simply reuses a predefined clustering, and is mostly useful for testing and evaluation (e.g. comparing the result of a real algorithm to a reference result or gold standard). To assign an object to multiple clusters, separate the object's cluster labels by blanks and set the flag MULTIPLE_ID. TODO: handling of data sets with no labels?
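The grouping rule described above can be illustrated with a minimal sketch. It is not the ELKI implementation: plain Integer ids and String labels stand in for ELKI's DBID and label types, and the class name LabelGroupingSketch and the method groupByLabel are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in, not ELKI code: Integer ids and String labels
// replace ELKI's DBID and label relation types (an assumption).
class LabelGroupingSketch {
  /** Groups object ids by their complete label string (single assignment). */
  static Map<String, List<Integer>> groupByLabel(Map<Integer, String> labels) {
    Map<String, List<Integer>> clusters = new LinkedHashMap<>();
    for (Map.Entry<Integer, String> e : labels.entrySet()) {
      // Objects whose labels are equal end up in the same list.
      clusters.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
    }
    return clusters;
  }
}
```

With the labels 1 -> "a", 2 -> "b", 3 -> "a", the sketch yields one group per distinct label: "a" holds ids 1 and 3, and "b" holds id 2. Such a grouping can then serve as the reference clustering against which another algorithm's result is compared.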


Nested Class Summary
static class ByLabelClustering.Parameterizer
          Parameterization class.
 
Field Summary
private static Logging logger
          The logger for this class.
private  boolean multiple
          Holds the value of MULTIPLE_ID.
static OptionID MULTIPLE_ID
          Flag to indicate that multiple cluster assignment is possible.
static OptionID NOISE_ID
          Pattern for recognizing noise clusters by their label.
private  Pattern noisepattern
          Holds the value of NOISE_ID.
 
Constructor Summary
ByLabelClustering()
          Constructor without parameters.
ByLabelClustering(boolean multiple, Pattern noisepattern)
          Constructor.
 
Method Summary
private  void assign(HashMap<String,ModifiableDBIDs> labelMap, String label, DBID id)
          Assigns the specified id to the labelMap according to its label.
 TypeInformation[] getInputTypeRestriction()
          Get the input type restriction used for negotiating the data query.
protected  Logging getLogger()
          Get the (STATIC) logger for this class.
private  HashMap<String,ModifiableDBIDs> multipleAssignment(Relation<?> data)
          Assigns the objects of the database to multiple clusters according to their labels.
 Clustering<Model> run(Database database)
          Runs the algorithm.
 Clustering<Model> run(Relation<?> relation)
          Run the actual clustering algorithm.
private  HashMap<String,ModifiableDBIDs> singleAssignment(Relation<?> data)
          Assigns the objects of the database to single clusters according to their labels.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
makeParameterDistanceFunction
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

logger

private static final Logging logger
The logger for this class.


MULTIPLE_ID

public static final OptionID MULTIPLE_ID
Flag to indicate that multiple cluster assignment is possible. If an assignment to multiple clusters is desired, the labels indicating the clusters need to be separated by blanks.


NOISE_ID

public static final OptionID NOISE_ID
Pattern for recognizing noise clusters by their label.


multiple

private boolean multiple
Holds the value of MULTIPLE_ID.


noisepattern

private Pattern noisepattern
Holds the value of NOISE_ID.
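
Because noisepattern is a java.util.regex.Pattern, a label can be tested against it as in the following sketch. This is an illustration only: the class NoisePatternSketch and the method isNoise are hypothetical, and both the null handling and the use of a partial match (Matcher.find rather than Matcher.matches) are assumptions that this page does not confirm.

```java
import java.util.regex.Pattern;

// Illustrative stand-in, not ELKI code.
class NoisePatternSketch {
  /**
   * Returns true if a cluster label should be treated as noise.
   * Assumptions: a null pattern disables noise detection, and a
   * partial match anywhere in the label is sufficient.
   */
  static boolean isNoise(Pattern noisepattern, String label) {
    return noisepattern != null && noisepattern.matcher(label).find();
  }
}
```

For example, with Pattern.compile("noise", Pattern.CASE_INSENSITIVE) the labels "Noise" and "background-noise" are reported as noise, while "cluster1" is not.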

Constructor Detail

ByLabelClustering

public ByLabelClustering(boolean multiple,
                         Pattern noisepattern)
Constructor.

Parameters:
multiple - Allow multiple cluster assignments
noisepattern - Noise pattern

ByLabelClustering

public ByLabelClustering()
Constructor without parameters.

Method Detail

run

public Clustering<Model> run(Database database)
Description copied from interface: Algorithm
Runs the algorithm.

Specified by:
run in interface Algorithm
Specified by:
run in interface ClusteringAlgorithm<Clustering<Model>>
Overrides:
run in class AbstractAlgorithm<Clustering<Model>>
Parameters:
database - the database to run the algorithm on
Returns:
the Result computed by this algorithm

run

public Clustering<Model> run(Relation<?> relation)
Run the actual clustering algorithm.

Parameters:
relation - The data input we use
Returns:
the clustering result

singleAssignment

private HashMap<String,ModifiableDBIDs> singleAssignment(Relation<?> data)
Assigns the objects of the database to single clusters according to their labels.

Parameters:
data - the database storing the objects
Returns:
a mapping of labels to ids

multipleAssignment

private HashMap<String,ModifiableDBIDs> multipleAssignment(Relation<?> data)
Assigns the objects of the database to multiple clusters according to their labels.

Parameters:
data - the database storing the objects
Returns:
a mapping of labels to ids
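
A minimal sketch of multiple assignment, assuming each object's label string is split on blanks and the object is added once per resulting token. It is not the ELKI implementation: Integer ids replace DBIDs, List<Integer> replaces ModifiableDBIDs, the class name MultipleAssignmentSketch is hypothetical, and skipping empty tokens is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in, not ELKI code.
class MultipleAssignmentSketch {
  /** Maps each blank-separated label token to the ids that carry it. */
  static Map<String, List<Integer>> multipleAssignment(Map<Integer, String> labels) {
    Map<String, List<Integer>> labelMap = new LinkedHashMap<>();
    for (Map.Entry<Integer, String> e : labels.entrySet()) {
      // Each blank-separated token names one cluster for this object.
      for (String part : e.getValue().split(" ")) {
        if (part.isEmpty()) {
          continue; // assumption: ignore empty tokens from repeated blanks
        }
        labelMap.computeIfAbsent(part, k -> new ArrayList<>()).add(e.getKey());
      }
    }
    return labelMap;
  }
}
```

With the labels 1 -> "a b", 2 -> "b", 3 -> "a  c", object 1 appears under both "a" and "b", so the resulting groups overlap: "a" holds ids 1 and 3, "b" holds ids 1 and 2, and "c" holds id 3.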

assign

private void assign(HashMap<String,ModifiableDBIDs> labelMap,
                    String label,
                    DBID id)
Assigns the specified id to the labelMap according to its label.

Parameters:
labelMap - the mapping of label to ids
label - the label of the object to be assigned
id - the id of the object to be assigned
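
One plausible shape for such a helper is the get-or-create idiom sketched below. This is an assumption about the behavior, not the actual ELKI code: List<Integer> stands in for ModifiableDBIDs, Integer for DBID, and the class name AssignSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Illustrative stand-in, not ELKI code.
class AssignSketch {
  /** Adds id to the collection stored under label, creating it on first use. */
  static void assign(HashMap<String, List<Integer>> labelMap, String label, Integer id) {
    List<Integer> ids = labelMap.get(label);
    if (ids == null) {
      // First object seen with this label: start a new id collection.
      ids = new ArrayList<>();
      labelMap.put(label, ids);
    }
    ids.add(id);
  }
}
```

Calling the helper several times with the same label extends one collection, while calling it with the same id under different labels places that id in several collections, which is what multiple assignment requires.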

getInputTypeRestriction

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.

Specified by:
getInputTypeRestriction in interface Algorithm
Specified by:
getInputTypeRestriction in class AbstractAlgorithm<Clustering<Model>>
Returns:
Type restriction

getLogger

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.

Specified by:
getLogger in class AbstractAlgorithm<Clustering<Model>>
Returns:
the static logger

Release 0.4.0 (2011-09-20_1324)