@Title(value="Clustering by label")
@Description(value="Cluster points by a (pre-assigned!) label. For comparing results with a reference clustering.")
public class ByLabelClustering
extends AbstractAlgorithm<Clustering<Model>>
implements ClusteringAlgorithm<Clustering<Model>>
To allow an object to be assigned to multiple clusters, the flag MULTIPLE_ID needs to be set.
TODO: handling of data sets with no labels?

Modifier and Type | Class and Description |
---|---|
static class | ByLabelClustering.Parameterizer - Parameterization class. |

Modifier and Type | Field and Description |
---|---|
private static Logging | LOG - The logger for this class. |
private boolean | multiple - Holds the value of MULTIPLE_ID. |
static OptionID | MULTIPLE_ID - Flag to indicate that multiple cluster assignment is possible. |
static OptionID | NOISE_ID - Pattern to recognize noise clusters by. |
private Pattern | noisepattern - Holds the value of NOISE_ID. |
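
The role of the noise pattern can be illustrated with a minimal, standalone sketch. It is not the ELKI implementation: the class `NoiseLabelSketch` and the method `isNoise` are hypothetical names, and the choice of a substring match via `find()` (rather than a full match) is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of ELKI: shows one way a regular
// expression could flag a cluster label as "noise".
final class NoiseLabelSketch {
  /**
   * Returns true if the label matches the noise pattern.
   * A null pattern disables noise detection (assumption).
   */
  static boolean isNoise(Pattern noisepattern, String label) {
    // Assumption: a substring match (find) is enough to mark a label as noise.
    return noisepattern != null && noisepattern.matcher(label).find();
  }

  public static void main(String[] args) {
    Pattern p = Pattern.compile("noise", Pattern.CASE_INSENSITIVE);
    System.out.println(isNoise(p, "Noise"));     // true
    System.out.println(isNoise(p, "cluster1"));  // false
    System.out.println(isNoise(null, "noise"));  // false
  }
}
```

With a pattern like this, every label group whose label matches would be reported as a noise cluster instead of a regular one.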

Constructor and Description |
---|
ByLabelClustering() - Constructor without parameters. |
ByLabelClustering(boolean multiple, Pattern noisepattern) - Constructor. |

Modifier and Type | Method and Description |
---|---|
private void | assign(HashMap<String,DBIDs> labelMap, String label, DBIDRef id) - Assigns the specified id to the labelMap according to its label. |
TypeInformation[] | getInputTypeRestriction() - Get the input type restriction used for negotiating the data query. |
protected Logging | getLogger() - Get the (STATIC) logger for this class. |
private HashMap<String,DBIDs> | multipleAssignment(Relation<?> data) - Assigns the objects of the database to multiple clusters according to their labels. |
Clustering<Model> | run(Database database) - Runs the algorithm. |
Clustering<Model> | run(Relation<?> relation) - Run the actual clustering algorithm. |
private HashMap<String,DBIDs> | singleAssignment(Relation<?> data) - Assigns the objects of the database to single clusters according to their labels. |

Methods inherited from class AbstractAlgorithm: makeParameterDistanceFunction
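
The difference between single and multiple assignment can be sketched with plain Java collections. This is an illustration under stated assumptions, not the ELKI code: object ids are modeled as list indices instead of DBIDs, the class name `LabelAssignmentSketch` is hypothetical, and splitting a label on whitespace to obtain several cluster names is an assumption about how multiple labels are encoded.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the private helpers listed above.
final class LabelAssignmentSketch {
  /** Adds the id to the group stored under the label, creating the group on first use. */
  static void assign(Map<String, List<Integer>> labelMap, String label, int id) {
    labelMap.computeIfAbsent(label, k -> new ArrayList<>()).add(id);
  }

  /** Each object goes into exactly one group: the one named by its full label. */
  static Map<String, List<Integer>> singleAssignment(List<String> labels) {
    Map<String, List<Integer>> labelMap = new LinkedHashMap<>();
    for (int id = 0; id < labels.size(); id++) {
      assign(labelMap, labels.get(id), id);
    }
    return labelMap;
  }

  /** Each object goes into one group per label part (assumption: parts are whitespace-separated). */
  static Map<String, List<Integer>> multipleAssignment(List<String> labels) {
    Map<String, List<Integer>> labelMap = new LinkedHashMap<>();
    for (int id = 0; id < labels.size(); id++) {
      for (String part : labels.get(id).trim().split("\\s+")) {
        if (!part.isEmpty()) { // skip objects with an empty label
          assign(labelMap, part, id);
        }
      }
    }
    return labelMap;
  }

  public static void main(String[] args) {
    List<String> labels = List.of("a", "b", "a b");
    System.out.println(singleAssignment(labels));   // {a=[0], b=[1], a b=[2]}
    System.out.println(multipleAssignment(labels)); // {a=[0, 2], b=[1, 2]}
  }
}
```

In the single case the label "a b" forms its own group; in the multiple case the same object joins both "a" and "b", so the resulting groups may overlap.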
private static final Logging LOG
public static final OptionID MULTIPLE_ID
public static final OptionID NOISE_ID
private boolean multiple
Holds the value of MULTIPLE_ID.

public ByLabelClustering(boolean multiple, Pattern noisepattern)
Parameters:
multiple - Allow multiple cluster assignments
noisepattern - Noise pattern

public ByLabelClustering()

public Clustering<Model> run(Database database)
Description copied from interface: Algorithm
Specified by: run in interface Algorithm
Specified by: run in interface ClusteringAlgorithm<Clustering<Model>>
Overrides: run in class AbstractAlgorithm<Clustering<Model>>
Parameters:
database - the database to run the algorithm on

public Clustering<Model> run(Relation<?> relation)
Parameters:
relation - The data input we use

private HashMap<String,DBIDs> singleAssignment(Relation<?> data)
Parameters:
data - the database storing the objects

private HashMap<String,DBIDs> multipleAssignment(Relation<?> data)
Parameters:
data - the database storing the objects

private void assign(HashMap<String,DBIDs> labelMap, String label, DBIDRef id)
Parameters:
labelMap - the mapping of label to ids
label - the label of the object to be assigned
id - the id of the object to be assigned

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Specified by: getInputTypeRestriction in interface Algorithm
Specified by: getInputTypeRestriction in class AbstractAlgorithm<Clustering<Model>>

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Specified by: getLogger in class AbstractAlgorithm<Clustering<Model>>