@Reference(authors="Andreas Z\u00fcfle, Tobias Emrich, Klaus Arthur Schmid, Nikos Mamoulis, Arthur Zimek, Mathias Renz", title="Representative clustering of uncertain data", booktitle="Proc. 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining", url="https://doi.org/10.1145/2623330.2623725", bibkey="DBLP:conf/kdd/ZufleESMZR14")
public class RepresentativeUncertainClustering extends AbstractAlgorithm<Clustering<Model>> implements ClusteringAlgorithm<Clustering<Model>>
This algorithm clusters uncertain data by repeatedly sampling a possible world, then running a traditional clustering algorithm on this sample.
The resulting "possible" clusterings are then clustered themselves, using a clustering similarity measure. This yields a number of representatives for the set of all possible worlds.
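The sample-cluster-compare loop described above can be illustrated with a minimal, self-contained sketch. It does not use the ELKI API, and every name in it (`RepresentativeSketch`, `sampleWorld`, `cluster`, `randIndex`, `medoid`, `representative`) is hypothetical. It assumes uncertain objects are given as a finite list of alternative 1-D positions, uses a fixed threshold as a stand-in for the wrapped clustering algorithm, uses the Rand index as the clustering similarity, and replaces the meta-clustering step with a simple medoid selection that yields one representative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, not the ELKI implementation.
final class RepresentativeSketch {
  /** Draw one possible world: pick one alternative position per uncertain object. */
  static double[] sampleWorld(double[][] alternatives, Random rnd) {
    double[] world = new double[alternatives.length];
    for (int i = 0; i < world.length; i++) {
      world[i] = alternatives[i][rnd.nextInt(alternatives[i].length)];
    }
    return world;
  }

  /** Stand-in for the wrapped clustering algorithm: label points by a fixed threshold. */
  static int[] cluster(double[] world, double threshold) {
    int[] labels = new int[world.length];
    for (int i = 0; i < world.length; i++) {
      labels[i] = world[i] < threshold ? 0 : 1;
    }
    return labels;
  }

  /** Rand index: fraction of object pairs on which two clusterings agree. */
  static double randIndex(int[] a, int[] b) {
    int agree = 0, pairs = 0;
    for (int i = 0; i < a.length; i++) {
      for (int j = i + 1; j < a.length; j++) {
        boolean sameA = a[i] == a[j];
        boolean sameB = b[i] == b[j];
        if (sameA == sameB) {
          agree++;
        }
        pairs++;
      }
    }
    return pairs == 0 ? 1.0 : agree / (double) pairs;
  }

  /** Medoid: index of the clustering with the highest total similarity to all others. */
  static int medoid(List<int[]> clusterings) {
    int best = -1;
    double bestSum = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < clusterings.size(); i++) {
      double sum = 0;
      for (int[] other : clusterings) {
        sum += randIndex(clusterings.get(i), other);
      }
      if (sum > bestSum) { // strict comparison: the first maximum wins on ties
        bestSum = sum;
        best = i;
      }
    }
    return best;
  }

  /** Sample numsamples worlds, cluster each one, and return the medoid labeling. */
  static int[] representative(double[][] alternatives, int numsamples, double threshold, long seed) {
    Random rnd = new Random(seed);
    List<int[]> clusterings = new ArrayList<>();
    for (int s = 0; s < numsamples; s++) {
      clusterings.add(cluster(sampleWorld(alternatives, rnd), threshold));
    }
    return clusterings.get(medoid(clusterings));
  }
}
```

The Rand index compares co-membership of object pairs, so it is unaffected by how cluster labels are numbered in each sample. A real meta-clustering step would typically return several representatives rather than the single medoid chosen here.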
Reference:
Andreas Züfle, Tobias Emrich, Klaus Arthur Schmid, Nikos Mamoulis, Arthur Zimek, Mathias Renz
Representative clustering of uncertain data
In Proc. 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Modifier and Type | Class and Description |
---|---|
static class | RepresentativeUncertainClustering.Parameterizer: Parameterization class. |
static class | RepresentativeUncertainClustering.RepresentativenessEvaluation: Representativeness evaluation result. |
Modifier and Type | Field and Description |
---|---|
protected double | alpha: Alpha parameter for confidence. |
protected ClusteringDistanceSimilarityFunction | distance: Distance function for clusterings. |
protected boolean | keep: Keep all samples (not only the representative results). |
private static Logging | LOG: Initialize a Logger. |
protected ClusteringAlgorithm<?> | metaAlgorithm: The algorithm for meta-clustering. |
protected int | numsamples: How many clusterings shall be made for aggregation. |
protected RandomFactory | random: Random factory for sampling. |
protected ClusteringAlgorithm<?> | samplesAlgorithm: The algorithm to be wrapped and run. |
Inherited fields: ALGORITHM_ID
Constructor and Description |
---|
RepresentativeUncertainClustering(ClusteringDistanceSimilarityFunction distance, ClusteringAlgorithm<?> metaAlgorithm, ClusteringAlgorithm<?> samplesAlgorithm, int numsamples, RandomFactory random, double alpha, boolean keep): Constructor, quite trivial. |
Modifier and Type | Method and Description |
---|---|
private double | computeConfidence(int support, int samples): Estimate the confidence probability of a clustering. |
TypeInformation[] | getInputTypeRestriction(): Get the input type restriction used for negotiating the data query. |
protected Logging | getLogger(): Get the (STATIC) logger for this class. |
Clustering<?> | run(Database database, Relation<? extends UncertainObject> relation): This run method will do the wrapping. |
protected Clustering<?> | runClusteringAlgorithm(ResultHierarchy hierarchy, Result parent, DBIDs ids, DataStore<DoubleVector> store, int dim, java.lang.String title): Run a clustering algorithm on a single instance. |
Inherited methods: run
Methods inherited from java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
private static final Logging LOG
protected ClusteringDistanceSimilarityFunction distance
protected ClusteringAlgorithm<?> metaAlgorithm
protected ClusteringAlgorithm<?> samplesAlgorithm
protected int numsamples
protected RandomFactory random
protected double alpha
protected boolean keep
public RepresentativeUncertainClustering(ClusteringDistanceSimilarityFunction distance, ClusteringAlgorithm<?> metaAlgorithm, ClusteringAlgorithm<?> samplesAlgorithm, int numsamples, RandomFactory random, double alpha, boolean keep)
Parameters:
distance - Distance function for meta clustering
metaAlgorithm - Meta clustering algorithm
samplesAlgorithm - Primary clustering algorithm
numsamples - Number of samples
alpha - Alpha confidence
keep - Keep all samples (not only the representative results).
public Clustering<?> run(Database database, Relation<? extends UncertainObject> relation)
This run method will do the wrapping. It is called from AbstractAlgorithm.run(Database) and performs the call to the algorithm's particular run method, as well as the storing and comparison of the resulting clusterings.
Parameters:
database - Database
relation - Data relation of uncertain objects
private double computeConfidence(int support, int samples)
Parameters:
support - Number of supporting samples
samples - Total samples
protected Clustering<?> runClusteringAlgorithm(ResultHierarchy hierarchy, Result parent, DBIDs ids, DataStore<DoubleVector> store, int dim, java.lang.String title)
Parameters:
parent - Parent result to attach to
ids - Object IDs to process
store - Input data
dim - Dimensionality
title - Title of relation
public TypeInformation[] getInputTypeRestriction()
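Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.
Specified by: getInputTypeRestriction in interface Algorithm
Specified by: getInputTypeRestriction in class AbstractAlgorithm<Clustering<Model>>
protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.
Specified by: getLogger in class AbstractAlgorithm<Clustering<Model>>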
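This page does not state the formula behind computeConfidence(int support, int samples). One plausible reading, sketched here purely as an assumption, is a lower confidence bound on the fraction support / samples using the normal approximation to the binomial. The class name `ConfidenceSketch`, the method name `lowerConfidenceBound`, and the parameter `z` are hypothetical. How the alpha field would map to a quantile `z` is also an assumption and is left to the caller.

```java
// Hypothetical sketch, not the ELKI implementation.
final class ConfidenceSketch {
  /**
   * Lower bound of a normal-approximation confidence interval for the
   * proportion support / samples.
   *
   * @param support number of samples supporting the clustering
   * @param samples total number of samples
   * @param z standard normal quantile for the desired confidence level
   *        (assumption: about 1.645 for a one-sided 95% bound)
   */
  static double lowerConfidenceBound(int support, int samples, double z) {
    if (samples <= 0 || support < 0 || support > samples) {
      throw new IllegalArgumentException("require 0 <= support <= samples and samples > 0");
    }
    double p = support / (double) samples;
    // Standard error of a binomial proportion, scaled by the quantile.
    double halfWidth = z * Math.sqrt(p * (1 - p) / samples);
    // Clamp at zero so the result stays a valid probability.
    return Math.max(0.0, p - halfWidth);
  }
}
```

With `z = 0` the bound reduces to the plain empirical fraction. Larger values of `z` make the estimate more conservative, and the correction shrinks as the number of samples grows.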
Copyright © 2019 ELKI Development Team.