@Reference(authors="M. Chau, R. Cheng, B. Kao, J. Ng", title="Uncertain data mining: An example in clustering location data", booktitle="Proc. 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2006)", url="https://doi.org/10.1007/11731139_24", bibkey="DBLP:conf/pakdd/ChauCKN06")
public class UKMeans extends AbstractAlgorithm<Clustering<KMeansModel>> implements ClusteringAlgorithm<Clustering<KMeansModel>>
Note: this method is essentially superfluous. It was shown to be equivalent
to running regular k-means on the object centroids (see CKMeans
for the reference and an implementation). It is included only for completeness.
Reference:
M. Chau, R. Cheng, B. Kao, J. Ng
Uncertain data mining: An example in clustering location data
Proc. 10th Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD)
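
The equivalence mentioned in the note can be checked numerically. The following is a minimal standalone sketch, not ELKI code: it assumes squared Euclidean distance and equally weighted samples, and the class and method names (`ExpectedDistanceSketch`, `expectedSquaredDistance`, `centroid`, `squaredDistance`) are hypothetical. Under those assumptions, the expected distance from a representative to an uncertain object equals the squared distance to the object's centroid plus a per-object constant, so the nearest representative is the same either way.

```java
import java.util.Arrays;

/** Standalone sketch; not part of ELKI. All names here are hypothetical. */
final class ExpectedDistanceSketch {
  /** Squared Euclidean distance between two points of equal dimension. */
  static double squaredDistance(double[] a, double[] b) {
    double sum = 0.0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      sum += diff * diff;
    }
    return sum;
  }

  /** Mean of the samples, i.e. the centroid of the uncertain object. */
  static double[] centroid(double[][] samples) {
    double[] c = new double[samples[0].length];
    for (double[] s : samples) {
      for (int d = 0; d < c.length; d++) {
        c[d] += s[d];
      }
    }
    for (int d = 0; d < c.length; d++) {
      c[d] /= samples.length;
    }
    return c;
  }

  /** Average squared distance from rep to every sample (equal weights assumed). */
  static double expectedSquaredDistance(double[] rep, double[][] samples) {
    double sum = 0.0;
    for (double[] s : samples) {
      sum += squaredDistance(rep, s);
    }
    return sum / samples.length;
  }

  public static void main(String[] args) {
    double[][] samples = { { 0, 0 }, { 2, 0 }, { 1, 3 } };
    double[] c = centroid(samples);
    // Spread of the samples around their centroid: constant for this object.
    double spread = expectedSquaredDistance(c, samples);
    for (double[] rep : new double[][] { { 0, 0 }, { 4, 1 } }) {
      double expected = expectedSquaredDistance(rep, samples);
      double viaCentroid = squaredDistance(rep, c) + spread;
      System.out.println(Arrays.toString(rep) + ": " + expected + " vs " + viaCentroid);
    }
  }
}
```

Because the spread term does not depend on the representative, ranking representatives by expected distance gives the same order as ranking them by distance to the centroid.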
Modifier and Type | Class and Description |
---|---|
static class | UKMeans.Parameterizer: Parameterization class. |
Modifier and Type | Field and Description |
---|---|
protected int | k: Number of cluster centers to initialize. |
protected static java.lang.String | KEY: Key for statistics logging. |
protected static Logging | LOG: Class logger. |
protected int | maxiter: Maximum number of iterations. |
protected RandomFactory | rnd: Random factory. |
Inherited fields: ALGORITHM_ID
Constructor and Description |
---|
UKMeans(int k, int maxiter, RandomFactory rnd): Constructor. |
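
To illustrate how the constructor parameters `k` and `rnd` might interact, here is a minimal standalone sketch of seeded initialization. It is an assumption for illustration only: this page does not document how UKMeans actually picks its initial means, a plain `long` seed with `java.util.Random` stands in for the `RandomFactory`, and the names `SeedingSketch` and `initialMeans` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Standalone sketch; not ELKI code. All names here are hypothetical. */
final class SeedingSketch {
  /** Picks k distinct points as initial means, reproducibly for a fixed seed. */
  static List<double[]> initialMeans(double[][] centroids, int k, long seed) {
    if (k < 1 || k > centroids.length) {
      throw new IllegalArgumentException("k must be between 1 and the number of objects");
    }
    // Shuffle the object indices with a seeded generator, then take the first k.
    List<Integer> indices = new ArrayList<>();
    for (int i = 0; i < centroids.length; i++) {
      indices.add(i);
    }
    Collections.shuffle(indices, new Random(seed));
    List<double[]> means = new ArrayList<>(k);
    for (int i = 0; i < k; i++) {
      means.add(centroids[indices.get(i)].clone()); // copy so later updates do not alter the data
    }
    return means;
  }

  public static void main(String[] args) {
    double[][] centroids = { { 0, 0 }, { 1, 1 }, { 5, 5 }, { 9, 9 } };
    List<double[]> means = initialMeans(centroids, 2, 42L);
    System.out.println("Initial means chosen: " + means.size());
  }
}
```

Fixing the seed makes a run repeatable, which is the usual reason for passing a random source into a clustering constructor rather than creating one internally.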
Modifier and Type | Method and Description |
---|---|
protected boolean | assignToNearestCluster(Relation<DiscreteUncertainObject> relation, java.util.List<double[]> means, java.util.List<? extends ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, double[] varsum): Assigns each object to the cluster with the nearest mean. |
protected double | getExpectedRepDistance(NumberVector rep, DiscreteUncertainObject uo): Get the expected distance between a vector and an uncertain object. |
TypeInformation[] | getInputTypeRestriction(): Get the input type restriction used for negotiating the data query. |
protected Logging | getLogger(): Get the (static) logger for this class. |
protected void | logVarstat(DoubleStatistic varstat, double[] varsum): Log statistics on the variance sum. |
protected java.util.List<double[]> | means(java.util.List<? extends ModifiableDBIDs> clusters, java.util.List<double[]> means, Relation<DiscreteUncertainObject> database): Returns the mean vectors of the given clusters in the given database. |
Clustering<?> | run(Database database, Relation<DiscreteUncertainObject> relation): Run the clustering. |
protected boolean | updateAssignment(DBIDIter iditer, java.util.List<? extends ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, int newA): Update the cluster assignment. |
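
The methods summarized above suggest an alternating loop: assign each object to its nearest mean, recompute the means, and stop when no assignment changes or the iteration limit is reached. The following standalone sketch shows that pattern on plain arrays. It is not the ELKI implementation: it assumes squared Euclidean distance and equally weighted samples, represents each uncertain object as a `double[][]` of samples, and all names (`UncertainKMeansLoopSketch`, `assignToNearest`, `recomputeMeans`, `cluster`) are hypothetical.

```java
import java.util.Arrays;

/** Standalone sketch; not ELKI code. An object is a double[][] of samples. */
final class UncertainKMeansLoopSketch {
  /** Average squared Euclidean distance from rep to the samples of one object. */
  static double expectedSquaredDistance(double[] rep, double[][] samples) {
    double sum = 0.0;
    for (double[] s : samples) {
      for (int d = 0; d < rep.length; d++) {
        double diff = rep[d] - s[d];
        sum += diff * diff;
      }
    }
    return sum / samples.length;
  }

  /** Assigns each object to its nearest mean; returns true if any assignment changed. */
  static boolean assignToNearest(double[][][] objects, double[][] means, int[] assignment, double[] varsum) {
    Arrays.fill(varsum, 0.0);
    boolean changed = false;
    for (int i = 0; i < objects.length; i++) {
      int best = 0;
      double bestDist = Double.POSITIVE_INFINITY;
      for (int c = 0; c < means.length; c++) {
        double dist = expectedSquaredDistance(means[c], objects[i]);
        if (dist < bestDist) {
          bestDist = dist;
          best = c;
        }
      }
      varsum[best] += bestDist; // per-cluster sum of expected squared distances
      if (assignment[i] != best) {
        assignment[i] = best;
        changed = true;
      }
    }
    return changed;
  }

  /** Averages the sample centroids of each cluster's members; empty clusters keep their old mean. */
  static double[][] recomputeMeans(double[][][] objects, int[] assignment, double[][] oldMeans) {
    int dim = oldMeans[0].length;
    double[][] newMeans = new double[oldMeans.length][dim];
    int[] counts = new int[oldMeans.length];
    for (int i = 0; i < objects.length; i++) {
      int c = assignment[i];
      counts[c]++;
      for (double[] s : objects[i]) {
        for (int d = 0; d < dim; d++) {
          newMeans[c][d] += s[d] / objects[i].length;
        }
      }
    }
    for (int c = 0; c < newMeans.length; c++) {
      for (int d = 0; d < dim; d++) {
        newMeans[c][d] = counts[c] == 0 ? oldMeans[c][d] : newMeans[c][d] / counts[c];
      }
    }
    return newMeans;
  }

  /** Alternates assignment and mean update until nothing changes or maxiter is reached. */
  static int[] cluster(double[][][] objects, double[][] initialMeans, int maxiter) {
    int[] assignment = new int[objects.length];
    Arrays.fill(assignment, -1); // -1 marks "not yet assigned"
    double[][] means = initialMeans;
    double[] varsum = new double[means.length];
    for (int iter = 0; iter < maxiter; iter++) {
      if (!assignToNearest(objects, means, assignment, varsum)) {
        break; // no assignment changed: converged
      }
      means = recomputeMeans(objects, assignment, means);
    }
    return assignment;
  }

  public static void main(String[] args) {
    double[][][] objects = {
        { { 0, 0 }, { 0, 2 } }, { { 1, 0 }, { 1, 2 } },
        { { 10, 10 }, { 12, 10 } }, { { 11, 11 }, { 11, 13 } } };
    double[][] initialMeans = { { 0, 0 }, { 10, 10 } };
    System.out.println(Arrays.toString(cluster(objects, initialMeans, 10)));
  }
}
```

With the toy data in `main`, the two objects near the origin end up in cluster 0 and the other two in cluster 1. The boolean returned by the assignment step doubles as the convergence test, which mirrors the `boolean` return types in the summary table.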
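Methods inherited from AbstractAlgorithm and ClusteringAlgorithm: run
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait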
protected static final Logging LOG
protected static final java.lang.String KEY
protected int k
protected int maxiter
protected RandomFactory rnd
public UKMeans(int k, int maxiter, RandomFactory rnd)
Parameters:
k - Number of clusters
maxiter - Maximum number of iterations
rnd - Random initialization
public Clustering<?> run(Database database, Relation<DiscreteUncertainObject> relation)
Parameters:
database - the Database
relation - the Relation
protected boolean assignToNearestCluster(Relation<DiscreteUncertainObject> relation, java.util.List<double[]> means, java.util.List<? extends ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, double[] varsum)
Parameters:
relation - the database to cluster
means - a list of k means
clusters - cluster assignment
assignment - Current cluster assignment
varsum - Variance sum output
protected boolean updateAssignment(DBIDIter iditer, java.util.List<? extends ModifiableDBIDs> clusters, WritableIntegerDataStore assignment, int newA)
Parameters:
iditer - Object id
clusters - Cluster list
assignment - Assignment storage
newA - New assignment.
Returns:
true if the assignment has changed.
protected double getExpectedRepDistance(NumberVector rep, DiscreteUncertainObject uo)
Parameters:
rep - A vector, e.g. a cluster representative
uo - A discrete uncertain object
protected java.util.List<double[]> means(java.util.List<? extends ModifiableDBIDs> clusters, java.util.List<double[]> means, Relation<DiscreteUncertainObject> database)
Parameters:
clusters - the clusters to compute the means
means - the recent means
database - the database containing the vectors
public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Specified by: getInputTypeRestriction in interface Algorithm
Specified by: getInputTypeRestriction in class AbstractAlgorithm<Clustering<KMeansModel>>
protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Specified by: getLogger in class AbstractAlgorithm<Clustering<KMeansModel>>
protected void logVarstat(DoubleStatistic varstat, double[] varsum)
Parameters:
varstat - Statistics log instance
varsum - Variance sum per cluster
Copyright © 2019 ELKI Development Team. License information.