@Title(value="CLIQUE: Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications") @Description(value="Grid-based algorithm to identify dense clusters in subspaces of maximum dimensionality.") @Reference(authors="R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan", title="Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications", booktitle="Proc. SIGMOD Conference, Seattle, WA, 1998", url="https://doi.org/10.1145/276304.276314", bibkey="DBLP:conf/sigmod/AgrawalGGR98") public class CLIQUE extends AbstractAlgorithm<Clustering<SubspaceModel>> implements SubspaceClusteringAlgorithm<SubspaceModel>
The implementation consists of two steps:
1. Identification of subspaces that contain clusters
2. Identification of clusters
The third step of the original algorithm (Generation of minimal description for the clusters) is not (yet) implemented.
Note: this is fairly old code, and not well optimized. Do not use this for runtime benchmarking!
Reference:
R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan
Automatic Subspace Clustering of High Dimensional Data for Data Mining
Applications
In Proc. ACM SIGMOD Int. Conf. on Management of Data, Seattle, WA, 1998.
| Modifier and Type | Class and Description |
|---|---|
static class |
CLIQUE.Parameterizer
Parameterization class.
|
| Modifier and Type | Field and Description |
|---|---|
private static Logging |
LOG
The logger for this class.
|
protected boolean |
prune
Pruning flag.
|
protected double |
tau
Density threshold / selectivity.
|
protected int |
xsi
Number of intervals in each dimension.
|
ALGORITHM_ID| Constructor and Description |
|---|
CLIQUE(int xsi,
double tau,
boolean prune)
Constructor.
|
| Modifier and Type | Method and Description |
|---|---|
private double[][] |
computeDiffs(java.util.List<CLIQUESubspace> denseSubspaces,
int[] mi,
int[] mp)
The specified sorted list of dense subspaces is divided into the selected
set I and the pruned set P.
|
private int[][] |
computeMeans(java.util.List<CLIQUESubspace> denseSubspaces)
The specified sorted list of dense subspaces is divided into the selected
set I and the pruned set P.
|
private java.util.List<Pair<Subspace,ModifiableDBIDs>> |
determineClusters(java.util.List<CLIQUESubspace> denseSubspaces)
Determines the clusters in the specified dense subspaces.
|
private java.util.List<CLIQUESubspace> |
findDenseSubspaceCandidates(Relation<? extends NumberVector> database,
java.util.List<CLIQUESubspace> denseSubspaces)
Determines the
k-dimensional dense subspace candidates from the
specified (k-1)-dimensional dense subspaces. |
private java.util.List<CLIQUESubspace> |
findDenseSubspaces(Relation<? extends NumberVector> database,
java.util.List<CLIQUESubspace> denseSubspaces)
Determines the
k-dimensional dense subspaces and performs a pruning
if this option is chosen. |
private java.util.List<CLIQUESubspace> |
findOneDimensionalDenseSubspaceCandidates(Relation<? extends NumberVector> database)
Determines the one-dimensional dense subspace candidates by making a pass
over the database.
|
private java.util.List<CLIQUESubspace> |
findOneDimensionalDenseSubspaces(Relation<? extends NumberVector> database)
Determines the one dimensional dense subspaces and performs a pruning if
this option is chosen.
|
TypeInformation[] |
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
|
protected Logging |
getLogger()
Get the (STATIC) logger for this class.
|
private java.util.Collection<CLIQUEUnit> |
initOneDimensionalUnits(Relation<? extends NumberVector> database)
Initializes and returns the one dimensional units.
|
private static double |
log2OrZero(double x)
Robust log 2, that ignores zero values.
|
private java.util.List<CLIQUESubspace> |
pruneDenseSubspaces(java.util.List<CLIQUESubspace> denseSubspaces)
Performs a MDL-based pruning of the specified dense subspaces as described
in the CLIQUE algorithm.
|
Clustering<SubspaceModel> |
run(Relation<? extends NumberVector> relation)
Performs the CLIQUE algorithm on the given database.
|
private void |
updateMinMax(NumberVector featureVector,
double[] minima,
double[] maxima)
Updates the minima and maxima array according to the specified feature
vector.
|
runclone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitrunprivate static final Logging LOG
protected int xsi
protected double tau
protected boolean prune
public CLIQUE(int xsi,
double tau,
boolean prune)
xsi - Xsi interval numbertau - Tau density thresholdprune - Prune flagpublic Clustering<SubspaceModel> run(Relation<? extends NumberVector> relation)
relation - Data relation to processprivate java.util.List<Pair<Subspace,ModifiableDBIDs>> determineClusters(java.util.List<CLIQUESubspace> denseSubspaces)
denseSubspaces - the dense subspaces in reverse order by their
coverageprivate java.util.List<CLIQUESubspace> findOneDimensionalDenseSubspaces(Relation<? extends NumberVector> database)
database - the database to run the algorithm onprivate java.util.List<CLIQUESubspace> findDenseSubspaces(Relation<? extends NumberVector> database, java.util.List<CLIQUESubspace> denseSubspaces)
k-dimensional dense subspaces and performs a pruning
if this option is chosen.database - the database to run the algorithm ondenseSubspaces - the (k-1)-dimensional dense subspacesk-dimensional dense subspaces sorted in
reverse order by their coverageprivate java.util.Collection<CLIQUEUnit> initOneDimensionalUnits(Relation<? extends NumberVector> database)
database - the database to run the algorithm onprivate void updateMinMax(NumberVector featureVector, double[] minima, double[] maxima)
featureVector - the feature vectorminima - the array of minimamaxima - the array of maximaprivate java.util.List<CLIQUESubspace> findOneDimensionalDenseSubspaceCandidates(Relation<? extends NumberVector> database)
database - the database to run the algorithm onprivate java.util.List<CLIQUESubspace> findDenseSubspaceCandidates(Relation<? extends NumberVector> database, java.util.List<CLIQUESubspace> denseSubspaces)
k-dimensional dense subspace candidates from the
specified (k-1)-dimensional dense subspaces.database - the database to run the algorithm ondenseSubspaces - the (k-1)-dimensional dense subspacesk-dimensional dense subspace candidates
reverse ordered by their coverageprivate java.util.List<CLIQUESubspace> pruneDenseSubspaces(java.util.List<CLIQUESubspace> denseSubspaces)
denseSubspaces - the subspaces to be pruned sorted in reverse order by
their coverageprivate int[][] computeMeans(java.util.List<CLIQUESubspace> denseSubspaces)
denseSubspaces - the dense subspaces in reverse order by their
coverageprivate double[][] computeDiffs(java.util.List<CLIQUESubspace> denseSubspaces, int[] mi, int[] mp)
denseSubspaces - denseSubspaces the dense subspaces in reverse order
by their coveragemi - the mean of the selected sets Imp - the mean of the pruned sets Pprivate static double log2OrZero(double x)
x - Input valuepublic TypeInformation[] getInputTypeRestriction()
AbstractAlgorithmgetInputTypeRestriction in interface AlgorithmgetInputTypeRestriction in class AbstractAlgorithm<Clustering<SubspaceModel>>protected Logging getLogger()
AbstractAlgorithmgetLogger in class AbstractAlgorithm<Clustering<SubspaceModel>>Copyright © 2019 ELKI Development Team. License information.