@Title(value="CLIQUE: Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications") @Description(value="Grid-based algorithm to identify dense clusters in subspaces of maximum dimensionality.") @Reference(authors="R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan", title="Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications", booktitle="Proc. SIGMOD Conference, Seattle, WA, 1998", url="https://doi.org/10.1145/276304.276314", bibkey="DBLP:conf/sigmod/AgrawalGGR98") public class CLIQUE extends AbstractAlgorithm<Clustering<SubspaceModel>> implements SubspaceClusteringAlgorithm<SubspaceModel>
The implementation consists of two steps:
1. Identification of subspaces that contain clusters
2. Identification of clusters
The third step of the original algorithm (Generation of minimal description for the clusters) is not (yet) implemented.
Note: this is fairly old code, and not well optimized. Do not use this for runtime benchmarking!
Reference:
R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan
Automatic Subspace Clustering of High Dimensional Data for Data Mining
Applications
In Proc. ACM SIGMOD Int. Conf. on Management of Data, Seattle, WA, 1998.
Modifier and Type | Class and Description |
---|---|
static class |
CLIQUE.Parameterizer
Parameterization class.
|
Modifier and Type | Field and Description |
---|---|
private static Logging |
LOG
The logger for this class.
|
protected boolean |
prune
Pruning flag.
|
protected double |
tau
Density threshold / selectivity.
|
protected int |
xsi
Number of intervals in each dimension.
|
ALGORITHM_ID
Constructor and Description |
---|
CLIQUE(int xsi,
double tau,
boolean prune)
Constructor.
|
Modifier and Type | Method and Description |
---|---|
private double[][] |
computeDiffs(java.util.List<CLIQUESubspace> denseSubspaces,
int[] mi,
int[] mp)
The specified sorted list of dense subspaces is divided into the selected
set I and the pruned set P.
|
private int[][] |
computeMeans(java.util.List<CLIQUESubspace> denseSubspaces)
The specified sorted list of dense subspaces is divided into the selected
set I and the pruned set P.
|
private java.util.List<Pair<Subspace,ModifiableDBIDs>> |
determineClusters(java.util.List<CLIQUESubspace> denseSubspaces)
Determines the clusters in the specified dense subspaces.
|
private java.util.List<CLIQUESubspace> |
findDenseSubspaceCandidates(Relation<? extends NumberVector> database,
java.util.List<CLIQUESubspace> denseSubspaces)
Determines the
k -dimensional dense subspace candidates from the
specified (k-1) -dimensional dense subspaces. |
private java.util.List<CLIQUESubspace> |
findDenseSubspaces(Relation<? extends NumberVector> database,
java.util.List<CLIQUESubspace> denseSubspaces)
Determines the
k -dimensional dense subspaces and performs a pruning
if this option is chosen. |
private java.util.List<CLIQUESubspace> |
findOneDimensionalDenseSubspaceCandidates(Relation<? extends NumberVector> database)
Determines the one-dimensional dense subspace candidates by making a pass
over the database.
|
private java.util.List<CLIQUESubspace> |
findOneDimensionalDenseSubspaces(Relation<? extends NumberVector> database)
Determines the one dimensional dense subspaces and performs a pruning if
this option is chosen.
|
TypeInformation[] |
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
|
protected Logging |
getLogger()
Get the (STATIC) logger for this class.
|
private java.util.Collection<CLIQUEUnit> |
initOneDimensionalUnits(Relation<? extends NumberVector> database)
Initializes and returns the one dimensional units.
|
private static double |
log2OrZero(double x)
Robust log 2, that ignores zero values.
|
private java.util.List<CLIQUESubspace> |
pruneDenseSubspaces(java.util.List<CLIQUESubspace> denseSubspaces)
Performs a MDL-based pruning of the specified dense subspaces as described
in the CLIQUE algorithm.
|
Clustering<SubspaceModel> |
run(Relation<? extends NumberVector> relation)
Performs the CLIQUE algorithm on the given database.
|
private void |
updateMinMax(NumberVector featureVector,
double[] minima,
double[] maxima)
Updates the minima and maxima array according to the specified feature
vector.
|
run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
private static final Logging LOG
protected int xsi
protected double tau
protected boolean prune
public CLIQUE(int xsi, double tau, boolean prune)
xsi
- Xsi interval numbertau
- Tau density thresholdprune
- Prune flagpublic Clustering<SubspaceModel> run(Relation<? extends NumberVector> relation)
relation
- Data relation to processprivate java.util.List<Pair<Subspace,ModifiableDBIDs>> determineClusters(java.util.List<CLIQUESubspace> denseSubspaces)
denseSubspaces
- the dense subspaces in reverse order by their
coverageprivate java.util.List<CLIQUESubspace> findOneDimensionalDenseSubspaces(Relation<? extends NumberVector> database)
database
- the database to run the algorithm onprivate java.util.List<CLIQUESubspace> findDenseSubspaces(Relation<? extends NumberVector> database, java.util.List<CLIQUESubspace> denseSubspaces)
k
-dimensional dense subspaces and performs a pruning
if this option is chosen.database
- the database to run the algorithm ondenseSubspaces
- the (k-1)
-dimensional dense subspacesk
-dimensional dense subspaces sorted in
reverse order by their coverageprivate java.util.Collection<CLIQUEUnit> initOneDimensionalUnits(Relation<? extends NumberVector> database)
database
- the database to run the algorithm onprivate void updateMinMax(NumberVector featureVector, double[] minima, double[] maxima)
featureVector
- the feature vectorminima
- the array of minimamaxima
- the array of maximaprivate java.util.List<CLIQUESubspace> findOneDimensionalDenseSubspaceCandidates(Relation<? extends NumberVector> database)
database
- the database to run the algorithm onprivate java.util.List<CLIQUESubspace> findDenseSubspaceCandidates(Relation<? extends NumberVector> database, java.util.List<CLIQUESubspace> denseSubspaces)
k
-dimensional dense subspace candidates from the
specified (k-1)
-dimensional dense subspaces.database
- the database to run the algorithm ondenseSubspaces
- the (k-1)
-dimensional dense subspacesk
-dimensional dense subspace candidates
reverse ordered by their coverageprivate java.util.List<CLIQUESubspace> pruneDenseSubspaces(java.util.List<CLIQUESubspace> denseSubspaces)
denseSubspaces
- the subspaces to be pruned sorted in reverse order by
their coverageprivate int[][] computeMeans(java.util.List<CLIQUESubspace> denseSubspaces)
denseSubspaces
- the dense subspaces in reverse order by their
coverageprivate double[][] computeDiffs(java.util.List<CLIQUESubspace> denseSubspaces, int[] mi, int[] mp)
denseSubspaces
- denseSubspaces the dense subspaces in reverse order
by their coveragemi
- the mean of the selected sets Imp
- the mean of the pruned sets Pprivate static double log2OrZero(double x)
x
- Input valuepublic TypeInformation[] getInputTypeRestriction()
AbstractAlgorithm
getInputTypeRestriction
in interface Algorithm
getInputTypeRestriction
in class AbstractAlgorithm<Clustering<SubspaceModel>>
protected Logging getLogger()
AbstractAlgorithm
getLogger
in class AbstractAlgorithm<Clustering<SubspaceModel>>
Copyright © 2019 ELKI Development Team. License information.