V
- the type of NumberVector handled by this Algorithm@Title(value="CLIQUE: Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications") @Description(value="Grid-based algorithm to identify dense clusters in subspaces of maximum dimensionality.") @Reference(authors="R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan", title="Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications", booktitle="Proc. SIGMOD Conference, Seattle, WA, 1998", url="http://dx.doi.org/10.1145/276304.276314") public class CLIQUE<V extends NumberVector<V,?>> extends AbstractAlgorithm<Clustering<SubspaceModel<V>>> implements ClusteringAlgorithm<Clustering<SubspaceModel<V>>>
Reference:
R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan: Automatic Subspace
Clustering of High Dimensional Data for Data Mining Applications.
In Proc. ACM SIGMOD Int. Conf. on Management of Data, Seattle, WA, 1998.
Modifier and Type | Class and Description |
---|---|
static class |
CLIQUE.Parameterizer<V extends NumberVector<V,?>>
Parameterization class.
|
Modifier and Type | Field and Description |
---|---|
private static Logging |
logger
The logger for this class.
|
private boolean |
prune
Holds the value of
PRUNE_ID . |
static OptionID |
PRUNE_ID
Flag to indicate that only subspaces with large coverage (i.e. the fraction
of the database that is covered by the dense units) are selected, the rest
will be pruned.
|
private double |
tau
Holds the value of
TAU_ID . |
static OptionID |
TAU_ID
Parameter to specify the density threshold for the selectivity of a unit,
where the selectivity is the fraction of total feature vectors contained in
this unit, must be a double greater than 0 and less than 1.
|
private int |
xsi
Holds the value of
XSI_ID . |
static OptionID |
XSI_ID
Parameter to specify the number of intervals (units) in each dimension,
must be an integer greater than 0.
|
Constructor and Description |
---|
CLIQUE(int xsi,
double tau,
boolean prune)
Constructor.
|
Modifier and Type | Method and Description |
---|---|
private double[][] |
computeDiffs(List<CLIQUESubspace<V>> denseSubspaces,
int[] mi,
int[] mp)
The specified sorted list of dense subspaces is divided into the selected
set I and the pruned set P.
|
private int[][] |
computeMeans(List<CLIQUESubspace<V>> denseSubspaces)
The specified sorted list of dense subspaces is divided into the selected
set I and the pruned set P.
|
private List<Pair<Subspace<V>,ModifiableDBIDs>> |
determineClusters(List<CLIQUESubspace<V>> denseSubspaces)
Determines the clusters in the specified dense subspaces.
|
private List<CLIQUESubspace<V>> |
findDenseSubspaceCandidates(Relation<V> database,
List<CLIQUESubspace<V>> denseSubspaces)
Determines the
k -dimensional dense subspace candidates from the
specified (k-1) -dimensional dense subspaces. |
private List<CLIQUESubspace<V>> |
findDenseSubspaces(Relation<V> database,
List<CLIQUESubspace<V>> denseSubspaces)
Determines the
k -dimensional dense subspaces and performs a pruning
if this option is chosen. |
private List<CLIQUESubspace<V>> |
findOneDimensionalDenseSubspaceCandidates(Relation<V> database)
Determines the one-dimensional dense subspace candidates by making a pass
over the database.
|
private List<CLIQUESubspace<V>> |
findOneDimensionalDenseSubspaces(Relation<V> database)
Determines the one dimensional dense subspaces and performs a pruning if
this option is chosen.
|
TypeInformation[] |
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
|
protected Logging |
getLogger()
Get the (STATIC) logger for this class.
|
private Collection<CLIQUEUnit<V>> |
initOneDimensionalUnits(Relation<V> database)
Initializes and returns the one dimensional units.
|
private List<CLIQUESubspace<V>> |
pruneDenseSubspaces(List<CLIQUESubspace<V>> denseSubspaces)
Performs a MDL-based pruning of the specified dense subspaces as described
in the CLIQUE algorithm.
|
Clustering<SubspaceModel<V>> |
run(Relation<V> relation)
Performs the CLIQUE algorithm on the given database.
|
private void |
updateMinMax(V featureVector,
double[] minima,
double[] maxima)
Updates the minima and maxima array according to the specified feature
vector.
|
makeParameterDistanceFunction, run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
run
private static final Logging logger
public static final OptionID XSI_ID
Key: -clique.xsi
public static final OptionID TAU_ID
Key: -clique.tau
public static final OptionID PRUNE_ID
Key: -clique.prune
private int xsi
XSI_ID
.private double tau
TAU_ID
.private boolean prune
PRUNE_ID
.public CLIQUE(int xsi, double tau, boolean prune)
xsi
- Xsi valuetau
- Tau valueprune
- Prune flagpublic Clustering<SubspaceModel<V>> run(Relation<V> relation)
relation
- Data relation to processprivate List<Pair<Subspace<V>,ModifiableDBIDs>> determineClusters(List<CLIQUESubspace<V>> denseSubspaces)
denseSubspaces
- the dense subspaces in reverse order by their
coverageprivate List<CLIQUESubspace<V>> findOneDimensionalDenseSubspaces(Relation<V> database)
database
- the database to run the algorithm onprivate List<CLIQUESubspace<V>> findDenseSubspaces(Relation<V> database, List<CLIQUESubspace<V>> denseSubspaces)
k
-dimensional dense subspaces and performs a pruning
if this option is chosen.database
- the database to run the algorithm ondenseSubspaces
- the (k-1)
-dimensional dense subspacesk
-dimensional dense subspaces sorted in
reverse order by their coverageprivate Collection<CLIQUEUnit<V>> initOneDimensionalUnits(Relation<V> database)
database
- the database to run the algorithm onprivate void updateMinMax(V featureVector, double[] minima, double[] maxima)
featureVector
- the feature vectorminima
- the array of minimamaxima
- the array of maximaprivate List<CLIQUESubspace<V>> findOneDimensionalDenseSubspaceCandidates(Relation<V> database)
database
- the database to run the algorithm onprivate List<CLIQUESubspace<V>> findDenseSubspaceCandidates(Relation<V> database, List<CLIQUESubspace<V>> denseSubspaces)
k
-dimensional dense subspace candidates from the
specified (k-1)
-dimensional dense subspaces.database
- the database to run the algorithm ondenseSubspaces
- the (k-1)
-dimensional dense subspacesk
-dimensional dense subspace candidates
reverse ordered by their coverageprivate List<CLIQUESubspace<V>> pruneDenseSubspaces(List<CLIQUESubspace<V>> denseSubspaces)
denseSubspaces
- the subspaces to be pruned sorted in reverse order by
their coverageprivate int[][] computeMeans(List<CLIQUESubspace<V>> denseSubspaces)
denseSubspaces
- the dense subspaces in reverse order by their
coverageprivate double[][] computeDiffs(List<CLIQUESubspace<V>> denseSubspaces, int[] mi, int[] mp)
denseSubspaces
- denseSubspaces the dense subspaces in reverse order
by their coveragemi
- the mean of the selected sets Imp
- the mean of the pruned sets Ppublic TypeInformation[] getInputTypeRestriction()
AbstractAlgorithm
getInputTypeRestriction
in interface Algorithm
getInputTypeRestriction
in class AbstractAlgorithm<Clustering<SubspaceModel<V extends NumberVector<V,?>>>>
protected Logging getLogger()
AbstractAlgorithm
getLogger
in class AbstractAlgorithm<Clustering<SubspaceModel<V extends NumberVector<V,?>>>>