|
|
|||||||||||||||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||||||||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Objectde.lmu.ifi.dbs.elki.logging.AbstractLoggable
de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<V,Clustering<SubspaceModel<V>>>
de.lmu.ifi.dbs.elki.algorithm.clustering.subspace.DiSH<V>
V
- the type of NumberVector handled by this Algorithm@Title(value="DiSH: Detecting Subspace cluster Hierarchies") @Description(value="Algorithm to find hierarchical correlation clusters in subspaces.") @Reference(authors="E. Achtert, C. B\u00f6hm, H.-P. Kriegel, P. Kr\u00f6ger, I. M\u00fcller-Gorman, A. Zimek", title="Detection and Visualization of Subspace Cluster Hierarchies", booktitle="Proc. 12th International Conference on Database Systems for Advanced Applications (DASFAA), Bangkok, Thailand, 2007", url="http://dx.doi.org/10.1007/978-3-540-71703-4_15") public class DiSH<V extends NumberVector<V,?>>
Algorithm for detecting subspace hierarchies.
Reference:
E. Achtert, C. Böhm, H.-P. Kriegel, P. Kröger, I. Müller-Gorman, A. Zimek:
Detection and Visualization of Subspace Cluster Hierarchies.
In Proc. 12th International Conference on Database Systems for Advanced
Applications (DASFAA), Bangkok, Thailand, 2007.
Field Summary | |
---|---|
private double |
epsilon
Holds the value of EPSILON_PARAM . |
static OptionID |
EPSILON_ID
OptionID for EPSILON_PARAM |
private DoubleParameter |
EPSILON_PARAM
Parameter that specifies the maximum radius of the neighborhood to be considered in each dimension for determination of the preference vector, must be a double equal to or greater than 0. |
static OptionID |
MU_ID
OptionID for MU_PARAM |
private IntParameter |
MU_PARAM
Parameter that specifies the a minimum number of points as a smoothing factor to avoid the single-link-effect, must be an integer greater than 0. |
private OPTICS<V,PreferenceVectorBasedCorrelationDistance> |
optics
The optics algorithm to determine the cluster order. |
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable |
---|
debug, logger |
Constructor Summary | |
---|---|
DiSH(Parameterization config)
Constructor, adhering to Parameterizable |
Method Summary | |
---|---|
private void |
buildHierarchy(Database<V> database,
DiSHDistanceFunction<V,DiSHPreprocessor<V>> distanceFunction,
List<Cluster<SubspaceModel<V>>> clusters,
int dimensionality)
Builds the cluster hierarchy |
private void |
checkClusters(Database<V> database,
DiSHDistanceFunction<V,DiSHPreprocessor<V>> distanceFunction,
Map<BitSet,List<Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>>>> clustersMap)
Removes the clusters with size < minpts from the cluster map and adds them to their parents. |
private Clustering<SubspaceModel<V>> |
computeClusters(Database<V> database,
ClusterOrderResult<PreferenceVectorBasedCorrelationDistance> clusterOrder)
Computes the hierarchical clusters according to the cluster order. |
private Map<BitSet,List<Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>>>> |
extractClusters(Database<V> database,
DiSHDistanceFunction<V,DiSHPreprocessor<V>> distanceFunction,
ClusterOrderResult<PreferenceVectorBasedCorrelationDistance> clusterOrder)
Extracts the clusters from the cluster order. |
private Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>> |
findParent(Database<V> database,
DiSHDistanceFunction<V,DiSHPreprocessor<V>> distanceFunction,
Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>> child,
Map<BitSet,List<Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>>>> clustersMap)
Returns the parent of the specified cluster |
private boolean |
isParent(Database<V> database,
DiSHDistanceFunction<V,DiSHPreprocessor<V>> distanceFunction,
Cluster<SubspaceModel<V>> parent,
List<Cluster<SubspaceModel<V>>> children)
Returns true, if the specified parent cluster is a parent of one child of the children clusters. |
protected Clustering<SubspaceModel<V>> |
runInTime(Database<V> database)
Performs the DiSH algorithm on the given database. |
private List<Cluster<SubspaceModel<V>>> |
sortClusters(Map<BitSet,List<Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>>>> clustersMap,
int dimensionality)
Returns a sorted list of the clusters w.r.t. the subspace dimensionality in descending order. |
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm |
---|
isTime, isVerbose, run, setTime, setVerbose |
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable |
---|
debugFine, debugFiner, debugFinest, exception, progress, verbose, warning |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.clustering.ClusteringAlgorithm |
---|
run |
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.Algorithm |
---|
setTime, setVerbose |
Field Detail |
---|
public static final OptionID EPSILON_ID
EPSILON_PARAM
private final DoubleParameter EPSILON_PARAM
Default value: 0.001
Key: -dish.epsilon
private double epsilon
EPSILON_PARAM
.
public static final OptionID MU_ID
MU_PARAM
private final IntParameter MU_PARAM
Default value: 1
Key: -dish.mu
private OPTICS<V extends NumberVector<V,?>,PreferenceVectorBasedCorrelationDistance> optics
Constructor Detail |
---|
public DiSH(Parameterization config)
Parameterizable
config
- ParameterizationMethod Detail |
---|
protected Clustering<SubspaceModel<V>> runInTime(Database<V> database) throws IllegalStateException
runInTime
in class AbstractAlgorithm<V extends NumberVector<V,?>,Clustering<SubspaceModel<V extends NumberVector<V,?>>>>
database
- the database to run the algorithm on
IllegalStateException
- if the algorithm has not been initialized
properly (e.g. the setParameters(String[]) method has been failed
to be called).private Clustering<SubspaceModel<V>> computeClusters(Database<V> database, ClusterOrderResult<PreferenceVectorBasedCorrelationDistance> clusterOrder)
database
- the database holding the objectsclusterOrder
- the cluster orderprivate Map<BitSet,List<Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>>>> extractClusters(Database<V> database, DiSHDistanceFunction<V,DiSHPreprocessor<V>> distanceFunction, ClusterOrderResult<PreferenceVectorBasedCorrelationDistance> clusterOrder)
database
- the database storing the objectsdistanceFunction
- the distance functionclusterOrder
- the cluster order to extract the clusters from
private List<Cluster<SubspaceModel<V>>> sortClusters(Map<BitSet,List<Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>>>> clustersMap, int dimensionality)
clustersMap
- the mapping of bits sets to clustersdimensionality
- the dimensionality of the data objects
private void checkClusters(Database<V> database, DiSHDistanceFunction<V,DiSHPreprocessor<V>> distanceFunction, Map<BitSet,List<Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>>>> clustersMap)
database
- the database storing the objectsdistanceFunction
- the distance functionclustersMap
- the map containing the clustersprivate Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>> findParent(Database<V> database, DiSHDistanceFunction<V,DiSHPreprocessor<V>> distanceFunction, Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>> child, Map<BitSet,List<Pair<BitSet,DatabaseObjectGroupCollection<List<Integer>>>>> clustersMap)
database
- the database storing the objectsdistanceFunction
- the distance functionchild
- the child to search the parent forclustersMap
- the map containing the clusters
private void buildHierarchy(Database<V> database, DiSHDistanceFunction<V,DiSHPreprocessor<V>> distanceFunction, List<Cluster<SubspaceModel<V>>> clusters, int dimensionality)
distanceFunction
- the distance functionclusters
- the sorted list of clustersdimensionality
- the dimensionality of the datadatabase
- the database containing the data objectsprivate boolean isParent(Database<V> database, DiSHDistanceFunction<V,DiSHPreprocessor<V>> distanceFunction, Cluster<SubspaceModel<V>> parent, List<Cluster<SubspaceModel<V>>> children)
database
- the database containing the objectsdistanceFunction
- the distance function for distance computation
between the clustersparent
- the parent to be testedchildren
- the list of children to be tested
|
|
|||||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |