Environment for
DeveLoping
KDD-Applications
Supported by Index-Structures

de.lmu.ifi.dbs.elki.algorithm.clustering
Class SNNClustering<O extends DatabaseObject,D extends Distance<D>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.logging.AbstractLoggable
      extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<O,Clustering<Model>>
          extended by de.lmu.ifi.dbs.elki.algorithm.clustering.SNNClustering<O,D>
Type Parameters:
O - the type of DatabaseObject the algorithm is applied on
D - the type of Distance used for the preprocessing of the shared nearest neighbors neighborhood lists
All Implemented Interfaces:
Algorithm<O,Clustering<Model>>, ClusteringAlgorithm<Clustering<Model>,O>, Parameterizable

@Title(value="SNN: Shared Nearest Neighbor Clustering")
@Description(value="Algorithm to find shared-nearest-neighbors-density-connected sets in a database based on the parameters \'minPts\' and \'epsilon\' (specifying a volume). These two parameters determine a density threshold for clustering.")
@Reference(authors="L. Ert\u00f6z, M. Steinbach, V. Kumar",
           title="Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data",
           booktitle="Proc. of SIAM Data Mining (SDM), 2003",
           url="http://www.siam.org/meetings/sdm03/proceedings/sdm03_05.pdf")
public class SNNClustering<O extends DatabaseObject,D extends Distance<D>>
extends AbstractAlgorithm<O,Clustering<Model>>
implements ClusteringAlgorithm<Clustering<Model>,O>

Shared nearest neighbor clustering.

Reference: L. Ertöz, M. Steinbach, V. Kumar: Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data.
In: Proc. of SIAM Data Mining (SDM), 2003.

Author:
Arthur Zimek

Field Summary
private  IntegerDistance epsilon
          Holds the value of EPSILON_PARAM.
static OptionID EPSILON_ID
          OptionID for EPSILON_PARAM
private  IntParameter EPSILON_PARAM
          Parameter to specify the minimum SNN density, must be an integer greater than 0.
private  int minpts
          Holds the value of MINPTS_PARAM.
static OptionID MINPTS_ID
          OptionID for MINPTS_PARAM
private  IntParameter MINPTS_PARAM
          Parameter to specify the threshold for minimum number of points in the epsilon-SNN-neighborhood of a point, must be an integer greater than 0.
protected  Set<Integer> noise
          Holds a set of noise.
protected  Set<Integer> processedIDs
          Holds a set of processed ids.
protected  List<List<Integer>> resultList
          Holds a list of clusters found.
private  SharedNearestNeighborSimilarityFunction<O,D> similarityFunction
          The similarity function for the shared nearest neighbor similarity.
 
Fields inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debug, logger
 
Constructor Summary
SNNClustering(Parameterization config)
          Constructor, adhering to Parameterizable
 
Method Summary
protected  void expandCluster(Database<O> database, Integer startObjectID, FiniteProgress objprog, IndefiniteProgress clusprog)
          DBSCAN-function expandCluster adapted to SNN criterion.
protected  List<Integer> findSNNNeighbors(Database<O> database, Integer queryObject)
          Returns the shared nearest neighbors of the specified query object in the given database.
 IntegerDistance getEpsilon()
          Returns the value of EPSILON_PARAM.
protected  Clustering<Model> runInTime(Database<O> database)
          Performs the SNN clustering algorithm on the given database.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
isTime, isVerbose, run, setTime, setVerbose
 
Methods inherited from class de.lmu.ifi.dbs.elki.logging.AbstractLoggable
debugFine, debugFiner, debugFinest, exception, progress, verbose, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.clustering.ClusteringAlgorithm
run
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.Algorithm
setTime, setVerbose
 

Field Detail

EPSILON_ID

public static final OptionID EPSILON_ID
OptionID for EPSILON_PARAM


EPSILON_PARAM

private final IntParameter EPSILON_PARAM
Parameter to specify the minimum SNN density, must be an integer greater than 0.

Key: -snn.epsilon


epsilon

private IntegerDistance epsilon
Holds the value of EPSILON_PARAM.


MINPTS_ID

public static final OptionID MINPTS_ID
OptionID for MINPTS_PARAM


MINPTS_PARAM

private final IntParameter MINPTS_PARAM
Parameter to specify the threshold for minimum number of points in the epsilon-SNN-neighborhood of a point, must be an integer greater than 0.

Key: -snn.minpts


minpts

private int minpts
Holds the value of MINPTS_PARAM.


resultList

protected List<List<Integer>> resultList
Holds a list of clusters found.


noise

protected Set<Integer> noise
Holds a set of noise.


processedIDs

protected Set<Integer> processedIDs
Holds a set of processed ids.


similarityFunction

private SharedNearestNeighborSimilarityFunction<O extends DatabaseObject,D extends Distance<D>> similarityFunction
The similarity function for the shared nearest neighbor similarity.

Constructor Detail

SNNClustering

public SNNClustering(Parameterization config)
Constructor, adhering to Parameterizable

Parameters:
config - Parameterization
Method Detail

runInTime

protected Clustering<Model> runInTime(Database<O> database)
Performs the SNN clustering algorithm on the given database.

Specified by:
runInTime in class AbstractAlgorithm<O extends DatabaseObject,Clustering<Model>>
Parameters:
database - the database to run the algorithm on
Returns:
the Result computed by this algorithm

findSNNNeighbors

protected List<Integer> findSNNNeighbors(Database<O> database,
                                         Integer queryObject)
Returns the shared nearest neighbors of the specified query object in the given database.

Parameters:
database - the database holding the objects
queryObject - the query object
Returns:
the shared nearest neighbors of the specified query object in the given database

expandCluster

protected void expandCluster(Database<O> database,
                             Integer startObjectID,
                             FiniteProgress objprog,
                             IndefiniteProgress clusprog)
DBSCAN-function expandCluster adapted to SNN criterion.

Border-Objects become members of the first possible cluster.

Parameters:
database - the database on which the algorithm is run
startObjectID - potential seed of a new potential cluster
objprog - the progress object to report about the progress of clustering

getEpsilon

public IntegerDistance getEpsilon()
Returns the value of EPSILON_PARAM.

Returns:
the value of EPSILON_PARAM

Release 0.3 (2010-03-31_1612)