V - the type of NumberVector handled by this Algorithm

@Title(value="ORCLUS: Arbitrarily ORiented projected CLUSter generation")
@Description(value="Algorithm to find correlation clusters in high dimensional spaces.")
@Reference(authors="C. C. Aggarwal, P. S. Yu", title="Finding Generalized Projected Clusters in High Dimensional Spaces", booktitle="Proc. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD \'00)", url="http://dx.doi.org/10.1145/342009.335383")
public class ORCLUS<V extends NumberVector>
extends AbstractProjectedClustering<Clustering<Model>,V>

Reference: C. C. Aggarwal, P. S. Yu: Finding Generalized Projected Clusters
in High Dimensional Spaces.
In: Proc. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD '00).
Modifier and Type | Class and Description |
---|---|
private class | ORCLUS.ORCLUSCluster - Encapsulates the attributes of a cluster. |
static class | ORCLUS.Parameterizer<V extends NumberVector> - Parameterization class. |
private class | ORCLUS.ProjectedEnergy - Encapsulates the projected energy for a cluster. |
Modifier and Type | Field and Description |
---|---|
private double | alpha - Holds the value of ORCLUS.Parameterizer.ALPHA_ID. |
private static Logging | LOG - The logger for this class. |
private PCARunner | pca - The PCA utility object. |
private RandomFactory | rnd - Random generator. |
Fields inherited from class AbstractProjectedClustering: k, k_i, l
Constructor and Description |
---|
ORCLUS(int k, int k_i, int l, double alpha, RandomFactory rnd, PCARunner pca) - Java constructor. |
Modifier and Type | Method and Description |
---|---|
private void | assign(Relation<V> database, DistanceQuery<V> distFunc, List<ORCLUS.ORCLUSCluster> clusters) - Creates a partitioning of the database by assigning each object to its closest seed. |
private Matrix | findBasis(Relation<V> database, DistanceQuery<V> distFunc, ORCLUS.ORCLUSCluster cluster, int dim) - Finds the basis of the subspace of dimensionality dim for the specified cluster. |
TypeInformation[] | getInputTypeRestriction() - Get the input type restriction used for negotiating the data query. |
protected Logging | getLogger() - Get the (STATIC) logger for this class. |
private List<ORCLUS.ORCLUSCluster> | initialSeeds(Relation<V> database, int k) - Initializes the list of seeds with a random sample of size k. |
private void | merge(Relation<V> database, DistanceQuery<V> distFunc, List<ORCLUS.ORCLUSCluster> clusters, int k_new, int d_new, IndefiniteProgress cprogress) - Reduces the number of seeds to k_new. |
private ORCLUS.ProjectedEnergy | projectedEnergy(Relation<V> database, DistanceQuery<V> distFunc, ORCLUS.ORCLUSCluster c_i, ORCLUS.ORCLUSCluster c_j, int i, int j, int dim) - Computes the projected energy of the specified clusters. |
private V | projection(ORCLUS.ORCLUSCluster c, V o, NumberVector.Factory<V> factory) - Returns the projection of real vector o in the subspace of cluster c. |
Clustering<Model> | run(Database database, Relation<V> relation) - Performs the ORCLUS algorithm on the given database. |
private ORCLUS.ORCLUSCluster | union(Relation<V> relation, DistanceQuery<V> distFunc, ORCLUS.ORCLUSCluster c1, ORCLUS.ORCLUSCluster c2, int dim) - Returns the union of the two specified clusters. |
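
The assign step in the method summary (each object goes to its closest seed) can be illustrated with a small standalone sketch. This is not ELKI's code: the class name `ClosestSeedSketch`, the plain `double[]` vectors, and the use of full-space squared Euclidean distance are all assumptions made for illustration. The ORCLUS paper compares an object to a seed within that seed's own subspace, which this sketch leaves out.

```java
import java.util.Arrays;

/** Illustrative sketch only; hypothetical class, not part of ELKI. */
class ClosestSeedSketch {
  /** Squared Euclidean distance between two vectors of equal length. */
  static double squaredDistance(double[] a, double[] b) {
    double sum = 0.0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      sum += diff * diff;
    }
    return sum;
  }

  /** Returns, for each point, the index of its closest seed (ties go to the lowest index). */
  static int[] assign(double[][] points, double[][] seeds) {
    int[] owner = new int[points.length];
    for (int p = 0; p < points.length; p++) {
      int best = 0;
      double bestDist = Double.POSITIVE_INFINITY;
      for (int s = 0; s < seeds.length; s++) {
        double dist = squaredDistance(points[p], seeds[s]);
        if (dist < bestDist) {
          bestDist = dist;
          best = s;
        }
      }
      owner[p] = best;
    }
    return owner;
  }

  public static void main(String[] args) {
    double[][] points = { {0.0, 0.1}, {5.0, 5.2}, {0.3, -0.2}, {4.8, 5.1} };
    double[][] seeds = { {0.0, 0.0}, {5.0, 5.0} };
    // Prints the seed index chosen for each point.
    System.out.println(Arrays.toString(assign(points, seeds)));
  }
}
```

The result is a partitioning: every point receives exactly one seed index, so grouping points by that index yields the clusters for the next step.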
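Methods inherited from class AbstractProjectedClustering: getDistanceFunction, getDistanceQuery
Methods inherited from class AbstractAlgorithm: makeParameterDistanceFunction, run
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface ClusteringAlgorithm: run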
private static final Logging LOG
private double alpha
Holds the value of ORCLUS.Parameterizer.ALPHA_ID.
private RandomFactory rnd
private PCARunner pca
public ORCLUS(int k, int k_i, int l, double alpha, RandomFactory rnd, PCARunner pca)
Parameters:
k - k Parameter
k_i - k_i Parameter
l - l Parameter
alpha - Alpha Parameter
rnd - Random generator
pca - PCA runner

public Clustering<Model> run(Database database, Relation<V> relation)
Parameters:
database - Database
relation - Relation

private List<ORCLUS.ORCLUSCluster> initialSeeds(Relation<V> database, int k)
Parameters:
database - the database holding the objects
k - the size of the random sample

private void assign(Relation<V> database, DistanceQuery<V> distFunc, List<ORCLUS.ORCLUSCluster> clusters)
Parameters:
database - the database holding the objects
distFunc - the distance function
clusters - the array of clusters to which the objects should be assigned

private Matrix findBasis(Relation<V> database, DistanceQuery<V> distFunc, ORCLUS.ORCLUSCluster cluster, int dim)
Finds the basis of the subspace of dimensionality dim for the specified cluster.
Parameters:
database - the database to run the algorithm on
distFunc - the distance function
cluster - the cluster
dim - the dimensionality of the subspace

private void merge(Relation<V> database, DistanceQuery<V> distFunc, List<ORCLUS.ORCLUSCluster> clusters, int k_new, int d_new, IndefiniteProgress cprogress)
Parameters:
database - the database holding the objects
distFunc - the distance function
clusters - the set of current seeds
k_new - the new number of seeds
d_new - the new dimensionality of the subspaces for each seed

private ORCLUS.ProjectedEnergy projectedEnergy(Relation<V> database, DistanceQuery<V> distFunc, ORCLUS.ORCLUSCluster c_i, ORCLUS.ORCLUSCluster c_j, int i, int j, int dim)
Parameters:
database - the database holding the objects
distFunc - the distance function
c_i - the first cluster
c_j - the second cluster
i - the index of cluster c_i in the cluster list
j - the index of cluster c_j in the cluster list
dim - the dimensionality of the clusters

private ORCLUS.ORCLUSCluster union(Relation<V> relation, DistanceQuery<V> distFunc, ORCLUS.ORCLUSCluster c1, ORCLUS.ORCLUSCluster c2, int dim)
Parameters:
relation - the database holding the objects
distFunc - the distance function
c1 - the first cluster
c2 - the second cluster
dim - the dimensionality of the union cluster

private V projection(ORCLUS.ORCLUSCluster c, V o, NumberVector.Factory<V> factory)
Parameters:
c - the cluster
o - the double vector
factory - Factory object / prototype

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.
getInputTypeRestriction in interface Algorithm
getInputTypeRestriction in class AbstractAlgorithm<Clustering<Model>>

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.
getLogger in class AbstractAlgorithm<Clustering<Model>>
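
The `projection` and `projectedEnergy` methods documented above are private, so their behaviour is not visible from this page. The sketch below shows one common reading of the two ideas, following the ORCLUS paper's notion of projected energy as the mean squared distance of a cluster's points to its centroid, measured inside the cluster's subspace. The class `ProjectedEnergySketch`, the row-wise orthonormal `basis` array, and the plain `double[]` vectors are hypothetical and are not ELKI's types or implementation.

```java
/** Illustrative sketch only; hypothetical class, not part of ELKI. */
class ProjectedEnergySketch {
  /** Coordinates of vector o in the subspace spanned by the orthonormal rows of basis. */
  static double[] project(double[][] basis, double[] o) {
    double[] coords = new double[basis.length];
    for (int i = 0; i < basis.length; i++) {
      double dot = 0.0;
      for (int d = 0; d < o.length; d++) {
        dot += basis[i][d] * o[d];
      }
      coords[i] = dot;
    }
    return coords;
  }

  /** Mean squared distance of the projected points to the projected centroid. */
  static double projectedEnergy(double[][] basis, double[][] points) {
    int dim = points[0].length;
    double[] centroid = new double[dim];
    for (double[] p : points) {
      for (int d = 0; d < dim; d++) {
        centroid[d] += p[d] / points.length;
      }
    }
    double[] c = project(basis, centroid);
    double sum = 0.0;
    for (double[] p : points) {
      double[] q = project(basis, p);
      for (int i = 0; i < q.length; i++) {
        double diff = q[i] - c[i];
        sum += diff * diff;
      }
    }
    return sum / points.length;
  }

  public static void main(String[] args) {
    double r = Math.sqrt(0.5);
    double[][] onDiagonal = { {0, 0}, {1, 1}, {2, 2}, {3, 3} };
    // Direction orthogonal to the data: the points collapse, energy is close to 0.
    System.out.println(projectedEnergy(new double[][] { {r, -r} }, onDiagonal));
    // Direction along the data: the spread is preserved, energy is close to 2.5.
    System.out.println(projectedEnergy(new double[][] { {r, r} }, onDiagonal));
  }
}
```

A low projected energy means the points lie close together in the chosen subspace, which is why the paper uses it to judge whether two clusters are good candidates for merging.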
Copyright © 2015 ELKI Development Team, Lehr- und Forschungseinheit für Datenbanksysteme, Ludwig-Maximilians-Universität München. License information.