@Reference(authors="T. Zhang, R. Ramakrishnan, M. Livny",title="BIRCH: An Efficient Data Clustering Method for Very Large Databases",booktitle="Proc. 1996 ACM SIGMOD International Conference on Management of Data",url="https://doi.org/10.1145/233269.233324",bibkey="DBLP:conf/sigmod/ZhangRL96") @Reference(authors="T. Zhang, R. Ramakrishnan, M. Livny",title="BIRCH: A New Data Clustering Algorithm and Its Applications",booktitle="Data Min. Knowl. Discovery",url="https://doi.org/10.1023/A:1009783824328",bibkey="DBLP:journals/datamine/ZhangRL97") public class CFTree extends java.lang.Object
Important differences:
Condensing and merging refinement are possible, and improvements to this code are welcome - please send a pull request!
References:
T. Zhang, R. Ramakrishnan, M. Livny
BIRCH: An Efficient Data Clustering Method for Very Large Databases
Proc. 1996 ACM SIGMOD International Conference on Management of Data
T. Zhang, R. Ramakrishnan, M. Livny
BIRCH: A New Data Clustering Algorithm and Its Applications
Data Min. Knowl. Discovery
| Modifier and Type | Class and Description |
|---|---|
static class |
CFTree.Factory
CF-Tree Factory.
|
static class |
CFTree.LeafIterator
Iterator over leaf nodes.
|
static class |
CFTree.TreeNode
Inner node.
|
| Modifier and Type | Field and Description |
|---|---|
(package private) BIRCHAbsorptionCriterion |
absorption
Criterion for absorbing points.
|
(package private) int |
capacity
Capacity of a node.
|
(package private) BIRCHDistance |
distance
Distance function to use.
|
(package private) int |
leaves
Leaf node counter.
|
static Logging |
LOG
Class logger.
|
(package private) CFTree.TreeNode |
root
Current root node.
|
(package private) double |
thresholdsq
Squared maximum radius threshold of a clusterin feature.
|
| Constructor and Description |
|---|
CFTree(BIRCHDistance distance,
BIRCHAbsorptionCriterion absorption,
double threshold,
int capacity)
Constructor.
|
| Modifier and Type | Method and Description |
|---|---|
private boolean |
add(ClusteringFeature[] children,
ClusteringFeature child)
Add a node to the first unused slot.
|
private double |
estimateThreshold(CFTree.TreeNode current) |
private ClusteringFeature |
findLeaf(CFTree.TreeNode node,
NumberVector nv)
Find the leaf of a cluster, to get the final cluster assignment.
|
ClusteringFeature |
findLeaf(NumberVector nv)
Find the leaf of a cluster, to get the final cluster assignment.
|
private CFTree.TreeNode |
insert(CFTree.TreeNode node,
ClusteringFeature nleaf)
Recursive insertion.
|
private CFTree.TreeNode |
insert(CFTree.TreeNode node,
NumberVector nv)
Recursive insertion.
|
void |
insert(NumberVector nv)
Insert a data point into the tree.
|
CFTree.LeafIterator |
leafIterator()
Get an iterator over the leaf nodes.
|
protected java.lang.StringBuilder |
printDebug(java.lang.StringBuilder buf,
ClusteringFeature n,
int d)
Utility function for debugging.
|
protected void |
rebuildTree()
Rebuild the CFTree to condense it to approximately half the size.
|
private CFTree.TreeNode |
split(CFTree.TreeNode node,
ClusteringFeature newchild)
Split an overfull node.
|
public static final Logging LOG
BIRCHDistance distance
BIRCHAbsorptionCriterion absorption
double thresholdsq
int capacity
CFTree.TreeNode root
int leaves
public CFTree(BIRCHDistance distance, BIRCHAbsorptionCriterion absorption, double threshold, int capacity)
distance - Distance function to useabsorption - Absorption criterionthreshold - Thresholdcapacity - Capacitypublic void insert(NumberVector nv)
nv - Object dataprotected void rebuildTree()
private double estimateThreshold(CFTree.TreeNode current)
private CFTree.TreeNode insert(CFTree.TreeNode node, NumberVector nv)
node - Current nodenv - Object datapublic ClusteringFeature findLeaf(NumberVector nv)
In contrast to insert(de.lmu.ifi.dbs.elki.data.NumberVector), this does not modify the tree.
nv - Object dataprivate ClusteringFeature findLeaf(CFTree.TreeNode node, NumberVector nv)
In contrast to insert(de.lmu.ifi.dbs.elki.data.NumberVector), this does not modify the tree.
TODO: allow "outliers"?
node - Current nodenv - Object dataprivate CFTree.TreeNode split(CFTree.TreeNode node, ClusteringFeature newchild)
node - Node to splitnewchild - Additional childnodeprivate CFTree.TreeNode insert(CFTree.TreeNode node, ClusteringFeature nleaf)
node - Current nodenleaf - Leaf entry to add.private boolean add(ClusteringFeature[] children, ClusteringFeature child)
children - Children listchild - Child to addfalse if node is fullpublic CFTree.LeafIterator leafIterator()
protected java.lang.StringBuilder printDebug(java.lang.StringBuilder buf,
ClusteringFeature n,
int d)
buf - Output buffern - Current noded - DepthCopyright © 2019 ELKI Development Team. License information.