@Reference(authors="T. Zhang, R. Ramakrishnan, M. Livny",title="BIRCH: An Efficient Data Clustering Method for Very Large Databases",booktitle="Proc. 1996 ACM SIGMOD International Conference on Management of Data",url="https://doi.org/10.1145/233269.233324",bibkey="DBLP:conf/sigmod/ZhangRL96") @Reference(authors="T. Zhang, R. Ramakrishnan, M. Livny",title="BIRCH: A New Data Clustering Algorithm and Its Applications",booktitle="Data Min. Knowl. Discovery",url="https://doi.org/10.1023/A:1009783824328",bibkey="DBLP:journals/datamine/ZhangRL97") public class CFTree extends java.lang.Object
Important differences:
Condensing and merging refinement are possible, and improvements to this code are welcome - please send a pull request!
References:
T. Zhang, R. Ramakrishnan, M. Livny
BIRCH: An Efficient Data Clustering Method for Very Large Databases
Proc. 1996 ACM SIGMOD International Conference on Management of Data
T. Zhang, R. Ramakrishnan, M. Livny
BIRCH: A New Data Clustering Algorithm and Its Applications
Data Min. Knowl. Discovery
Modifier and Type | Class and Description |
---|---|
static class |
CFTree.Factory
CF-Tree Factory.
|
static class |
CFTree.LeafIterator
Iterator over leaf nodes.
|
static class |
CFTree.TreeNode
Inner node.
|
Modifier and Type | Field and Description |
---|---|
(package private) BIRCHAbsorptionCriterion |
absorption
Criterion for absorbing points.
|
(package private) int |
capacity
Capacity of a node.
|
(package private) BIRCHDistance |
distance
Distance function to use.
|
(package private) int |
leaves
Leaf node counter.
|
static Logging |
LOG
Class logger.
|
(package private) CFTree.TreeNode |
root
Current root node.
|
(package private) double |
thresholdsq
Squared maximum radius threshold of a clusterin feature.
|
Constructor and Description |
---|
CFTree(BIRCHDistance distance,
BIRCHAbsorptionCriterion absorption,
double threshold,
int capacity)
Constructor.
|
Modifier and Type | Method and Description |
---|---|
private boolean |
add(ClusteringFeature[] children,
ClusteringFeature child)
Add a node to the first unused slot.
|
private double |
estimateThreshold(CFTree.TreeNode current) |
private ClusteringFeature |
findLeaf(CFTree.TreeNode node,
NumberVector nv)
Find the leaf of a cluster, to get the final cluster assignment.
|
ClusteringFeature |
findLeaf(NumberVector nv)
Find the leaf of a cluster, to get the final cluster assignment.
|
private CFTree.TreeNode |
insert(CFTree.TreeNode node,
ClusteringFeature nleaf)
Recursive insertion.
|
private CFTree.TreeNode |
insert(CFTree.TreeNode node,
NumberVector nv)
Recursive insertion.
|
void |
insert(NumberVector nv)
Insert a data point into the tree.
|
CFTree.LeafIterator |
leafIterator()
Get an iterator over the leaf nodes.
|
protected java.lang.StringBuilder |
printDebug(java.lang.StringBuilder buf,
ClusteringFeature n,
int d)
Utility function for debugging.
|
protected void |
rebuildTree()
Rebuild the CFTree to condense it to approximately half the size.
|
private CFTree.TreeNode |
split(CFTree.TreeNode node,
ClusteringFeature newchild)
Split an overfull node.
|
public static final Logging LOG
BIRCHDistance distance
BIRCHAbsorptionCriterion absorption
double thresholdsq
int capacity
CFTree.TreeNode root
int leaves
public CFTree(BIRCHDistance distance, BIRCHAbsorptionCriterion absorption, double threshold, int capacity)
distance
- Distance function to useabsorption
- Absorption criterionthreshold
- Thresholdcapacity
- Capacitypublic void insert(NumberVector nv)
nv
- Object dataprotected void rebuildTree()
private double estimateThreshold(CFTree.TreeNode current)
private CFTree.TreeNode insert(CFTree.TreeNode node, NumberVector nv)
node
- Current nodenv
- Object datapublic ClusteringFeature findLeaf(NumberVector nv)
In contrast to insert(de.lmu.ifi.dbs.elki.data.NumberVector)
, this does not modify the tree.
nv
- Object dataprivate ClusteringFeature findLeaf(CFTree.TreeNode node, NumberVector nv)
In contrast to insert(de.lmu.ifi.dbs.elki.data.NumberVector)
, this does not modify the tree.
TODO: allow "outliers"?
node
- Current nodenv
- Object dataprivate CFTree.TreeNode split(CFTree.TreeNode node, ClusteringFeature newchild)
node
- Node to splitnewchild
- Additional childnode
private CFTree.TreeNode insert(CFTree.TreeNode node, ClusteringFeature nleaf)
node
- Current nodenleaf
- Leaf entry to add.private boolean add(ClusteringFeature[] children, ClusteringFeature child)
children
- Children listchild
- Child to addfalse
if node is fullpublic CFTree.LeafIterator leafIterator()
protected java.lang.StringBuilder printDebug(java.lang.StringBuilder buf, ClusteringFeature n, int d)
buf
- Output buffern
- Current noded
- DepthCopyright © 2019 ELKI Development Team. License information.