
Type Parameters:
O - Object type
D - Distance type

@Reference(authors="A. McCallum, K. Nigam, L.H. Ungar",
           title="Efficient Clustering of High Dimensional Data Sets with Application to Reference Matching",
           booktitle="Proc. 6th ACM SIGKDD international conference on Knowledge discovery and data mining",
           url="http://dx.doi.org/10.1145%2F347090.347123")
public class CanopyPreClustering<O,D extends Distance<D>>
extends AbstractDistanceBasedAlgorithm<O,D,Clustering<ClusterModel>>
implements ClusteringAlgorithm<Clustering<ClusterModel>>

Reference:
A. McCallum, K. Nigam, L.H. Ungar
Efficient Clustering of High Dimensional Data Sets with Application to Reference Matching
Proc. 6th ACM SIGKDD international conference on Knowledge discovery and data mining
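
The two thresholds documented for this class (t1 for inclusion, t2 for removal) can be illustrated with a minimal stand-alone sketch of the general canopy scheme from the reference above. This is not ELKI's implementation: the names CanopySketch, canopies and distance are hypothetical, and the sketch assumes plain double[] points with Euclidean distance instead of a Relation and a DistanceFunction.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

/** Hypothetical stand-alone sketch of canopy pre-clustering; not part of ELKI. */
class CanopySketch {
  /** Euclidean distance between two points of equal dimensionality (assumed metric). */
  static double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /**
   * Builds canopies: t1 is the inclusion threshold, t2 the removal threshold.
   * Canopies may overlap, because a point with t2 < dist <= t1 joins the
   * canopy but remains a candidate for later canopies.
   */
  static List<List<double[]>> canopies(List<double[]> points, double t1, double t2) {
    if (t1 < t2) {
      throw new IllegalArgumentException("t1 must be at least as large as t2");
    }
    LinkedList<double[]> candidates = new LinkedList<>(points);
    List<List<double[]>> result = new ArrayList<>();
    while (!candidates.isEmpty()) {
      // Take the next remaining candidate as the canopy center.
      double[] center = candidates.removeFirst();
      List<double[]> canopy = new ArrayList<>();
      canopy.add(center);
      for (Iterator<double[]> it = candidates.iterator(); it.hasNext();) {
        double[] p = it.next();
        double dist = distance(center, p);
        if (dist > t1) {
          continue; // too far away: not part of this canopy
        }
        canopy.add(p); // within the inclusion threshold
        if (dist <= t2) {
          it.remove(); // within the removal threshold: no longer a candidate
        }
      }
      result.add(canopy);
    }
    return result;
  }
}
```

For the one-dimensional points 0, 1, 3 and 10 with t1 = 4 and t2 = 2, this sketch produces three canopies. The first contains 0, 1 and 3. Point 3 lies farther than t2 from the first center, so it stays a candidate and later starts its own canopy.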

| Modifier and Type | Class and Description |
|---|---|
| static class | CanopyPreClustering.Parameterizer<O,D extends Distance<D>>: Parameterization class |

| Modifier and Type | Field and Description |
|---|---|
| private static Logging | LOG: Class logger. |
| private D | t1: Threshold for inclusion |
| private D | t2: Threshold for removal |

Inherited fields: DISTANCE_FUNCTION_ID

| Constructor and Description |
|---|
| CanopyPreClustering(DistanceFunction<? super O,D> distanceFunction, D t1, D t2): Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| TypeInformation[] | getInputTypeRestriction(): Get the input type restriction used for negotiating the data query. |
| protected Logging | getLogger(): Get the (STATIC) logger for this class. |
| Clustering<ClusterModel> | run(Database database, Relation<O> relation): Run the algorithm |

Inherited methods:
getDistanceFunction
makeParameterDistanceFunction, run
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait (from java.lang.Object)
run

private static final Logging LOG

public CanopyPreClustering(DistanceFunction<? super O,D> distanceFunction, D t1, D t2)
Parameters:
distanceFunction - Distance function
t1 - Inclusion threshold
t2 - Exclusion threshold

public Clustering<ClusterModel> run(Database database, Relation<O> relation)
Parameters:
database - Database
relation - Relation to process

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Specified by: getInputTypeRestriction in interface Algorithm
Specified by: getInputTypeRestriction in class AbstractAlgorithm<Clustering<ClusterModel>>

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Specified by: getLogger in class AbstractAlgorithm<Clustering<ClusterModel>>