Type Parameters:
O - Object type
D - Distance type

@Title(value="Distance Histogram")
@Description(value="Computes a histogram over the distances occurring in the data set.")
public class DistanceStatisticsWithClasses<O,D extends NumberDistance<D,?>> extends AbstractDistanceBasedAlgorithm<O,D,CollectionResult<DoubleVector>>
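To make the class description concrete, here is a minimal, self-contained sketch of a distance histogram: all pairwise distances are counted into a fixed number of equal-width bins spanning the observed minimum and maximum. It does not use the ELKI API; the class name `DistanceHistogramSketch`, the Euclidean distance, and the `double[][]` input are assumptions made for illustration, and the actual implementation may bin and group distances differently.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; not part of ELKI.
class DistanceHistogramSketch {
  /** Euclidean distance between two vectors of equal length. */
  static double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /** Counts all pairwise distances into numbin equal-width bins over [min, max]. */
  static long[] histogram(double[][] data, int numbin) {
    // First pass: exact value range of the distances.
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < data.length; i++) {
      for (int j = i + 1; j < data.length; j++) {
        double d = distance(data[i], data[j]);
        min = Math.min(min, d);
        max = Math.max(max, d);
      }
    }
    // Second pass: assign each distance to a bin.
    long[] bins = new long[numbin];
    double width = (max - min) / numbin;
    for (int i = 0; i < data.length; i++) {
      for (int j = i + 1; j < data.length; j++) {
        double d = distance(data[i], data[j]);
        int bin = width > 0 ? (int) ((d - min) / width) : 0;
        if (bin >= numbin) {
          bin = numbin - 1; // the maximum distance belongs to the last bin
        }
        bins[bin]++;
      }
    }
    return bins;
  }

  public static void main(String[] args) {
    double[][] data = { { 0, 0 }, { 3, 4 }, { 6, 8 }, { 0, 1 } };
    System.out.println(Arrays.toString(histogram(data, 3)));
  }
}
```

The two-pass structure mirrors the role of the `exact` flag described in the field summary: the value range has to be known before distances can be assigned to bins.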
Modifier and Type | Class and Description |
---|---|
static class | DistanceStatisticsWithClasses.Parameterizer<O,D extends NumberDistance<D,?>>: Parameterization class. |
Modifier and Type | Field and Description |
---|---|
private boolean | exact: Exactness flag. |
static OptionID | EXACT_ID: Flag to compute exact value range for binning. |
static OptionID | HISTOGRAM_BINS_ID: Option to configure the number of bins to use. |
private static Logging | logger: The logger for this class. |
private int | numbin: Number of bins to use in sampling. |
private boolean | sampling: Sampling flag. |
static OptionID | SAMPLING_ID: Flag to enable sampling. |
Fields inherited from class AbstractDistanceBasedAlgorithm: DISTANCE_FUNCTION_ID
Constructor and Description |
---|
DistanceStatisticsWithClasses(DistanceFunction<? super O,D> distanceFunction, int numbins, boolean exact, boolean sampling): Constructor. |
Modifier and Type | Method and Description |
---|---|
private DoubleMinMax | exactMinMax(Relation<O> database, DistanceQuery<O,D> distFunc) |
TypeInformation[] | getInputTypeRestriction(): Get the input type restriction used for negotiating the data query. |
protected Logging | getLogger(): Get the (STATIC) logger for this class. |
HistogramResult<DoubleVector> | run(Database database): Iterates over all points in the database. |
private DoubleMinMax | sampleMinMax(Relation<O> database, DistanceQuery<O,D> distFunc) |
private void | shrinkHeap(TreeSet<FCPair<Double,DBID>> hotset, int k) |
Methods inherited from class AbstractDistanceBasedAlgorithm: getDistanceFunction, makeParameterDistanceFunction
private static final Logging logger
public static final OptionID EXACT_ID
public static final OptionID SAMPLING_ID
public static final OptionID HISTOGRAM_BINS_ID
private int numbin
private boolean sampling
private boolean exact
public DistanceStatisticsWithClasses(DistanceFunction<? super O,D> distanceFunction, int numbins, boolean exact, boolean sampling)
Parameters:
distanceFunction - Distance function to use
numbins - Number of bins
exact - Exactness flag
sampling - Sampling flag

public HistogramResult<DoubleVector> run(Database database) throws IllegalStateException
Specified by: run in interface Algorithm
Overrides: run in class AbstractAlgorithm<CollectionResult<DoubleVector>>
Parameters:
database - the database to run the algorithm on
Throws:
IllegalStateException - if the algorithm has not been initialized properly (e.g. the setParameters(String[]) method has not been called).

private DoubleMinMax sampleMinMax(Relation<O> database, DistanceQuery<O,D> distFunc)
private DoubleMinMax exactMinMax(Relation<O> database, DistanceQuery<O,D> distFunc)
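
The two range helpers above carry no description on this page. As a rough illustration of the trade-off that the `exact` and `sampling` flags suggest, the sketch below contrasts an exact scan over all pairs with an estimate taken from a random subset of objects. It is not the ELKI implementation: plain random subsampling is a swapped-in technique, and the class name `MinMaxSketch`, the one-dimensional data, the absolute-difference distance, and the `sampleSize` and `seed` parameters are all assumptions.

```java
import java.util.Random;

// Hypothetical stand-alone sketch; not part of ELKI.
class MinMaxSketch {
  /** Distance between two one-dimensional points. */
  static double dist(double a, double b) {
    return Math.abs(a - b);
  }

  /** Min and max distance over the pairs formed by the first m indexes in idx. */
  static double[] minMax(double[] data, int[] idx, int m) {
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < m; i++) {
      for (int j = i + 1; j < m; j++) {
        double d = dist(data[idx[i]], data[idx[j]]);
        min = Math.min(min, d);
        max = Math.max(max, d);
      }
    }
    return new double[] { min, max };
  }

  /** Exact range: scans all n * (n - 1) / 2 pairs. */
  static double[] exactMinMax(double[] data) {
    int[] idx = new int[data.length];
    for (int i = 0; i < idx.length; i++) {
      idx[i] = i;
    }
    return minMax(data, idx, idx.length);
  }

  /** Estimated range: scans only pairs among sampleSize randomly chosen objects. */
  static double[] sampleMinMax(double[] data, int sampleSize, long seed) {
    int[] idx = new int[data.length];
    for (int i = 0; i < idx.length; i++) {
      idx[i] = i;
    }
    int m = Math.min(sampleSize, idx.length);
    Random rnd = new Random(seed);
    // Partial Fisher-Yates shuffle: the first m entries become a uniform sample.
    for (int i = 0; i < m; i++) {
      int j = i + rnd.nextInt(idx.length - i);
      int tmp = idx[i];
      idx[i] = idx[j];
      idx[j] = tmp;
    }
    return minMax(data, idx, m);
  }

  public static void main(String[] args) {
    double[] data = { 1, 4, 9, 10, 20, 35 };
    double[] exact = exactMinMax(data);
    double[] approx = sampleMinMax(data, 3, 42L);
    System.out.println("exact:   [" + exact[0] + ", " + exact[1] + "]");
    System.out.println("sampled: [" + approx[0] + ", " + approx[1] + "]");
  }
}
```

A sampled range always lies inside the exact range, so binning code built on it has to cope with distances that fall outside the estimated interval.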
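public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.
Specified by: getInputTypeRestriction in interface Algorithm
Specified by: getInputTypeRestriction in class AbstractAlgorithm<CollectionResult<DoubleVector>>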
protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.
Specified by: getLogger in class AbstractAlgorithm<CollectionResult<DoubleVector>>