
Type Parameters:
O - Object type
D - Distance type

@Title(value="Distance Histogram")
@Description(value="Computes a histogram over the distances occurring in the data set.")
public class DistanceStatisticsWithClasses<O,D extends NumberDistance<D,?>>
extends AbstractDistanceBasedAlgorithm<O,D,CollectionResult<DoubleVector>>
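
To make the class description concrete, the following is a minimal, self-contained sketch of what a histogram over pairwise distances looks like. It is not ELKI code: the class name `DistanceHistogramSketch`, the use of plain `double[]` points, the Euclidean distance, and the equal-width binning are all assumptions made for illustration.

```java
/** Illustrative sketch only; not part of ELKI. */
final class DistanceHistogramSketch {
  /** Euclidean distance between two points of equal dimensionality. */
  static double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /** Counts all pairwise distances into numbins equal-width bins over [min, max]. */
  static long[] histogram(double[][] data, int numbins) {
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;
    // First pass: determine the exact value range of the distances.
    for (int i = 0; i < data.length; i++) {
      for (int j = i + 1; j < data.length; j++) {
        double d = distance(data[i], data[j]);
        min = Math.min(min, d);
        max = Math.max(max, d);
      }
    }
    long[] bins = new long[numbins];
    if (min > max) {
      return bins; // fewer than two points: no distances to count
    }
    double width = (max - min) / numbins;
    // Second pass: assign each distance to its bin.
    for (int i = 0; i < data.length; i++) {
      for (int j = i + 1; j < data.length; j++) {
        double d = distance(data[i], data[j]);
        int bin = width > 0 ? (int) ((d - min) / width) : 0;
        if (bin >= numbins) {
          bin = numbins - 1; // the maximum distance falls into the last bin
        }
        bins[bin]++;
      }
    }
    return bins;
  }
}
```

The two passes reflect why knowing the value range matters for binning: bin boundaries cannot be fixed until a minimum and maximum distance are known, either exactly or as an estimate.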

| Modifier and Type | Class | Description |
|---|---|---|
| static class | DistanceStatisticsWithClasses.Parameterizer<O,D extends NumberDistance<D,?>> | Parameterization class. |

| Modifier and Type | Field | Description |
|---|---|---|
| private boolean | exact | Compute exactly (slower). |
| static OptionID | EXACT_ID | Flag to compute exact value range for binning. |
| static OptionID | HISTOGRAM_BINS_ID | Option to configure the number of bins to use. |
| private static Logging | LOG | The logger for this class. |
| private int | numbin | Number of bins to use in sampling. |
| private boolean | sampling | Sampling flag. |
| static OptionID | SAMPLING_ID | Flag to enable sampling. |

Fields inherited from class AbstractDistanceBasedAlgorithm: DISTANCE_FUNCTION_ID

| Constructor | Description |
|---|---|
| DistanceStatisticsWithClasses(DistanceFunction<? super O,D> distanceFunction, int numbins, boolean exact, boolean sampling) | Constructor. |

| Modifier and Type | Method | Description |
|---|---|---|
| private DoubleMinMax | exactMinMax(Relation<O> relation, DistanceQuery<O,D> distFunc) | Compute the exact maximum and minimum. |
| TypeInformation[] | getInputTypeRestriction() | Get the input type restriction used for negotiating the data query. |
| protected Logging | getLogger() | Get the (STATIC) logger for this class. |
| HistogramResult<DoubleVector> | run(Database database) | Runs the algorithm. |
| private DoubleMinMax | sampleMinMax(Relation<O> relation, DistanceQuery<O,D> distFunc) | Estimate minimum and maximum via sampling. |
| private static void | shrinkHeap(TreeSet<DoubleDBIDPair> hotset, int k) | Shrink the heap of "hot" (extreme) items. |
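
The method summary distinguishes an exact computation of the distance range (`exactMinMax`) from an estimate obtained via sampling (`sampleMinMax`). The sketch below illustrates that trade-off in plain Java. It is an assumption-laden illustration, not ELKI's implementation: the class `MinMaxSketch`, the one-dimensional data, the absolute-difference distance, and the uniform random pair sampling are all hypothetical choices.

```java
import java.util.Random;

/** Illustrative sketch only; names and sampling strategy are assumptions, not ELKI code. */
final class MinMaxSketch {
  /** Exact range: visits every unordered pair once, i.e. O(n^2) distance computations. */
  static double[] exactMinMax(double[] data) {
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < data.length; i++) {
      for (int j = i + 1; j < data.length; j++) {
        double d = Math.abs(data[i] - data[j]);
        min = Math.min(min, d);
        max = Math.max(max, d);
      }
    }
    return new double[] { min, max };
  }

  /** Estimated range: evaluates only `samples` randomly drawn pairs of distinct objects. */
  static double[] sampleMinMax(double[] data, int samples, long seed) {
    if (data.length < 2) {
      throw new IllegalArgumentException("need at least two objects");
    }
    Random rnd = new Random(seed);
    double min = Double.POSITIVE_INFINITY;
    double max = Double.NEGATIVE_INFINITY;
    for (int s = 0; s < samples; s++) {
      int i = rnd.nextInt(data.length);
      int j = rnd.nextInt(data.length - 1);
      if (j >= i) {
        j++; // skip index i so the pair consists of two distinct objects
      }
      double d = Math.abs(data[i] - data[j]);
      min = Math.min(min, d);
      max = Math.max(max, d);
    }
    return new double[] { min, max };
  }
}
```

A range estimated this way always lies inside the exact range, so a histogram built on it has to cope with distances that fall outside the estimated bounds, for example by clamping them into the first or last bin.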

Methods inherited from class AbstractDistanceBasedAlgorithm: getDistanceFunction, makeParameterDistanceFunction

private static final Logging LOG
public static final OptionID EXACT_ID
public static final OptionID SAMPLING_ID
public static final OptionID HISTOGRAM_BINS_ID
private int numbin
private boolean sampling
private boolean exact

public DistanceStatisticsWithClasses(DistanceFunction<? super O,D> distanceFunction, int numbins, boolean exact, boolean sampling)
Parameters:
distanceFunction - Distance function to use
numbins - Number of bins
exact - Exactness flag
sampling - Sampling flag

public HistogramResult<DoubleVector> run(Database database)
Description copied from interface Algorithm: Runs the algorithm.
Specified by: run in interface Algorithm
Overrides: run in class AbstractAlgorithm<CollectionResult<DoubleVector>>
Parameters:
database - the database to run the algorithm on

private DoubleMinMax sampleMinMax(Relation<O> relation, DistanceQuery<O,D> distFunc)
Parameters:
relation - Relation to process
distFunc - Distance function to use

private DoubleMinMax exactMinMax(Relation<O> relation, DistanceQuery<O,D> distFunc)
Parameters:
relation - Relation to process
distFunc - Distance function

private static void shrinkHeap(TreeSet<DoubleDBIDPair> hotset, int k)
Parameters:
hotset - Set of hot items
k - target size

public TypeInformation[] getInputTypeRestriction()
Description copied from class AbstractAlgorithm: Get the input type restriction used for negotiating the data query.
Specified by: getInputTypeRestriction in interface Algorithm
Specified by: getInputTypeRestriction in class AbstractAlgorithm<CollectionResult<DoubleVector>>

protected Logging getLogger()
Description copied from class AbstractAlgorithm: Get the (STATIC) logger for this class.
Specified by: getLogger in class AbstractAlgorithm<CollectionResult<DoubleVector>>
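
The `shrinkHeap(TreeSet<DoubleDBIDPair> hotset, int k)` method documented above reduces a sorted set of "hot" items to a target size. The following sketch shows one plausible way to do this with a `java.util.TreeSet`. The documentation does not state which entries are retained or how duplicates are handled, so this sketch assumes that the first `k` entries in sort order with distinct object IDs are kept. `ScoredId` is a hypothetical stand-in for `DoubleDBIDPair`.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; retention policy and types are assumptions, not ELKI code. */
final class ShrinkHeapSketch {
  /** Hypothetical (score, object id) pair, ordered by score and then by id. */
  static final class ScoredId implements Comparable<ScoredId> {
    final double score;
    final int id;

    ScoredId(double score, int id) {
      this.score = score;
      this.id = id;
    }

    @Override
    public int compareTo(ScoredId other) {
      int c = Double.compare(score, other.score);
      return c != 0 ? c : Integer.compare(id, other.id);
    }
  }

  /** Keeps at most k entries, in ascending order, each with a distinct id. */
  static void shrink(TreeSet<ScoredId> hotset, int k) {
    Set<Integer> seen = new HashSet<>();
    for (Iterator<ScoredId> it = hotset.iterator(); it.hasNext();) {
      ScoredId p = it.next();
      // Remove the entry if the target size is reached or its id was already kept.
      if (seen.size() >= k || !seen.add(p.id)) {
        it.remove();
      }
    }
  }
}
```

Removing through the iterator avoids a `ConcurrentModificationException`, which modifying the `TreeSet` directly during iteration would trigger.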