de.lmu.ifi.dbs.elki.algorithm.outlier
Class ReferenceBasedOutlierDetection<V extends NumberVector<?,?>,D extends NumberDistance<D,?>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<OutlierResult>
      extended by de.lmu.ifi.dbs.elki.algorithm.outlier.ReferenceBasedOutlierDetection<V,D>
Type Parameters:
V - a type of NumberVector as a suitable data object for this algorithm
D - the distance type processed
All Implemented Interfaces:
Algorithm, OutlierAlgorithm, InspectionUtilFrequentlyScanned, Parameterizable

@Title(value="An Efficient Reference-based Approach to Outlier Detection in Large Datasets")
@Description(value="Computes kNN distances approximately, using reference points with various reference point strategies.")
@Reference(authors="Y. Pei, O.R. Zaiane, Y. Gao",
           title="An Efficient Reference-based Approach to Outlier Detection in Large Datasets",
           booktitle="Proc. IEEE Int. Conf. on Data Mining (ICDM \'06), Hong Kong, China, 2006",
           url="http://dx.doi.org/10.1109/ICDM.2006.17")
public class ReferenceBasedOutlierDetection<V extends NumberVector<?,?>,D extends NumberDistance<D,?>>
extends AbstractAlgorithm<OutlierResult>
implements OutlierAlgorithm

Provides the Reference-Based Outlier Detection algorithm, which computes kNN distances approximately by using reference points.

Reference:
Y. Pei, O. R. Zaiane, Y. Gao: An Efficient Reference-Based Approach to Outlier Detection in Large Datasets.
In: Proc. IEEE Int. Conf. on Data Mining (ICDM'06), Hong Kong, China, 2006.


Nested Class Summary
static class ReferenceBasedOutlierDetection.Parameterizer<V extends NumberVector<?,?>,D extends NumberDistance<D,?>>
          Parameterization class.
 
Field Summary
private  DistanceFunction<V,D> distanceFunction
          Distance function to use.
private  int k
          Holds the value of K_ID.
static OptionID K_ID
          Parameter to specify the number of nearest neighbors of an object to be considered for computing its REFOD_SCORE. Must be an integer greater than 1.
private static Logging logger
          The logger for this class.
private  ReferencePointsHeuristic<V> refp
          Stores the reference point strategy.
static OptionID REFP_ID
          Parameter for the reference points heuristic.
 
Constructor Summary
ReferenceBasedOutlierDetection(int k, DistanceFunction<V,D> distanceFunction, ReferencePointsHeuristic<V> refp)
          Constructor with parameters.
 
Method Summary
protected  double computeDensity(List<DistanceResultPair<D>> referenceDists, int index)
          Computes the density of an object.
protected  List<DistanceResultPair<D>> computeDistanceVector(V refPoint, Relation<V> database, DistanceQuery<V,D> distFunc)
          Computes for each object the distance to one reference point.
 TypeInformation[] getInputTypeRestriction()
          Get the input type restriction used for negotiating the data query.
protected  Logging getLogger()
          Get the (STATIC) logger for this class.
 OutlierResult run(Relation<V> relation)
          Run the algorithm on the given relation.
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
makeParameterDistanceFunction, run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.outlier.OutlierAlgorithm
run
 

Field Detail

logger

private static final Logging logger
The logger for this class.


REFP_ID

public static final OptionID REFP_ID
Parameter for the reference points heuristic.


K_ID

public static final OptionID K_ID
Parameter to specify the number of nearest neighbors of an object to be considered for computing its REFOD_SCORE. Must be an integer greater than 1.


k

private int k
Holds the value of K_ID.


refp

private ReferencePointsHeuristic<V extends NumberVector<?,?>> refp
Stores the reference point strategy.


distanceFunction

private DistanceFunction<V extends NumberVector<?,?>,D extends NumberDistance<D,?>> distanceFunction
Distance function to use.

Constructor Detail

ReferenceBasedOutlierDetection

public ReferenceBasedOutlierDetection(int k,
                                      DistanceFunction<V,D> distanceFunction,
                                      ReferencePointsHeuristic<V> refp)
Constructor with parameters.

Parameters:
k - number of nearest neighbors to consider
distanceFunction - distance function
refp - Reference points heuristic
Method Detail

run

public OutlierResult run(Relation<V> relation)
Run the algorithm on the given relation.

Parameters:
relation - Relation to process
Returns:
Outlier result
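
The overall flow can be illustrated with a minimal, self-contained sketch. It does not use the ELKI classes above: the class ReferenceOutlierScoreSketch, its methods, the Euclidean distance, and the normalization 1 - density / maxDensity are assumptions based on the cited paper, not a copy of this class's implementation. For every reference point the objects are ordered by their distance to it, an approximate density is computed per object, the minimum density over all reference points is kept, and the result is scaled so that sparse objects score close to 1.

```java
import java.util.Arrays;

final class ReferenceOutlierScoreSketch {
  /** Euclidean distance between two vectors of equal length. */
  static double euclidean(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }

  /** Approximate density at position pos of an ascending reference distance vector. */
  static double density(double[] sorted, int pos, int k) {
    double center = sorted[pos];
    int left = pos - 1;
    int right = pos + 1;
    double sum = 0.0;
    for (int found = 0; found < k; found++) {
      double leftGap = left >= 0 ? center - sorted[left] : Double.POSITIVE_INFINITY;
      double rightGap = right < sorted.length ? sorted[right] - center : Double.POSITIVE_INFINITY;
      if (leftGap <= rightGap) {
        sum += leftGap;
        left--;
      } else {
        sum += rightGap;
        right++;
      }
    }
    // Reciprocal of the mean gap; coinciding distances (sum == 0) are not handled here.
    return 1.0 / (sum / k);
  }

  /** Hypothetical scoring: minimum density over all reference points, scaled to [0, 1]. */
  static double[] scores(double[][] data, double[][] refPoints, int k) {
    int n = data.length;
    if (refPoints.length == 0 || k < 1 || k >= n) {
      throw new IllegalArgumentException("need a reference point and 1 <= k < n");
    }
    double[] minDensity = new double[n];
    Arrays.fill(minDensity, Double.POSITIVE_INFINITY);
    for (double[] ref : refPoints) {
      double[] dist = new double[n];
      Integer[] order = new Integer[n];
      for (int i = 0; i < n; i++) {
        dist[i] = euclidean(data[i], ref);
        order[i] = i;
      }
      // Order the object ids by their distance to this reference point.
      Arrays.sort(order, (a, b) -> Double.compare(dist[a], dist[b]));
      double[] sorted = new double[n];
      for (int pos = 0; pos < n; pos++) {
        sorted[pos] = dist[order[pos]];
      }
      for (int pos = 0; pos < n; pos++) {
        int id = order[pos];
        minDensity[id] = Math.min(minDensity[id], density(sorted, pos, k));
      }
    }
    double maxDensity = 0.0;
    for (double d : minDensity) {
      maxDensity = Math.max(maxDensity, d);
    }
    double[] result = new double[n];
    for (int i = 0; i < n; i++) {
      result[i] = 1.0 - minDensity[i] / maxDensity;
    }
    return result;
  }
}
```

In this sketch, calling scores on the points 0, 1, 2, 3 and 20 with reference points 0 and 20 and k = 2 gives the isolated point 20 the highest score.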

computeDistanceVector

protected List<DistanceResultPair<D>> computeDistanceVector(V refPoint,
                                                            Relation<V> database,
                                                            DistanceQuery<V,D> distFunc)
Computes the distance from each object to one reference point. The result is a one-dimensional representation of the data set.

Parameters:
refPoint - Reference Point Feature Vector
database - database to work on
distFunc - Distance function to use
Returns:
list containing, for each database object, the object id and its distance to the reference point
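
As an illustration of this step, the following minimal sketch builds such a distance vector in plain Java. The names ReferenceDistanceSketch and IdDistance are hypothetical stand-ins for the ELKI types, the Euclidean distance replaces the configurable distance function, and sorting the list by distance is an assumption made so that neighbors in the vector can be found by walking it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ReferenceDistanceSketch {
  /** Hypothetical stand-in for an (object id, distance) pair. */
  static final class IdDistance {
    final int id;
    final double distance;

    IdDistance(int id, double distance) {
      this.id = id;
      this.distance = distance;
    }
  }

  /** Euclidean distance between two vectors of equal length. */
  static double euclidean(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }

  /** Distance of every object to refPoint, sorted ascending by distance. */
  static List<IdDistance> computeDistanceVector(double[] refPoint, double[][] data) {
    List<IdDistance> result = new ArrayList<>(data.length);
    for (int id = 0; id < data.length; id++) {
      result.add(new IdDistance(id, euclidean(refPoint, data[id])));
    }
    result.sort(Comparator.comparingDouble((IdDistance p) -> p.distance));
    return result;
  }
}
```

Each object is reduced to a single number per reference point, which is why the vector acts as a one-dimensional view of the data set.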

computeDensity

protected double computeDensity(List<DistanceResultPair<D>> referenceDists,
                                int index)
Computes the density of an object. The density is derived from the distances to the object's k nearest neighbors. Both the neighbors and the distances are computed approximately: instead of a regular nearest-neighbor search, the nearest neighbors of an object are taken to be those objects that have a similar distance to a reference point, that is, the objects that lie close to it in the reference distance vector.

Parameters:
referenceDists - vector of the reference distances
index - index of the current object
Returns:
density for one object and reference point
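
The approximation can be sketched as follows. This is an illustrative plain-Java version, not the ELKI code: the class ReferenceDensitySketch is hypothetical, the input is assumed to be an ascending array of distances to one reference point, and the density is assumed to be the reciprocal of the mean gap to the k closest entries, following the cited paper.

```java
final class ReferenceDensitySketch {
  /**
   * Approximate density of the object at position index in an ascending
   * vector of distances to one reference point.
   */
  static double computeDensity(double[] sortedDists, int index, int k) {
    if (k < 1 || k >= sortedDists.length) {
      throw new IllegalArgumentException("need 1 <= k < number of objects");
    }
    double center = sortedDists[index];
    int left = index - 1;
    int right = index + 1;
    double sum = 0.0;
    // Expand to the left or right, always taking the entry with the smaller gap.
    for (int found = 0; found < k; found++) {
      double leftGap = left >= 0 ? center - sortedDists[left] : Double.POSITIVE_INFINITY;
      double rightGap = right < sortedDists.length ? sortedDists[right] - center : Double.POSITIVE_INFINITY;
      if (leftGap <= rightGap) {
        sum += leftGap;
        left--;
      } else {
        sum += rightGap;
        right++;
      }
    }
    // Reciprocal of the mean gap; evaluates to Infinity if all k gaps are zero.
    return 1.0 / (sum / k);
  }
}
```

Because the vector is sorted, the k approximate neighbors are found by scanning outward from the object's position rather than by searching the full data space.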

getInputTypeRestriction

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.

Specified by:
getInputTypeRestriction in interface Algorithm
Specified by:
getInputTypeRestriction in class AbstractAlgorithm<OutlierResult>
Returns:
Type restriction

getLogger

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.

Specified by:
getLogger in class AbstractAlgorithm<OutlierResult>
Returns:
the static logger

Release 0.4.0 (2011-09-20_1324)