de.lmu.ifi.dbs.elki.algorithm.outlier
Class LOF<O,D extends NumberDistance<D,?>>

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<OutlierResult>
      extended by de.lmu.ifi.dbs.elki.algorithm.outlier.LOF<O,D>
Type Parameters:
O - the type of DatabaseObjects handled by this Algorithm
D - Distance type
All Implemented Interfaces:
Algorithm, OutlierAlgorithm, InspectionUtilFrequentlyScanned, Parameterizable
Direct Known Subclasses:
OnlineLOF

@Title(value="LOF: Local Outlier Factor")
@Description(value="Algorithm to compute density-based local outlier factors in a database based on the neighborhood size parameter \'k\'")
@Reference(authors="M. M. Breunig, H.-P. Kriegel, R. Ng, and J. Sander",
           title="LOF: Identifying Density-Based Local Outliers",
           booktitle="Proc. 2nd ACM SIGMOD Int. Conf. on Management of Data (SIGMOD \'00), Dallas, TX, 2000",
           url="http://dx.doi.org/10.1145/342009.335388")
public class LOF<O,D extends NumberDistance<D,?>>
extends AbstractAlgorithm<OutlierResult>
implements OutlierAlgorithm

Algorithm to compute density-based local outlier factors in a database based on a specified parameter K_ID (-lof.k).

This implementation diverts from the original LOF publication in that it allows the user to use a different distance function for the reachability distance and neighborhood determination (although the default is to use the same value.)

The k nearest neighbors are determined using the parameter AbstractDistanceBasedAlgorithm.DISTANCE_FUNCTION_ID , while the reference set used in reachability distance computation is configured using REACHABILITY_DISTANCE_FUNCTION_ID.

The original LOF parameter was called "minPts". Since kNN queries in ELKI have slightly different semantics - exactly k neighbors are returned - we chose to rename the parameter to K_ID (-lof.k) to reflect this difference.

Reference:
M. M. Breunig, H.-P. Kriegel, R. Ng, J. Sander: LOF: Identifying Density-Based Local Outliers.
In: Proc. 2nd ACM SIGMOD Int. Conf. on Management of Data (SIGMOD'00), Dallas, TX, 2000.


Nested Class Summary
static class LOF.LOFResult<O,D extends NumberDistance<D,?>>
          Encapsulates information like the neighborhood, the LRD and LOF values of the objects during a run of the LOF algorithm.
static class LOF.Parameterizer<O,D extends NumberDistance<D,?>>
          Parameterization class.
 
Field Summary
protected  int k
          Holds the value of K_ID.
static OptionID K_ID
          Parameter to specify the number of nearest neighbors of an object to be considered for computing its LOF_SCORE, must be an integer greater than 1.
private static Logging logger
          The logger for this class.
protected  DistanceFunction<? super O,D> neighborhoodDistanceFunction
          Neighborhood distance function.
private static boolean objectIsInKNN
          Include object itself in kNN neighborhood.
static OptionID REACHABILITY_DISTANCE_FUNCTION_ID
          The distance function to determine the reachability distance between database objects.
protected  DistanceFunction<? super O,D> reachabilityDistanceFunction
          Reachability distance function.
 
Constructor Summary
LOF(int k, DistanceFunction<? super O,D> neighborhoodDistanceFunction, DistanceFunction<? super O,D> reachabilityDistanceFunction)
          Constructor.
 
Method Summary
protected  Pair<WritableDataStore<Double>,DoubleMinMax> computeLOFs(DBIDs ids, DataStore<Double> lrds, KNNQuery<O,D> knnRefer)
          Computes the Local outlier factor (LOF) of the specified objects.
protected  WritableDataStore<Double> computeLRDs(DBIDs ids, KNNQuery<O,D> knnReach)
          Computes the local reachability density (LRD) of the specified objects.
protected  LOF.LOFResult<O,D> doRunInTime(KNNQuery<O,D> kNNRefer, KNNQuery<O,D> kNNReach, StepProgress stepprog)
          Performs the Generalized LOF_SCORE algorithm on the given database and returns a LOF.LOFResult encapsulating information that may be needed by an OnlineLOF algorithm.
 TypeInformation[] getInputTypeRestriction()
          Get the input type restriction used for negotiating the data query.
private  Pair<KNNQuery<O,D>,KNNQuery<O,D>> getKNNQueries(Relation<O> relation, StepProgress stepprog)
          Get the kNN queries for the algorithm.
protected  Logging getLogger()
          Get the (STATIC) logger for this class.
 OutlierResult run(Relation<O> relation)
          Performs the Generalized LOF_SCORE algorithm on the given database by calling #doRunInTime(Database).
 
Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
makeParameterDistanceFunction, run
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.algorithm.outlier.OutlierAlgorithm
run
 

Field Detail

logger

private static final Logging logger
The logger for this class.


REACHABILITY_DISTANCE_FUNCTION_ID

public static final OptionID REACHABILITY_DISTANCE_FUNCTION_ID
The distance function to determine the reachability distance between database objects.


K_ID

public static final OptionID K_ID
Parameter to specify the number of nearest neighbors of an object to be considered for computing its LOF_SCORE, must be an integer greater than 1.


k

protected int k
Holds the value of K_ID.


neighborhoodDistanceFunction

protected DistanceFunction<? super O,D extends NumberDistance<D,?>> neighborhoodDistanceFunction
Neighborhood distance function.


reachabilityDistanceFunction

protected DistanceFunction<? super O,D extends NumberDistance<D,?>> reachabilityDistanceFunction
Reachability distance function.


objectIsInKNN

private static boolean objectIsInKNN
Include object itself in kNN neighborhood. In the official LOF publication, the point itself is not considered to be part of its k nearest neighbors.

Constructor Detail

LOF

public LOF(int k,
           DistanceFunction<? super O,D> neighborhoodDistanceFunction,
           DistanceFunction<? super O,D> reachabilityDistanceFunction)
Constructor.

Parameters:
k - the value of k
neighborhoodDistanceFunction - the neighborhood distance function
reachabilityDistanceFunction - the reachability distance function
Method Detail

run

public OutlierResult run(Relation<O> relation)
Performs the Generalized LOF_SCORE algorithm on the given database by calling #doRunInTime(Database).

Parameters:
relation - Data to process

getKNNQueries

private Pair<KNNQuery<O,D>,KNNQuery<O,D>> getKNNQueries(Relation<O> relation,
                                                        StepProgress stepprog)
Get the kNN queries for the algorithm.

Parameters:
relation - the data
stepprog - the progress logger
Returns:
the kNN queries for the algorithm

doRunInTime

protected LOF.LOFResult<O,D> doRunInTime(KNNQuery<O,D> kNNRefer,
                                         KNNQuery<O,D> kNNReach,
                                         StepProgress stepprog)
                                                              throws IllegalStateException
Performs the Generalized LOF_SCORE algorithm on the given database and returns a LOF.LOFResult encapsulating information that may be needed by an OnlineLOF algorithm.

Parameters:
kNNRefer - the kNN query w.r.t. reference neighborhood distance function
kNNReach - the kNN query w.r.t. reachability distance function
Throws:
IllegalStateException

computeLRDs

protected WritableDataStore<Double> computeLRDs(DBIDs ids,
                                                KNNQuery<O,D> knnReach)
Computes the local reachability density (LRD) of the specified objects.

Parameters:
ids - the ids of the objects
knnReach - the precomputed neighborhood of the objects w.r.t. the reachability distance
Returns:
the LRDs of the objects

computeLOFs

protected Pair<WritableDataStore<Double>,DoubleMinMax> computeLOFs(DBIDs ids,
                                                                   DataStore<Double> lrds,
                                                                   KNNQuery<O,D> knnRefer)
Computes the Local outlier factor (LOF) of the specified objects.

Parameters:
ids - the ids of the objects
lrds - the LRDs of the objects
knnRefer - the precomputed neighborhood of the objects w.r.t. the reference distance
Returns:
the LOFs of the objects and the maximum LOF

getInputTypeRestriction

public TypeInformation[] getInputTypeRestriction()
Description copied from class: AbstractAlgorithm
Get the input type restriction used for negotiating the data query.

Specified by:
getInputTypeRestriction in interface Algorithm
Specified by:
getInputTypeRestriction in class AbstractAlgorithm<OutlierResult>
Returns:
Type restriction

getLogger

protected Logging getLogger()
Description copied from class: AbstractAlgorithm
Get the (STATIC) logger for this class.

Specified by:
getLogger in class AbstractAlgorithm<OutlierResult>
Returns:
the static logger

Release 0.4.0 (2011-09-20_1324)