de.lmu.ifi.dbs.elki.datasource.filter
Class InverseDocumentFrequencyNormalization

java.lang.Object
  extended by de.lmu.ifi.dbs.elki.datasource.filter.AbstractConversionFilter<O,O>
      extended by de.lmu.ifi.dbs.elki.datasource.filter.AbstractNormalization<SparseFloatVector>
          extended by de.lmu.ifi.dbs.elki.datasource.filter.InverseDocumentFrequencyNormalization
All Implemented Interfaces:
Normalization<SparseFloatVector>, ObjectFilter, InspectionUtilFrequentlyScanned, Parameterizable
Direct Known Subclasses:
TFIDFNormalization

public class InverseDocumentFrequencyNormalization
extends AbstractNormalization<SparseFloatVector>

Normalization for text frequency vectors, using the inverse document frequency.
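
As a rough illustration of inverse document frequency weighting, the sketch below counts in how many documents each dimension is non-zero and then scales every value by log(N / df). It is not taken from the ELKI sources: the class IdfSketch, its method names, the use of plain maps instead of SparseFloatVector, and the exact log(N / df) formula are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of ELKI. Sparse vectors are modeled as
// Map<dimension, value> to keep the sketch self-contained.
final class IdfSketch {
  /** First pass: document frequency per dimension, converted to log(N / df). */
  static Map<Integer, Double> computeIdf(List<Map<Integer, Float>> docs) {
    Map<Integer, Integer> df = new HashMap<>();
    for (Map<Integer, Float> doc : docs) {
      for (Map.Entry<Integer, Float> e : doc.entrySet()) {
        if (e.getValue() != 0f) {
          // Each document counts at most once per dimension.
          df.merge(e.getKey(), 1, Integer::sum);
        }
      }
    }
    final double n = docs.size(); // double, to avoid integer division below
    Map<Integer, Double> idf = new HashMap<>();
    for (Map.Entry<Integer, Integer> e : df.entrySet()) {
      idf.put(e.getKey(), Math.log(n / e.getValue()));
    }
    return idf;
  }

  /** Second pass: multiply each term frequency by the IDF of its dimension. */
  static Map<Integer, Float> normalize(Map<Integer, Float> doc, Map<Integer, Double> idf) {
    Map<Integer, Float> out = new HashMap<>();
    for (Map.Entry<Integer, Float> e : doc.entrySet()) {
      Double w = idf.get(e.getKey());
      if (w != null) { // dimensions never seen in the first pass are dropped
        out.put(e.getKey(), (float) (e.getValue() * w));
      }
    }
    return out;
  }
}
```

Under this formula, a dimension that is non-zero in every document gets weight log(1) = 0, so it vanishes from the normalized vector.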


Field Summary
(package private)  Map<Integer,Number> idf
          The IDF storage
(package private)  int objcnt
          The number of objects in the dataset
 
Constructor Summary
InverseDocumentFrequencyNormalization()
          Constructor.
 
Method Summary
protected  SparseFloatVector filterSingleObject(SparseFloatVector featureVector)
          Normalize a single instance.
protected  SimpleTypeInformation<? super SparseFloatVector> getInputTypeRestriction()
          Get the input type restriction used for negotiating the data query.
protected  void prepareComplete()
          Complete the initialization phase.
protected  void prepareProcessInstance(SparseFloatVector featureVector)
          Process a single object during initialization.
protected  boolean prepareStart(SimpleTypeInformation<SparseFloatVector> in)
          Returns true if the normalization needs an initialization pass over the data (two-pass filtering).
 SparseFloatVector restore(SparseFloatVector featureVector)
          Transforms a feature vector to the original attribute ranges.
 
Methods inherited from class de.lmu.ifi.dbs.elki.datasource.filter.AbstractNormalization
convertedType, normalizeObjects, toString, transform
 
Methods inherited from class de.lmu.ifi.dbs.elki.datasource.filter.AbstractConversionFilter
filter
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface de.lmu.ifi.dbs.elki.datasource.filter.ObjectFilter
filter
 

Field Detail

idf

Map<Integer,Number> idf
The IDF storage


objcnt

int objcnt
The number of objects in the dataset

Constructor Detail

InverseDocumentFrequencyNormalization

public InverseDocumentFrequencyNormalization()
Constructor.

Method Detail

prepareStart

protected boolean prepareStart(SimpleTypeInformation<SparseFloatVector> in)
Description copied from class: AbstractConversionFilter
Returns true if the normalization needs an initialization pass over the data (two-pass filtering).

Overrides:
prepareStart in class AbstractConversionFilter<SparseFloatVector,SparseFloatVector>
Parameters:
in - Input type information
Returns:
true if an initialization pass is required, false otherwise

prepareProcessInstance

protected void prepareProcessInstance(SparseFloatVector featureVector)
Description copied from class: AbstractConversionFilter
Process a single object during initialization.

Overrides:
prepareProcessInstance in class AbstractConversionFilter<SparseFloatVector,SparseFloatVector>
Parameters:
featureVector - Object to process

prepareComplete

protected void prepareComplete()
Description copied from class: AbstractConversionFilter
Complete the initialization phase.

Overrides:
prepareComplete in class AbstractConversionFilter<SparseFloatVector,SparseFloatVector>
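
The three prepare hooks above, together with filterSingleObject, suggest a two-pass life cycle: an optional initialization pass over all objects, followed by a conversion pass. The following minimal sketch shows one way such a driver could call the hooks. The classes TwoPassFilterSketch and MaxScaleSketch are hypothetical, use a plain List instead of ELKI's data bundles, and the call order is an assumption inferred from the method descriptions, not a copy of AbstractConversionFilter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver, not the ELKI implementation.
abstract class TwoPassFilterSketch<T> {
  protected abstract boolean prepareStart();
  protected abstract void prepareProcessInstance(T obj);
  protected abstract void prepareComplete();
  protected abstract T filterSingleObject(T obj);

  final List<T> filter(List<T> objects) {
    // Pass 1 (optional): let the filter collect statistics.
    if (prepareStart()) {
      for (T obj : objects) {
        prepareProcessInstance(obj);
      }
      prepareComplete();
    }
    // Pass 2: convert every object using the collected statistics.
    List<T> out = new ArrayList<>(objects.size());
    for (T obj : objects) {
      out.add(filterSingleObject(obj));
    }
    return out;
  }
}

// Toy filter for illustration: divides every value by the largest absolute value seen.
final class MaxScaleSketch extends TwoPassFilterSketch<Double> {
  private double max = 0.0;

  @Override
  protected boolean prepareStart() {
    return true; // statistics are needed, so request the first pass
  }

  @Override
  protected void prepareProcessInstance(Double obj) {
    max = Math.max(max, Math.abs(obj));
  }

  @Override
  protected void prepareComplete() {
    if (max == 0.0) {
      max = 1.0; // avoid division by zero for all-zero input
    }
  }

  @Override
  protected Double filterSingleObject(Double obj) {
    return obj / max;
  }
}
```

An IDF normalization fits this pattern: document frequencies would be counted in prepareProcessInstance, turned into weights in prepareComplete, and applied in filterSingleObject.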

filterSingleObject

protected SparseFloatVector filterSingleObject(SparseFloatVector featureVector)
Description copied from class: AbstractConversionFilter
Normalize a single instance. An implementation may throw UnsupportedOperationException here if it overrides both public "normalize" methods.

Specified by:
filterSingleObject in class AbstractConversionFilter<SparseFloatVector,SparseFloatVector>
Parameters:
featureVector - Database object to normalize
Returns:
Normalized database object

restore

public SparseFloatVector restore(SparseFloatVector featureVector)
Description copied from interface: Normalization
Transforms a feature vector to the original attribute ranges.

Parameters:
featureVector - a feature vector to be transformed into original space
Returns:
a feature vector transformed into original space corresponding to the given feature vector
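
For an IDF-weighted vector, restoring the original values would plausibly mean dividing each entry by the IDF weight of its dimension. The sketch below illustrates that idea under stated assumptions: IdfRestoreSketch is a hypothetical class, vectors are plain maps rather than SparseFloatVector, and dimensions with a missing or zero weight are skipped because their original value cannot be recovered. Whether ELKI handles these cases the same way is not stated on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of ELKI.
final class IdfRestoreSketch {
  /** Undo an IDF scaling by dividing each value by its dimension's weight. */
  static Map<Integer, Float> restore(Map<Integer, Float> normalized, Map<Integer, Double> idf) {
    Map<Integer, Float> out = new HashMap<>();
    for (Map.Entry<Integer, Float> e : normalized.entrySet()) {
      Double w = idf.get(e.getKey());
      if (w == null || w == 0.0) {
        continue; // value was multiplied by zero or never weighted: not recoverable
      }
      out.put(e.getKey(), (float) (e.getValue() / w));
    }
    return out;
  }
}
```

Because of float rounding, a restored value may differ slightly from the original term frequency.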

getInputTypeRestriction

protected SimpleTypeInformation<? super SparseFloatVector> getInputTypeRestriction()
Description copied from class: AbstractConversionFilter
Get the input type restriction used for negotiating the data query.

Specified by:
getInputTypeRestriction in class AbstractConversionFilter<SparseFloatVector,SparseFloatVector>
Returns:
Type restriction

Release 0.4.0 (2011-09-20_1324)