FPGrowth (ELKI: Environment for DeveLoping KDD-Applications Supported by Index-Structures)

java.lang.Object
- de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm<FrequentItemsetsResult>
  - de.lmu.ifi.dbs.elki.algorithm.itemsetmining.AbstractFrequentItemsetAlgorithm
    - de.lmu.ifi.dbs.elki.algorithm.itemsetmining.FPGrowth

All Implemented Interfaces:

Algorithm
```
@Reference(authors="J. Han, J. Pei, Y. Yin",
           title="Mining frequent patterns without candidate generation",
           booktitle="Proc. ACM SIGMOD Int. Conf. Management of Data (SIGMOD 2000)",
           url="https://doi.org/10.1145/342009.335372",
           bibkey="DBLP:conf/sigmod/HanPY00")
@Priority(value=200)
public class FPGrowth
extends AbstractFrequentItemsetAlgorithm
```
FP-Growth is an algorithm for mining frequent itemsets using a compressed representation of the database, the FP-tree (FPGrowth.FPTree).
FP-Growth first sorts items by their overall frequency: placing the most frequent items first yields a much smaller tree, because frequent subsets then tend to share the same path. FP-Growth is beneficial when the data contains many (near-)duplicate transactions and the support threshold is not too high, as it only prunes single items, not item combinations.
This implementation is in-memory only, and has not yet been carefully optimized.
The worst-case memory use is probably \(O(\min(n\cdot l,i^l))\), where i is the number of items, l the average itemset length, and n the number of transactions. The worst case occurs when every item is frequent and every transaction is unique; the resulting tree will then be larger than the original data.
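The effect of the frequency ordering can be illustrated with a minimal, self-contained sketch. It is not the ELKI implementation: the class `FPTreeSketch`, its `Node` type, and the `build` and `insert` methods are hypothetical names, and transactions are plain `int[]` item lists rather than bit vectors. The sketch counts item support, drops infrequent items, sorts each transaction by item frequency, and inserts it so that common prefixes share nodes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative FP-tree sketch (assumption: NOT the ELKI FPGrowth.FPTree). */
final class FPTreeSketch {
  /** One tree node: an item and the number of transactions passing through it. */
  static final class Node {
    final int item;
    int count;
    final Map<Integer, Node> children = new HashMap<>();

    Node(int item) {
      this.item = item;
    }
  }

  final Node root = new Node(-1); // the root carries no item
  int nodes = 0; // number of non-root nodes created so far

  /** Insert one transaction whose items are already sorted in tree order. */
  void insert(int[] ordered) {
    Node cur = root;
    for (int item : ordered) {
      Node child = cur.children.get(item);
      if (child == null) { // no shared prefix from here on: grow the tree
        child = new Node(item);
        cur.children.put(item, child);
        nodes++;
      }
      child.count++;
      cur = child;
    }
  }

  /** Build a tree; descending = true places the most frequent items first. */
  static FPTreeSketch build(int[][] transactions, int numItems, int minsupp, boolean descending) {
    // Pass 1: support of every single item.
    int[] support = new int[numItems];
    for (int[] t : transactions) {
      for (int item : t) {
        support[item]++;
      }
    }
    // Item order: by support, ties broken by item id to keep it deterministic.
    Comparator<Integer> bySupport = Comparator.comparingInt(i -> support[i]);
    if (descending) {
      bySupport = bySupport.reversed();
    }
    Comparator<Integer> order = bySupport.thenComparing(Comparator.<Integer>naturalOrder());
    // Pass 2: keep frequent items only, sort them, and insert the transaction.
    FPTreeSketch tree = new FPTreeSketch();
    for (int[] t : transactions) {
      List<Integer> kept = new ArrayList<>();
      for (int item : t) {
        if (support[item] >= minsupp) {
          kept.add(item);
        }
      }
      kept.sort(order);
      tree.insert(kept.stream().mapToInt(Integer::intValue).toArray());
    }
    return tree;
  }

  public static void main(String[] args) {
    int[][] tx = { { 0, 1 }, { 0, 2 }, { 0, 3 } };
    System.out.println("most frequent first:  " + build(tx, 4, 1, true).nodes + " nodes");
    System.out.println("least frequent first: " + build(tx, 4, 1, false).nodes + " nodes");
  }
}
```

For the three transactions {0,1}, {0,2}, {0,3}, placing the frequent item 0 first gives a tree with 4 nodes, because all three paths share the node for item 0. The reverse order gives 6 nodes, because no prefix is shared.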
Reference:
J. Han, J. Pei, Y. Yin
Mining frequent patterns without candidate generation
In Proc. ACM SIGMOD Int. Conf. Management of Data (SIGMOD 2000)

Since:

0.7.0

Author:

Erich Schubert

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`FPGrowth.FPNode` A single node of the FP tree.
`static class`	`FPGrowth.FPTree` FP-Tree data structure
`static class`	`FPGrowth.Parameterizer` Parameterization class.

Field Summary

Fields
Modifier and Type	Field and Description
`private static Logging`	`LOG` Class logger.
`private static java.lang.String`	`STAT` Prefix for statistics.

Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.itemsetmining.AbstractFrequentItemsetAlgorithm
maxlength, minlength

Fields inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
ALGORITHM_ID

Constructor Summary

Constructors
Constructor and Description
`FPGrowth(double minsupp, int minlength, int maxlength)` Constructor.

Method Summary

Methods
Modifier and Type	Method and Description
`private FPGrowth.FPTree`	`buildFPTree(Relation<BitVector> relation, int[] iidx, int items)` Build the actual FP-tree structure.
`private int[]`	`buildIndex(int[] counts, int[] positions, int minsupp)` Build a forward map, item id (dimension) to frequency position.
`private int[]`	`countItemSupport(Relation<BitVector> relation, int dim)` Count the support of each 1-item.
`TypeInformation[]`	`getInputTypeRestriction()` Get the input type restriction used for negotiating the data query.
`protected Logging`	`getLogger()` Get the (STATIC) logger for this class.
`FrequentItemsetsResult`	`run(Database db, Relation<BitVector> relation)` Run the FP-Growth algorithm.

Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.itemsetmining.AbstractFrequentItemsetAlgorithm
getMinimumSupport

Methods inherited from class de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm
run

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - LOG
```
private static final Logging LOG
```
    Class logger.
  - STAT
```
private static final java.lang.String STAT
```
    Prefix for statistics.
- Constructor Detail
  - FPGrowth
```
public FPGrowth(double minsupp,
                int minlength,
                int maxlength)
```
    Constructor.
    
    Parameters:
    
    minsupp - Minimum support (relative or absolute)
    
    minlength - Minimum length
    
    maxlength - Maximum length
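    
    The phrase "relative or absolute" can be read as follows; this is a sketch of one plausible convention, not a statement of what the inherited getMinimumSupport actually does. It assumes that values below 1 are a fraction of the number of transactions and that larger values are absolute counts. The class `MinSupportSketch` and the method `resolve` are hypothetical names.

```java
/** Illustrative only: one way to turn "relative or absolute" support into a count. */
final class MinSupportSketch {
  /**
   * Resolve a support threshold to an absolute transaction count.
   * Assumption: minsupp below 1 is a fraction of numTransactions (rounded up),
   * anything else is already an absolute count.
   */
  static int resolve(double minsupp, int numTransactions) {
    return (int) (minsupp < 1.0 ? Math.ceil(minsupp * numTransactions) : minsupp);
  }

  public static void main(String[] args) {
    System.out.println(resolve(0.25, 10)); // relative threshold, rounded up
    System.out.println(resolve(3, 10)); // absolute threshold, used as-is
  }
}
```

    Under this convention, a threshold of 0.25 on 10 transactions resolves to 3 transactions, and a threshold of 3 stays 3.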
- Method Detail
  - run
```
public FrequentItemsetsResult run(Database db,
                                  Relation<BitVector> relation)
```
    Run the FP-Growth algorithm.
    
    Parameters:
    
    db - Database to process
    
    relation - Bit vector relation
    
    Returns:
    
    Frequent patterns found
  - countItemSupport
```
private int[] countItemSupport(Relation<BitVector> relation,
                               int dim)
```
    Count the support of each 1-item.
    
    Parameters:
    
    relation - Data
    
    dim - Maximum dimensionality
    
    Returns:
    
    Item counts
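    
    Counting the support of each 1-item amounts to one pass over the data that increments a counter per set bit. The sketch below shows the idea under stated assumptions: `java.util.BitSet` stands in for ELKI's BitVector, a `List` stands in for the relation, and `ItemSupportSketch` is a hypothetical class name.

```java
import java.util.BitSet;
import java.util.List;

/** Illustrative only: per-item support counting over bit-set transactions. */
final class ItemSupportSketch {
  /**
   * Count in how many transactions each item (bit position) occurs.
   * Bits at positions >= dim are ignored.
   */
  static int[] countItemSupport(List<BitSet> transactions, int dim) {
    int[] counts = new int[dim];
    for (BitSet t : transactions) {
      // Iterate over the set bits only, which is cheap for sparse transactions.
      for (int i = t.nextSetBit(0); i >= 0 && i < dim; i = t.nextSetBit(i + 1)) {
        counts[i]++;
      }
    }
    return counts;
  }
}
```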
  - buildFPTree
```
private FPGrowth.FPTree buildFPTree(Relation<BitVector> relation,
                                    int[] iidx,
                                    int items)
```
    Build the actual FP-tree structure.
    
    Parameters:
    
    relation - Data
    
    iidx - Inverse index (dimension to item rank)
    
    items - Number of items
    
    Returns:
    
    FP-tree
  - buildIndex
```
private int[] buildIndex(int[] counts,
                         int[] positions,
                         int minsupp)
```
    Build a forward map, item id (dimension) to frequency position.
    
    Parameters:
    
    counts - Item counts
    
    positions - Position index (output)
    
    minsupp - Minimum support
    
    Returns:
    
    Forward index
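    
    A frequency index of this kind can be sketched as follows, based only on the parameter descriptions above. Which of the two arrays holds which direction of the mapping is an assumption here: the returned array lists the frequent item ids in order of decreasing support, and `positions` receives the rank of each item id, or -1 for infrequent items. `BuildIndexSketch` is a hypothetical class name, and the private ELKI method may differ in detail.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

/** Illustrative only: rank frequent items by decreasing support. */
final class BuildIndexSketch {
  /**
   * @param counts support of each item id
   * @param positions output: rank of each item id, -1 if infrequent
   * @param minsupp absolute minimum support
   * @return frequent item ids, most frequent first
   */
  static int[] buildIndex(int[] counts, int[] positions, int minsupp) {
    int[] idx = IntStream.range(0, counts.length)
        .filter(i -> counts[i] >= minsupp) // keep frequent items only
        .boxed()
        // Stable sort: items with equal support keep ascending id order.
        .sorted((x, y) -> Integer.compare(counts[y], counts[x]))
        .mapToInt(Integer::intValue)
        .toArray();
    Arrays.fill(positions, -1);
    for (int rank = 0; rank < idx.length; rank++) {
      positions[idx[rank]] = rank;
    }
    return idx;
  }
}
```

    For counts {2, 5, 1, 5} and a minimum support of 2, the returned array is {1, 3, 0} and `positions` becomes {2, 0, -1, 1}.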
  - getInputTypeRestriction
```
public TypeInformation[] getInputTypeRestriction()
```
    Description copied from class: AbstractAlgorithm
    
    Get the input type restriction used for negotiating the data query.
    
    Specified by:
    
    getInputTypeRestriction in interface Algorithm
    
    Specified by:
    
    getInputTypeRestriction in class AbstractAlgorithm<FrequentItemsetsResult>
    
    Returns:
    
    Type restriction
  - getLogger
```
protected Logging getLogger()
```
    Description copied from class: AbstractAlgorithm
    
    Get the (STATIC) logger for this class.
    
    Specified by:
    
    getLogger in class AbstractAlgorithm<FrequentItemsetsResult>
    
    Returns:
    
    the static logger
