@Title(value="APRIORI: Algorithm for Mining Association Rules") @Description(value="Searches for frequent itemsets") @Reference(authors="R. Agrawal, R. Srikant", title="Fast Algorithms for Mining Association Rules", booktitle="Proc. 20th Int. Conf. on Very Large Data Bases (VLDB \'94)", url="http://www.vldb.org/conf/1994/P487.PDF", bibkey="DBLP:conf/vldb/AgrawalS94") @Alias(value="de.lmu.ifi.dbs.elki.algorithm.APRIORI") public class APRIORI extends AbstractFrequentItemsetAlgorithm
Reference:
R. Agrawal, R. Srikant
Fast Algorithms for Mining Association Rules
In Proc. 20th Int. Conf. on Very Large Data Bases (VLDB '94)
This implementation uses some simple optimizations for 1- and 2-itemsets, but does not implement the original hash tree (yet, please contribute).
Note: this algorithm scales well to a large number of transactions, but not so well to a large number of frequent itemsets (items). For best results, use domain-specific preprocessing to aggregate items into groups. Use statistics logging to keep track of candidate set sizes.
Modifier and Type | Class and Description |
---|---|
static class |
APRIORI.Parameterizer
Parameterization class.
|
Modifier and Type | Field and Description |
---|---|
private static Logging |
LOG
The logger for this class.
|
private java.lang.String |
STAT
Statistics logging prefix.
|
maxlength, minlength
ALGORITHM_ID
Constructor and Description |
---|
APRIORI(double minfreq)
Constructor with minimum frequency.
|
APRIORI(double minfreq,
int minlength,
int maxlength)
Constructor with minimum frequency.
|
Modifier and Type | Method and Description |
---|---|
protected java.util.List<Itemset> |
aprioriGenerate(java.util.List<? extends Itemset> supported,
int length,
int dim)
Prunes a given set of candidates to keep only those BitSets where all
subsets of bits flipping one bit are frequent already.
|
private int |
binarySearch(java.util.List<SparseItemset> candidates,
SparseItemset scratch,
int begin,
int end)
Binary-search for the next-larger element.
|
protected java.util.List<OneItemset> |
buildFrequentOneItemsets(Relation<? extends SparseFeatureVector<?>> relation,
int dim,
int needed)
Build the 1-itemsets.
|
protected java.util.List<SparseItemset> |
buildFrequentTwoItemsets(java.util.List<OneItemset> oneitems,
Relation<BitVector> relation,
int dim,
int needed,
DBIDs ids,
ArrayModifiableDBIDs survivors)
Build the 2-itemsets.
|
private java.lang.StringBuilder |
debugDumpCandidates(java.lang.StringBuilder msg,
java.util.List<? extends Itemset> candidates,
VectorFieldTypeInformation<BitVector> meta)
Debug method: output all itemsets.
|
protected java.util.List<? extends Itemset> |
frequentItemsets(java.util.List<? extends Itemset> candidates,
Relation<BitVector> relation,
int needed,
DBIDs ids,
ArrayModifiableDBIDs survivors,
int length)
Returns the frequent BitSets out of the given BitSets with respect to the
given database.
|
protected java.util.List<SparseItemset> |
frequentItemsetsSparse(java.util.List<SparseItemset> candidates,
Relation<BitVector> relation,
int needed,
DBIDs ids,
ArrayModifiableDBIDs survivors,
int length)
Returns the frequent BitSets out of the given BitSets with respect to the
given database.
|
TypeInformation[] |
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
|
protected Logging |
getLogger()
Get the (STATIC) logger for this class.
|
private boolean |
initializeSearchItemset(BitVector bv,
int[] scratchi,
int[] iters)
Initialize the scratch itemset.
|
private boolean |
nextSearchItemset(BitVector bv,
int[] scratchi,
int[] iters)
Advance scratch itemset to the next.
|
FrequentItemsetsResult |
run(Relation<BitVector> relation)
Performs the APRIORI algorithm on the given database.
|
getMinimumSupport
run
private static final Logging LOG
private final java.lang.String STAT
public APRIORI(double minfreq, int minlength, int maxlength)
minfreq
- Minimum frequencyminlength
- Minimum lengthmaxlength
- Maximum lengthpublic APRIORI(double minfreq)
minfreq
- Minimum frequencypublic FrequentItemsetsResult run(Relation<BitVector> relation)
relation
- the Relation to processprotected java.util.List<OneItemset> buildFrequentOneItemsets(Relation<? extends SparseFeatureVector<?>> relation, int dim, int needed)
relation
- Data relationdim
- Maximum dimensionalityneeded
- Minimum support neededprotected java.util.List<SparseItemset> buildFrequentTwoItemsets(java.util.List<OneItemset> oneitems, Relation<BitVector> relation, int dim, int needed, DBIDs ids, ArrayModifiableDBIDs survivors)
oneitems
- Frequent 1-itemsetsrelation
- Data relationdim
- Maximum dimensionalityneeded
- Minimum support neededids
- Objects to processsurvivors
- Output: objects that had at least two 1-frequent items.protected java.util.List<Itemset> aprioriGenerate(java.util.List<? extends Itemset> supported, int length, int dim)
supported
- Support maplength
- Itemset lengthdim
- Dimensionalityprotected java.util.List<? extends Itemset> frequentItemsets(java.util.List<? extends Itemset> candidates, Relation<BitVector> relation, int needed, DBIDs ids, ArrayModifiableDBIDs survivors, int length)
candidates
- the candidates to be evaluatedrelation
- the database to evaluate the candidates onneeded
- Minimum support neededids
- Objects to processsurvivors
- Output: objects that had at least two 1-frequent items.length
- Itemset lengthprotected java.util.List<SparseItemset> frequentItemsetsSparse(java.util.List<SparseItemset> candidates, Relation<BitVector> relation, int needed, DBIDs ids, ArrayModifiableDBIDs survivors, int length)
candidates
- the candidates to be evaluatedrelation
- the database to evaluate the candidates onneeded
- Minimum support neededids
- Objects to processsurvivors
- Output: objects that had at least two 1-frequent items.length
- Itemset lengthprivate boolean initializeSearchItemset(BitVector bv, int[] scratchi, int[] iters)
bv
- Bit vector data sourcescratchi
- Scratch itemsetiters
- Iterator arraytrue
if the itemset had minimum lengthprivate boolean nextSearchItemset(BitVector bv, int[] scratchi, int[] iters)
bv
- Bit vector data sourcescratchi
- Scratch itemsetiters
- Iterator arraytrue
if the itemset had minimum lengthprivate int binarySearch(java.util.List<SparseItemset> candidates, SparseItemset scratch, int begin, int end)
candidates
- Candidates to search forscratch
- Scratch spacebegin
- Search interval beginend
- Search interval endprivate java.lang.StringBuilder debugDumpCandidates(java.lang.StringBuilder msg, java.util.List<? extends Itemset> candidates, VectorFieldTypeInformation<BitVector> meta)
msg
- Output buffercandidates
- Itemsets to dumpmeta
- Metadata for item labelspublic TypeInformation[] getInputTypeRestriction()
AbstractAlgorithm
getInputTypeRestriction
in interface Algorithm
getInputTypeRestriction
in class AbstractAlgorithm<FrequentItemsetsResult>
protected Logging getLogger()
AbstractAlgorithm
getLogger
in class AbstractAlgorithm<FrequentItemsetsResult>
Copyright © 2019 ELKI Development Team. License information.