@Reference(authors="M. J. Zaki, S. Parthasarathy, M. Ogihara, W. Li", title="New Algorithms for Fast Discovery of Association Rules", booktitle="Proc. 3rd ACM SIGKDD \'97 Int. Conf. on Knowledge Discovery and Data Mining", url="http://www.aaai.org/Library/KDD/1997/kdd97-060.php", bibkey="DBLP:conf/kdd/ZakiPOL97") public class Eclat extends AbstractFrequentItemsetAlgorithm
Eclat discovers frequent itemsets by first transforming the data into a (sparse) column-oriented form, then performing a depth-first traversal of the prefix lattice, stopping traversal when the minimum support is no longer satisfied.
This implementation is the basic algorithm only, and does not use diffsets. Columns are represented using a sparse representation, which theoretically is beneficial when the density is less than 1/31. This corresponds roughly to a minimum support of 3% for 1-itemsets. When searching for itemsets with a larger minimum support, it may be desirable to use a dense bitset representation instead and/or implement an automatic switching technique!
Performance of this implementation is probably surpassed with a low-level C implementation based on SIMD bitset operations as long as support of an itemset is high, which are not easily accessible in Java.
Reference:
New Algorithms for Fast Discovery of Association Rules
M. J. Zaki, S. Parthasarathy, M. Ogihara, W. Li
Proc. 3rd ACM SIGKDD '97 Int. Conf. on Knowledge Discovery and Data Mining
Modifier and Type | Class and Description |
---|---|
static class |
Eclat.Parameterizer
Parameterization class.
|
Modifier and Type | Field and Description |
---|---|
private static Logging |
LOG
Class logger.
|
private static java.lang.String |
STAT
Prefix for statistics.
|
maxlength, minlength
ALGORITHM_ID
Constructor and Description |
---|
Eclat(double minsupp,
int minlength,
int maxlength)
Constructor.
|
Modifier and Type | Method and Description |
---|---|
private DBIDs[] |
buildIndex(Relation<BitVector> relation,
int dim,
int minsupp) |
private void |
extractItemsets(DBIDs[] idx,
int start,
int minsupp,
java.util.List<Itemset> solution) |
private void |
extractItemsets(DBIDs iset,
DBIDs[] idx,
int[] buf,
int depth,
int start,
int minsupp,
java.util.List<Itemset> solution) |
TypeInformation[] |
getInputTypeRestriction()
Get the input type restriction used for negotiating the data query.
|
protected Logging |
getLogger()
Get the (STATIC) logger for this class.
|
private DBIDs |
mergeJoin(DBIDs first,
DBIDs second) |
FrequentItemsetsResult |
run(Database db,
Relation<BitVector> relation)
Run the Eclat algorithm
|
getMinimumSupport
run
private static final Logging LOG
private static final java.lang.String STAT
public Eclat(double minsupp, int minlength, int maxlength)
minsupp
- Minimum supportminlength
- Minimum lengthmaxlength
- Maximum lengthpublic FrequentItemsetsResult run(Database db, Relation<BitVector> relation)
db
- Database to processrelation
- Bit vector relationprivate void extractItemsets(DBIDs[] idx, int start, int minsupp, java.util.List<Itemset> solution)
private void extractItemsets(DBIDs iset, DBIDs[] idx, int[] buf, int depth, int start, int minsupp, java.util.List<Itemset> solution)
public TypeInformation[] getInputTypeRestriction()
AbstractAlgorithm
getInputTypeRestriction
in interface Algorithm
getInputTypeRestriction
in class AbstractAlgorithm<FrequentItemsetsResult>
protected Logging getLogger()
AbstractAlgorithm
getLogger
in class AbstractAlgorithm<FrequentItemsetsResult>
Copyright © 2019 ELKI Development Team. License information.