Interface | Description |
---|---|
InterestingnessMeasure | Interface for interestingness measures. |

Class | Description |
---|---|
AddedValue | Added value (AV) interestingness measure: \( \text{confidence}(X \rightarrow Y) - \text{support}(Y) = P(Y \mid X)-P(Y) \). |
CertaintyFactor | Certainty factor (CF; Loevinger) interestingness measure. \( \tfrac{\text{confidence}(X \rightarrow Y) - \text{support}(Y)}{\text{support}(\neg Y)} \). |
Confidence | Confidence interestingness measure, \( \tfrac{\text{support}(X \cup Y)}{\text{support}(X)} = \tfrac{P(X \cap Y)}{P(X)}=P(Y \mid X) \). |
Conviction | Conviction interestingness measure: \(\frac{P(X) P(\neg Y)}{P(X\cap\neg Y)}\). |
Cosine | Cosine interestingness measure, \(\tfrac{\text{support}(A\cup B)}{\sqrt{\text{support}(A)\text{support}(B)}} =\tfrac{P(A\cap B)}{\sqrt{P(A)P(B)}}\). |
GiniIndex | Gini-index based interestingness measure, using the weighted squared conditional probabilities compared to the non-conditional priors. |
Jaccard | Jaccard interestingness measure: \(\tfrac{\text{support}(A \cup B)}{\text{support}(A \cap B)} =\tfrac{P(A \cap B)}{P(A)+P(B)-P(A \cap B)} =\tfrac{P(A \cap B)}{P(A \cup B)}\) Reference: P. |
JMeasure | J-Measure interestingness measure. |
Klosgen | Klösgen interestingness measure. |
Leverage | Leverage interestingness measure. |
Lift | Lift interestingness measure. |
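
As a rough illustration of the formulas listed above, the following sketch evaluates those measures for which the table gives a closed form. The class `MeasureSketch` and its method names are hypothetical and are not part of the ELKI API. The sketch assumes that all inputs are relative supports, i.e. probabilities in [0, 1], with `pXY` standing for \(P(X \cap Y) = \text{support}(X \cup Y)\).

```java
/** Hypothetical helper, not part of ELKI: evaluates the formulas from the table. */
final class MeasureSketch {
  // Assumed inputs (relative supports, i.e. probabilities in [0, 1]):
  // pX = P(X), pY = P(Y), pXY = P(X and Y) = support of the itemset X union Y.

  /** Confidence: P(Y|X) = P(X and Y) / P(X). */
  static double confidence(double pX, double pXY) {
    return pXY / pX;
  }

  /** Added value: P(Y|X) - P(Y). */
  static double addedValue(double pX, double pY, double pXY) {
    return confidence(pX, pXY) - pY;
  }

  /** Certainty factor: (P(Y|X) - P(Y)) / P(not Y), with P(not Y) = 1 - P(Y). */
  static double certaintyFactor(double pX, double pY, double pXY) {
    return (confidence(pX, pXY) - pY) / (1. - pY);
  }

  /**
   * Conviction: P(X) P(not Y) / P(X and not Y), with P(X and not Y) = P(X) - P(X and Y).
   * Yields positive infinity when X never occurs without Y.
   */
  static double conviction(double pX, double pY, double pXY) {
    return pX * (1. - pY) / (pX - pXY);
  }

  /** Cosine: P(X and Y) / sqrt(P(X) P(Y)). */
  static double cosine(double pX, double pY, double pXY) {
    return pXY / Math.sqrt(pX * pY);
  }

  /** Jaccard: P(X and Y) / (P(X) + P(Y) - P(X and Y)). */
  static double jaccard(double pX, double pY, double pXY) {
    return pXY / (pX + pY - pXY);
  }

  public static void main(String[] args) {
    // Made-up example values, chosen only for illustration.
    double pX = 0.4, pY = 0.5, pXY = 0.3;
    System.out.println("confidence      = " + confidence(pX, pXY));
    System.out.println("added value     = " + addedValue(pX, pY, pXY));
    System.out.println("certainty factor= " + certaintyFactor(pX, pY, pXY));
    System.out.println("conviction      = " + conviction(pX, pY, pXY));
    System.out.println("cosine          = " + cosine(pX, pY, pXY));
    System.out.println("jaccard         = " + jaccard(pX, pY, pXY));
  }
}
```

With these example values, confidence is approximately 0.75, added value 0.25, certainty factor 0.5, conviction 2, and Jaccard 0.5, up to floating-point rounding.
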
Much of the confusion with these measures arises from the anti-monotonicity of itemsets, which are omnipresent in the literature.
In the itemset notation, the itemset \(X\) stands for the set of matching transactions \(\{T|X\subseteq T\}\), i.e. those transactions that contain every item of \(X\). If we enlarge the itemset to \(Z=X\cup Y\), the set of matching transactions shrinks: \(\{T|Z\subseteq T\}=\{T|X\subseteq T\}\cap\{T|Y\subseteq T\}\).
Consequently, \(\text{support}(X\cup Y) = P(X \cap Y)\) and \(\text{support}(X\cap Y) = P(X \cup Y)\). Formulas written with "support" and "confidence" commonly use these reversed semantics (a union of the constraints is an intersection of the matches, and conversely), whereas formulas written with probabilities commonly treat the itemsets as "events", as in frequentist inference.
To make things worse, "support" is sometimes given as an absolute (integer) count, and sometimes as a relative share.
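
The duality between itemsets and matching transactions can be checked on a toy example. The sketch below is only an illustration: the class `CoverSketch`, its methods, and the five-transaction database are made up for this purpose and do not come from ELKI. It verifies that the transactions matching \(X\cup Y\) are exactly the intersection of the transactions matching \(X\) and those matching \(Y\), and it reports the support both as an absolute count and as a relative share.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration, not part of ELKI. */
final class CoverSketch {
  /** Indexes of the transactions that contain every item of the itemset. */
  static Set<Integer> cover(List<Set<String>> transactions, Set<String> itemset) {
    Set<Integer> matches = new HashSet<>();
    for (int i = 0; i < transactions.size(); i++) {
      if (transactions.get(i).containsAll(itemset)) {
        matches.add(i);
      }
    }
    return matches;
  }

  /** Convenience constructor for a small itemset. */
  static Set<String> items(String... names) {
    return new HashSet<>(Arrays.asList(names));
  }

  public static void main(String[] args) {
    // Made-up toy database of five transactions.
    List<Set<String>> db = Arrays.asList(
        items("a", "b", "c"), items("a", "b"), items("a", "c"), items("b"), items("c"));
    Set<String> x = items("a"), y = items("b");

    // Enlarged itemset Z = X union Y.
    Set<String> z = new HashSet<>(x);
    z.addAll(y);

    // Intersection of the matching transactions of X and of Y.
    Set<Integer> both = new HashSet<>(cover(db, x));
    both.retainAll(cover(db, y));

    Set<Integer> coverZ = cover(db, z);
    System.out.println(coverZ.equals(both)); // true
    System.out.println("absolute support: " + coverZ.size()); // 2
    System.out.println("relative support: " + coverZ.size() / (double) db.size()); // 0.4
  }
}
```
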
Copyright © 2019 ELKI Development Team. License information.