@Reference(authors="D. Guo", title="Coordinating computational and visual approaches for interactive feature selection and multivariate clustering", booktitle="Information Visualization, 2(4)", url="http://dx.doi.org/10.1057/palgrave.ivs.9500053")
public class MCEDimensionSimilarity extends Object implements DimensionSimilarity<NumberVector<?>>
Reference:
D. Guo
Coordinating computational and visual approaches for interactive feature selection and multivariate clustering
Information Visualization, 2(4), 2003.
Modifier and Type | Class and Description |
---|---|
static class | MCEDimensionSimilarity.Parameterizer: Parameterization class. |
Modifier and Type | Field and Description |
---|---|
static MCEDimensionSimilarity | STATIC: Static instance. |
static int | TARGET: Desired size: 35 observations. |
Modifier | Constructor and Description |
---|---|
protected | MCEDimensionSimilarity(): Constructor. |
Modifier and Type | Method and Description |
---|---|
private ArrayList<ArrayList<DBIDs>> | buildPartitions(Relation<? extends NumberVector<?>> relation, DBIDs ids, int depth, DimensionSimilarityMatrix matrix): Calculates "index structures" for every attribute, i.e. sorts a ModifiableArray of every DBID in the database for every dimension and stores them in a list. |
void | computeDimensionSimilarites(Database database, Relation<? extends NumberVector<?>> relation, DBIDs subset, DimensionSimilarityMatrix matrix): Compute the dimension similarity matrix |
private void | divide(DBIDArrayIter it, double[] data, ArrayList<DBIDs> idx, int start, int end, int depth, Mean mean): Recursive call to further subdivide the array. |
private double | getMCEntropy(int[][] mat, int[] psizesx, int[] psizesy, int size, int gridsize, double loggrid): Compute the MCE entropy value. |
private void | intersectionMatrix(int[][] res, ArrayList<? extends DBIDs> partsx, ArrayList<? extends DBIDs> partsy, int gridsize): Intersect the two 1d grid decompositions, to obtain a 2d matrix. |
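The intersectionMatrix and getMCEntropy entries describe a two-step computation: intersect two 1-D partitionings into a 2-D count matrix, then derive an entropy value from it. The following is a minimal sketch of that idea, not the class's actual code. It assumes that MCE stands for maximum conditional entropy (as in the cited paper), replaces DBIDs with plain int partition labels, and recomputes partition sizes from the matrix instead of taking psizesx and psizesy. The class name McEntropySketch and both method names are hypothetical.

```java
// Hypothetical sketch; not part of the documented class.
final class McEntropySketch {
  /**
   * Count how many points fall into each (x-partition, y-partition) cell.
   * partX[i] and partY[i] are the partition labels of point i (assumption:
   * labels are in the range 0 .. gridsize-1).
   */
  public static int[][] intersect(int[] partX, int[] partY, int gridsize) {
    int[][] res = new int[gridsize][gridsize];
    for (int i = 0; i < partX.length; i++) {
      res[partX[i]][partY[i]]++;
    }
    return res;
  }

  /**
   * Maximum of the two size-weighted conditional entropies H(Y|X) and H(X|Y),
   * normalized by size * log(gridsize) so that the result lies in [0, 1].
   */
  public static double maxConditionalEntropy(int[][] mat, int size, int gridsize) {
    double hYgivenX = 0., hXgivenY = 0.;
    for (int i = 0; i < gridsize; i++) {
      // Row sum = size of x-partition i; column sum = size of y-partition i.
      int rowSum = 0, colSum = 0;
      for (int j = 0; j < gridsize; j++) {
        rowSum += mat[i][j];
        colSum += mat[j][i];
      }
      for (int j = 0; j < gridsize; j++) {
        if (mat[i][j] > 0) {
          // Weighted term: rowSum * p * log(p) with p = mat[i][j] / rowSum.
          hYgivenX -= mat[i][j] * Math.log(mat[i][j] / (double) rowSum);
        }
        if (mat[j][i] > 0) {
          hXgivenY -= mat[j][i] * Math.log(mat[j][i] / (double) colSum);
        }
      }
    }
    return Math.max(hYgivenX, hXgivenY) / (size * Math.log(gridsize));
  }
}
```

In this sketch, two dimensions that partition the data identically yield 0, and two dimensions whose partitions are independent of each other yield a value close to 1. How the documented class maps its entropy value to a similarity is not stated on this page.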
public static final MCEDimensionSimilarity STATIC
Static instance.
public static final int TARGET
Desired size: 35 observations.
protected MCEDimensionSimilarity()
Constructor.
public void computeDimensionSimilarites(Database database, Relation<? extends NumberVector<?>> relation, DBIDs subset, DimensionSimilarityMatrix matrix)
Description copied from interface: DimensionSimilarity
Compute the dimension similarity matrix
Specified by: computeDimensionSimilarites in interface DimensionSimilarity<NumberVector<?>>
Parameters:
database - Database context
relation - Relation
subset - DBID subset (for sampling / selection)
matrix - Matrix to fill
private void intersectionMatrix(int[][] res, ArrayList<? extends DBIDs> partsx, ArrayList<? extends DBIDs> partsy, int gridsize)
Intersect the two 1d grid decompositions, to obtain a 2d matrix.
Parameters:
res - Output matrix to fill
partsx - Partitions in first component
partsy - Partitions in second component
gridsize - Size of partition decomposition
private double getMCEntropy(int[][] mat, int[] psizesx, int[] psizesy, int size, int gridsize, double loggrid)
Compute the MCE entropy value.
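Parameters:
mat - Partition size matrix
psizesx - Partition sizes on X
psizesy - Partition sizes on Y
size - Data set size
gridsize - Size of grids
loggrid - Logarithm of grid sizes, for normalization
private ArrayList<ArrayList<DBIDs>> buildPartitions(Relation<? extends NumberVector<?>> relation, DBIDs ids, int depth, DimensionSimilarityMatrix matrix)
Calculates "index structures" for every attribute, i.e. sorts a ModifiableArray of every DBID in the database for every dimension and stores them in a list.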
Parameters:
relation - Relation to index
ids - IDs to use
matrix - Matrix for dimension information
private void divide(DBIDArrayIter it, double[] data, ArrayList<DBIDs> idx, int start, int end, int depth, Mean mean)
Recursive call to further subdivide the array.
Parameters:
it - Iterator (will be reset!)
data - 1D data, sorted
idx - Output index
start - Interval start
end - Interval end
depth - Depth
mean - Mean working variable (will be reset!)
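
The recursive subdivision documented for divide can be illustrated with a small nested-means sketch. It is an assumption, based on the Mean working variable, that each interval is split at its mean. The sketch also simplifies the signature to index ranges over a sorted double[] and omits the DBIDArrayIter and Mean parameters. NestedMeansSketch and its output format (int[] {start, end} pairs with an exclusive end) are hypothetical.

```java
import java.util.List;

// Hypothetical sketch; not part of the documented class.
final class NestedMeansSketch {
  /**
   * Recursively split the sorted range data[start, end) at its mean.
   * depth is the number of split levels still to apply, so a call with
   * depth d appends 2^d intervals to out, some of which may be empty.
   */
  public static void divide(double[] data, List<int[]> out, int start, int end, int depth) {
    if (depth == 0) {
      // Leaf: record the interval as {start, end}, end exclusive.
      out.add(new int[] { start, end });
      return;
    }
    // Mean of the current interval (0 for an empty interval).
    double sum = 0.;
    for (int i = start; i < end; i++) {
      sum += data[i];
    }
    double mean = (end > start) ? sum / (end - start) : 0.;
    // First position whose value is strictly above the mean; values equal
    // to the mean stay in the left half.
    int pos = start;
    while (pos < end && data[pos] <= mean) {
      pos++;
    }
    divide(data, out, start, pos, depth - 1);
    divide(data, out, pos, end, depth - 1);
  }
}
```

For the sorted data {1, 2, 3, 10} and depth 2, this sketch produces the intervals [0, 2), [2, 3), [3, 4) and [4, 4). The last interval is empty, which shows that skewed or tied data can lead to empty partitions under this splitting rule.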