Benchmarking with ELKI
Setup
For benchmarking, do NOT set -verbose as the verbose output will often come at cost in computation time, and verbosity varies a lot from algorithm to algorithm.
Versions up to 0.3: set the -time parameter to obtain timing information.
Version 0.4: to get algorithm timing information, you can use:
-enableDebug de.lmu.ifi.dbs.elki.workflow.AlgorithmStep=INFO
This enables verbose logging only for the wrapper class that invokes the main algorithms.
Fair benchmarking
ELKI much like any other system (in particular Java based systems) comes with a certain performance cost compared to highly optimized code. An optimized C implementation will beat any ELKI algorithm by an order of magnitude. Results based on entirely independent implementations are therefore usually misleading.
When comparing different algorithms you must therefore
- ensure they are implemented in the same language and level of optimization
- they use the same backing features (database layer etc.)
- use the same level of abstraction/modularization (e.g. modular distance functions as opposed to hard-coded euclidean distance)
- are properly setup within the framework (e.g. try different index structures, use the best each)
Many "reference" algorithms such as DBSCAN benefit immensely of index structures in low dimensional spaces and large data sets!
Therefore, we strongly encourage you to implement all comparison methods in ELKI, too. This is essentially the only way to ensure fair comparisons. If you have found a bottleneck, send us a Patch to improve it for everyone!
Of course we try to make ELKI as fast as possible (but we will not sacrifice extendability or code readability for this).
We also strongly encourage reviewers to pay attention to fair benchmarks, and outright reject papers when these are not done in a sound way. Unfortunately, many articles do lack when it comes to fair experiments.
Performance of ELKI versions
If you are not convinced, here is a proof that implementation differences can have a big impact. We will compare ELKI with ELKI, just using different versions.
For this benchmark we use the ALOI image data set, which has 110.250 objects, with just 8 dimensions (so the algorithms, distance functions and index structures are supposed okay). We tested four workloads, each with and without an R*-Tree index. The first includes only the startup and parsing time, the second invokes the dummy algorithm, which performs a 10NN-query for each object. Third, we run OPTICS as and advanced clustering algorithm and fourth we run LOF, a state of the art outlier detection algorithm.
| Task | ELKI 0.2 | ELKI 0.3 | ELKI 0.4 | ELKI 0.5pre |
| Load only | 15.1 | 19.3 | 3.9 | 4.0 |
| Load into R*-Tree | 26.6 | 26.8 | 12.1 | 9.9 |
| 10NN queries each | 3033.5 | 2854.7 | 1527.4 | 1511.4 |
| 10NN with R*-Tree | 448.3 | 365.5 | 275.4 | 165.6 |
| OPTICS clustering | 3086.4 | 2899.3 | 1506.1 | 1503.4 |
| OPTICS with R*-Tree | 776.7 | 650.1 | 389.0 | 326.0 |
| DBSCAN clustering | 3076.2 | 3090.8 | 1442.6 | 1446.5 |
| DBSCAN with R*-Tree | 476.6 | 374.2 | 234.6 | 219.7 |
| Local Outlier Factor | 2747.4 | 2627.6 | 1536.1 | 1475.9 |
| LOF with R*-Tree | 563.0 | 481.0 | 365.4 | 208.1 |
| k-means lloyd k=100 | 138.3 | 115.5 | 96.8 | 60.0 |
As you can see, ELKI 0.3 had a slightly more expensive parser, but was already slightly faster than ELKI 0.2. In ELKI 0.4 we see major performance gains, which we attribute to removing Java-boxing and unboxing in various places.
However, we cannot remove all boxing/unboxing in Java without losing much of the generality of ELKI. We expect significant performance gains to be possible with a low-level C implementation. (Which are, however, not of scientific interest to us.) Therefore, benchmarking ELKI against implementations in C (or Matlab, or any other language that has high-performant mathematical libraries embedded) is not sound!