Parameterization in ELKI
ELKI includes a generic parameterization API that is used for the command line and graphical user interfaces.
Key features of this approach:
- Generic, re-usable parsers for parameters
- Generic constraint checking (valid value ranges etc.)
- Documentation of available options and valid values
- Uniform error reporting for parameter errors
- Input assistance for the user (constraints, required values, ...)
- Automatic UI generation (dropdown menus, file selectors, ...)
- Automatic experiments
The API for this has gone through several iterations already, and it keeps on getting easier to use for the developers as well as more powerful for the user interface.
FAQs
- Why don't you just throw an exception on errors?
- If we did, the first error would abort parameterization, and we could not produce a complete list of options. A user might then need many iterations to get all parameters right. Reporting multiple errors in one run is much more useful.
- Why don't you use introspection and inspect the class constructor?
- Introspection is significantly slower, and it still would not give us the parameter descriptions, constraints, etc. that we need for a good user interface. We are considering it as a fallback option, though.
- Why don't you just use String[] everywhere?
- With String[], we could not pass existing instances as parameters. Also, when configuring internally, we avoid many string operations by not using strings.
- But I just want to call it from Java!
- Actually, you become a bit less dependent on the other class when you use the parameterization API, since the addition of a new option with a default value will not affect you. ListParameterization is an easy-to-use API for this. However, the fourth generation API will allow you to use regular Java constructors. For examples of usage within Java, check the unit tests (e.g. /browser/trunk/test/de/lmu/ifi/dbs/elki/algorithm). An example containing many use cases is /browser/trunk/test/de/lmu/ifi/dbs/elki/algorithm/AbstractSimpleAlgorithmTest.java.
- Why does this change every version?
- Because we're still discovering ways to make it easier to use. Our UIs are getting more powerful, and new applications require new functionality. This is why we are using 0.x version numbers! But by version 1.0 we're expecting to have something stable for you.
- What is this if (config.grab(param)) ... I see all the time?
- This serves two purposes. First, it adds the parameter to the configuration for the UIs. Second, when the method returns true, the parameter is guaranteed to have a valid value. It is shorthand for config.grab(param); if (param.isDefined()) ...
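The two FAQ answers about error reporting and config.grab(param) can be illustrated together. The following is a minimal sketch, not ELKI code: SketchIntParameter, SketchConfig and GrabSketch are hypothetical stand-ins invented for this example, and the "greater than zero" check is an assumed constraint. It shows a grab method that registers the parameter, records an error instead of throwing, and returns true only when a valid value is available.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the ELKI classes.
class SketchIntParameter {
  final String name;
  private Integer value; // null until a valid value has been set

  SketchIntParameter(String name) {
    this.name = name;
  }

  boolean isDefined() {
    return value != null;
  }

  int getValue() {
    return value;
  }

  /** Parses and validates the raw input; returns an error message, or null on success. */
  String trySet(String raw) {
    if (raw == null) {
      return "Missing required parameter: " + name;
    }
    try {
      int parsed = Integer.parseInt(raw.trim());
      if (parsed <= 0) {
        return "Parameter " + name + " must be > 0, got " + parsed;
      }
      value = parsed;
      return null;
    } catch (NumberFormatException e) {
      return "Parameter " + name + " is not an integer: " + raw;
    }
  }
}

class SketchConfig {
  private final Map<String, String> input;
  final List<String> seen = new ArrayList<>();   // every requested parameter, for UIs
  final List<String> errors = new ArrayList<>(); // collected instead of thrown

  SketchConfig(Map<String, String> input) {
    this.input = input;
  }

  /** Registers the parameter, tries to set it, and reports whether it has a valid value. */
  boolean grab(SketchIntParameter param) {
    seen.add(param.name);
    String error = param.trySet(input.get(param.name));
    if (error != null) {
      errors.add(error);
    }
    return param.isDefined();
  }
}

class GrabSketch {
  /** Requests two parameters and returns every error found in a single pass. */
  static List<String> configure(Map<String, String> input) {
    SketchConfig config = new SketchConfig(input);
    SketchIntParameter k = new SketchIntParameter("k");
    SketchIntParameter minpts = new SketchIntParameter("minpts");
    if (config.grab(k)) {
      System.out.println("k = " + k.getValue());
    }
    if (config.grab(minpts)) {
      System.out.println("minpts = " + minpts.getValue());
    }
    return config.errors;
  }

  public static void main(String[] args) {
    Map<String, String> input = new HashMap<>();
    input.put("k", "-3"); // violates the constraint; "minpts" is missing entirely
    for (String error : configure(input)) {
      System.out.println(error);
    }
  }
}
```

With this input, both problems (the invalid k and the missing minpts) end up in the error list after one run, which is the behavior the first FAQ answer argues for.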
Fourth generation parameterization
(NEW: This will be used by the 0.4 release - not available in ELKI 0.3)
Classes are now encouraged to offer a classic Java constructor. Parameterization is delegated to an inner class derived from AbstractParameterizer (or Super.Parameterizer) that reads the parameters and then produces an instance.
Example
A benefit is that parameterization can now also return an instance of a subclass (this was already possible with static methods in generation 3.5), as can be seen in this example from LPNormDistanceFunction:
// NOTE: this is a STATIC INNER class of LPNormDistanceFunction!
/**
 * Parameterization class.
 *
 * @author Erich Schubert
 */
public static class Parameterizer extends AbstractParameterizer {
  /**
   * The value of p.
   */
  protected double p = 0.0;

  @Override
  protected void makeOptions(Parameterization config) {
    // While our superclass doesn't add any options,
    // we formally should always call the super class!
    super.makeOptions(config);
    // Add a parameter for the value of "p"
    DoubleParameter paramP = new DoubleParameter(P_ID);
    paramP.addConstraint(new GreaterConstraint(0));
    if(config.grab(paramP)) {
      // The value of p is now available, otherwise, an error is reported
      p = paramP.getValue();
    }
  }

  // This method will only be called when parameterization succeeded,
  // and thus "p" is defined. This is checked by the abstract class.
  @Override
  protected LPNormDistanceFunction makeInstance() {
    // Return optimized implementations for common values.
    if(p == 1.0) {
      return ManhattanDistanceFunction.STATIC;
    }
    if(p == 2.0) {
      return EuclideanDistanceFunction.STATIC;
    }
    if(p == Double.POSITIVE_INFINITY) {
      return MaximumDistanceFunction.STATIC;
    }
    return new LPNormDistanceFunction(p);
  }
}
Note that when using the regular Java API, the optimization for Euclidean and Manhattan distances will not be used, since the constructor of LPNormDistanceFunction cannot just return the static instance of EuclideanDistanceFunction.
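The difference between the constructor and the factory path can be shown with a small self-contained sketch. SketchLPNorm, SketchEuclidean and the static make method are hypothetical, simplified classes written for this illustration; make only plays the role that makeInstance has in the Parameterizer and is not part of ELKI.

```java
// Hypothetical, simplified classes for illustration; not the ELKI implementations.
class SketchLPNorm {
  final double p;

  SketchLPNorm(double p) {
    this.p = p;
  }

  /** Generic L_p distance: (sum |a_i - b_i|^p)^(1/p). */
  double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      sum += Math.pow(Math.abs(a[i] - b[i]), p);
    }
    return Math.pow(sum, 1.0 / p);
  }

  /** Factory in the role of makeInstance(): it may return a subclass instance. */
  static SketchLPNorm make(double p) {
    if (p == 2.0) {
      return SketchEuclidean.STATIC;
    }
    return new SketchLPNorm(p);
  }
}

class SketchEuclidean extends SketchLPNorm {
  /** Shared instance, analogous to a STATIC field. */
  static final SketchEuclidean STATIC = new SketchEuclidean();

  private SketchEuclidean() {
    super(2.0);
  }

  /** Specialized version that avoids the generic Math.pow calls. */
  @Override
  double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }
}

class LPNormSketchDemo {
  public static void main(String[] args) {
    SketchLPNorm viaFactory = SketchLPNorm.make(2.0);
    SketchLPNorm viaConstructor = new SketchLPNorm(2.0);
    // The factory hands out the specialized subclass; the constructor cannot.
    System.out.println(viaFactory.getClass().getSimpleName());
    System.out.println(viaConstructor.getClass().getSimpleName());
  }
}
```

Both objects compute the same distance; only the factory path picks the specialized implementation, because a constructor always yields an instance of exactly its own class.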
Third generation parameterization
(This is being used by the 0.3 release)
"Parameterizable" classes must have a constructor with the exact signature public ClassName(Parameterization config). Classes without parameters may omit the config parameter; the constructor must still be public.
As of a later revision, a static method parameterize(Parameterization config) was allowed as an alternative, serving as a factory method to produce instances.
Example
A 0.3 constructor would look like this:
// Note, this constructor MUST be public and have this EXACT signature!
/**
 * Constructor, adhering to
 * {@link de.lmu.ifi.dbs.elki.utilities.optionhandling.Parameterizable}
 *
 * @param config Parameterization
 */
public RGBHistogramQuadraticDistanceFunction(Parameterization config) {
  super(null);
  // This is needed to tell the user which parameters belong to which class
  config = config.descend(this);
  // Define a parameter - often done as class field.
  IntParameter BPP_PARAM = new IntParameter(BPP_ID, new GreaterConstraint(0));
  // Configure the parameter
  if(config.grab(BPP_PARAM)) {
    int bpp = BPP_PARAM.getValue();
    [... process parameter ...]
  }
  // If parameterization failed, the class would remain in an undefined state.
  // It's the caller's responsibility to check for parameter errors in "config".
}
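The caller-side check mentioned in the closing comments of the constructor can be sketched as follows. This is an illustration under assumptions, not the ELKI API: MiniParameterization, MiniHistogramDistance and ThirdGenerationSketch are hypothetical names, and grabPositiveInt merges parameter definition and grabbing into one call for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Parameterization; not the ELKI interface.
class MiniParameterization {
  private final Map<String, String> values;
  private final List<String> errors = new ArrayList<>();

  MiniParameterization(Map<String, String> values) {
    this.values = values;
  }

  /** Returns the parsed positive int, or null after recording an error. */
  Integer grabPositiveInt(String name) {
    String raw = values.get(name);
    if (raw == null) {
      errors.add("Missing parameter: " + name);
      return null;
    }
    try {
      int parsed = Integer.parseInt(raw);
      if (parsed > 0) {
        return parsed;
      }
      errors.add("Parameter " + name + " must be > 0, got " + parsed);
    } catch (NumberFormatException e) {
      errors.add("Parameter " + name + " is not an integer: " + raw);
    }
    return null;
  }

  List<String> getErrors() {
    return errors;
  }
}

// Hypothetical class with a third-generation style constructor.
class MiniHistogramDistance {
  int bpp; // stays 0, i.e. undefined, if parameterization failed

  MiniHistogramDistance(MiniParameterization config) {
    Integer value = config.grabPositiveInt("bpp");
    if (value != null) {
      bpp = value;
    }
  }
}

class ThirdGenerationSketch {
  /** Builds an instance; returns null and fills errorsOut if the configuration was invalid. */
  static MiniHistogramDistance tryCreate(Map<String, String> values, List<String> errorsOut) {
    MiniParameterization config = new MiniParameterization(values);
    MiniHistogramDistance instance = new MiniHistogramDistance(config);
    // The constructor never throws, so the caller must inspect the errors.
    if (!config.getErrors().isEmpty()) {
      errorsOut.addAll(config.getErrors());
      return null;
    }
    return instance;
  }

  public static void main(String[] args) {
    Map<String, String> values = new HashMap<>();
    values.put("bpp", "0"); // invalid on purpose
    List<String> errors = new ArrayList<>();
    MiniHistogramDistance instance = tryCreate(values, errors);
    System.out.println(instance == null ? "rejected: " + errors : "bpp = " + instance.bpp);
  }
}
```

The point of the design is visible in tryCreate: the constructor always completes, so an instance in an undefined state exists briefly, and only the check on the configuration keeps it from being used.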
Second generation parameterization
An interim version between the first and third generations that still relied heavily on strings but already used the standardized constructors.
First generation parameterization
(This is being used by the 0.1 release)
The methods setParameters(String []) and getParameters(String []) were called to configure a class and would throw an exception on error. Not calling setParameters resulted in undefined behavior. This is very similar to the way WEKA still does it.
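For contrast with the later generations, here is a minimal sketch of what a string-based, exception-throwing setter looks like. FirstGenerationSketch, its option names -k and -epsilon, and the use of IllegalArgumentException are all assumptions made for this illustration; the actual ELKI 0.1 signatures and exception types may differ.

```java
// Hypothetical sketch of a first-generation style class; not taken from ELKI 0.1.
class FirstGenerationSketch {
  private int k = -1;                  // undefined until setParameters succeeds
  private double epsilon = Double.NaN; // undefined until setParameters succeeds

  /** Reads "-k <int> -epsilon <double>" and throws at the FIRST problem found. */
  void setParameters(String[] args) {
    for (int i = 0; i < args.length; i += 2) {
      if (i + 1 >= args.length) {
        throw new IllegalArgumentException("Missing value for " + args[i]);
      }
      String name = args[i];
      String raw = args[i + 1];
      if (name.equals("-k")) {
        int parsed = Integer.parseInt(raw);
        if (parsed <= 0) {
          throw new IllegalArgumentException("-k must be > 0, got " + raw);
        }
        k = parsed;
      } else if (name.equals("-epsilon")) {
        double parsed = Double.parseDouble(raw);
        if (!(parsed > 0.0)) {
          throw new IllegalArgumentException("-epsilon must be > 0, got " + raw);
        }
        epsilon = parsed;
      } else {
        throw new IllegalArgumentException("Unknown option: " + name);
      }
    }
  }

  /** Returns the message of the first error, or null if the arguments were accepted. */
  static String firstError(String[] args) {
    try {
      new FirstGenerationSketch().setParameters(args);
      return null;
    } catch (IllegalArgumentException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    // Both values are invalid, but only the first problem is ever reported.
    System.out.println(firstError(new String[] { "-k", "0", "-epsilon", "-1" }));
  }
}
```

Because the exception ends processing at the first invalid option, the user only learns about the second problem after fixing the first and running again.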