Package de.lmu.ifi.dbs.elki.utilities.optionhandling

Parameter handling and option descriptions.


  1. Option ID: Any parameter must have an OptionID.
    These are Singleton objects to uniquely identify the option. They should be "public static".
    The OptionID specifies the parameter name and a generic description.

    Example code:

    // Defining option IDs
     public static final OptionID DISTANCE_FUNCTION_ID =
       OptionID.getOrCreateOptionID(
         "algorithm.distancefunction",
         "Distance function to determine the distance between database objects."
       ); 
     
    (This example is from AbstractDistanceBasedAlgorithm.)
  2. Parameter Object: To obtain a value, you must use a Parameter object.
    Parameter objects handle parsing of values into the desired type, and various subclasses for common types are provided. It is not desirable to subclass these types too much, since a UI should be able to offer content assistance for input.

    Parameters often have types and constraints attached to them, and may be flagged optional or have a default value. Note that a parameter with a default value is by definition optional, so there is no constructor with both a default value and the optional flag.

    Due to restrictions imposed by Java Generics, ListParameter based types do not have the full set of constructors, since a List of Constraints and a List of Default values produce the same signature. In such a signature conflict situation, you can use a full constructor either by giving null as constraints (and a list of default values) or by giving constraints and setting optional to false.
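
The signature conflict is a consequence of type erasure: after compilation, a list of constraints and a list of default values are both plain List, so two constructors that differ only in that argument cannot be declared side by side. The following is a minimal, self-contained sketch of the situation and of the workaround described above. SketchListParameter, its fields and its constructors are hypothetical stand-ins for illustration, not the ELKI ListParameter API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a list-valued parameter; NOT the ELKI ListParameter API.
class SketchListParameter<T> {
  final String name;
  final List<Predicate<T>> constraints; // null means: no constraints
  final List<T> defaultValue;           // null means: no default value
  final boolean optional;

  // Full constructor: constraints plus default values.
  // A parameter with a default value is optional by definition.
  SketchListParameter(String name, List<Predicate<T>> constraints, List<T> defaultValue) {
    this.name = name;
    this.constraints = constraints;
    this.defaultValue = defaultValue;
    this.optional = true;
  }

  // Full constructor: constraints plus optional flag, no default value.
  SketchListParameter(String name, List<Predicate<T>> constraints, boolean optional) {
    this.name = name;
    this.constraints = constraints;
    this.defaultValue = null;
    this.optional = optional;
  }

  // These two short forms could NOT both be declared,
  // because both erase to (String, List):
  //   SketchListParameter(String name, List<Predicate<T>> constraints)
  //   SketchListParameter(String name, List<T> defaultValue)
}

class ListParameterSketchDemo {
  // Default values only: pass null for the constraints.
  static SketchListParameter<Integer> withDefaults() {
    return new SketchListParameter<Integer>("dims", null, Arrays.asList(1, 2, 3));
  }

  // Constraints only: pass the optional flag as false.
  static SketchListParameter<Integer> withConstraints() {
    Predicate<Integer> isPositive = v -> v > 0;
    List<Predicate<Integer>> constraints = Collections.singletonList(isPositive);
    return new SketchListParameter<Integer>("dims", constraints, false);
  }
}
```

Both factory methods pick one of the two full constructors unambiguously, because the third argument (a list or a boolean) survives erasure as a distinct type.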

    Notice the difference between an ObjectParameter and a ClassParameter. The first is meant to carry a single object (which, as a fallback, can be initialized with a class, resulting in a new object of that class); the second is meant for use as an "object factory".

    Example code:

    // Defining Parameters
     final ObjectParameter<DistanceFunction<O, D>> DISTANCE_FUNCTION_PARAM =
       new ObjectParameter<DistanceFunction<O, D>>(
         DISTANCE_FUNCTION_ID,
         DistanceFunction.class,
         EuclideanDistanceFunction.class
       ); 
     
    (This example is from DistanceBasedAlgorithm.)
  3. Initialization: Initialization happens either in the constructor, which must have the signature Class(Parameterization config), or in a static method parameterize(Parameterization config).
    The config object manages configuration data, whichever source it is coming from (e.g. command line, XML, lists, ...).

    The Parameterization class offers the method grab, which returns true when the parameter value is defined and satisfies the given constraints.

    Initialization should happen in a delayed-fail way. Failure is managed by the Parameterization object, and not failing immediately allows for reporting all configuration errors (and all options) in a single run. As such, when reporting a configuration error, you should not throw the error, but instead call reportError and leave error handling to the Parameterization class. Note that reportError returns normally instead of throwing, so execution continues after the call, and you might need to use try-catch-report blocks.

    The static parameterize(Parameterization config) factory method may return null when parameterization failed. Otherwise, it must return an instance of the given class or a subclass. Example: LPNormDistanceFunction returns an instance of EuclideanDistanceFunction for p=2.
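
The factory contract can be sketched without any ELKI classes. In the following illustration, SketchLpNorm and SketchEuclidean are hypothetical names, the plain list of error messages stands in for the error reporting of a Parameterization, and the requirement p >= 1 is an assumption made only for this example.

```java
import java.util.List;

// Hypothetical Lp-norm distance; the names are illustrative, NOT the ELKI classes.
class SketchLpNorm {
  final double p;

  protected SketchLpNorm(double p) {
    this.p = p;
  }

  double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      sum += Math.pow(Math.abs(a[i] - b[i]), p);
    }
    return Math.pow(sum, 1.0 / p);
  }

  // Factory: reports the problem instead of throwing, and returns null on failure.
  // The 'errors' list stands in for the error reporting of a Parameterization.
  static SketchLpNorm parameterize(double p, List<String> errors) {
    if (!(p >= 1.0)) { // also rejects NaN; the bound itself is an assumption
      errors.add("p must be at least 1, got " + p);
      return null;
    }
    if (p == 2.0) {
      return new SketchEuclidean(); // specialized subclass for the common case
    }
    return new SketchLpNorm(p);
  }
}

// Specialized subclass that avoids the generic Math.pow calls.
class SketchEuclidean extends SketchLpNorm {
  SketchEuclidean() {
    super(2.0);
  }

  @Override
  double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }
}
```

Callers only see the declared return type, so the factory is free to substitute the faster subclass, and a null result signals that the problem has already been reported.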

    When writing constructors, try to make error handling as local as possible, to report as many errors at once as possible.

    Example code:

    // Getting parameters
     protected DistanceBasedAlgorithm(Parameterization config) {
       super(config);
       if (config.grab(DISTANCE_FUNCTION_PARAM)) {
         distanceFunction = DISTANCE_FUNCTION_PARAM.instantiateClass(config);
       }
     }
     
    (This example is from DistanceBasedAlgorithm.)

    // Using flags
     protected AbstractApplication(Parameterization config) {
       super(config);
       if(config.grab(VERBOSE_FLAG)) {
         verbose = VERBOSE_FLAG.getValue();
       }
     }
     
    (This example is from AbstractApplication.)

    The if (config.grab(...)) check ensures that the parameter was set before its value is used. Note that the configuration manager is passed on to the child instance.
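
The delayed-fail idea can be illustrated independently of ELKI. In the sketch below, SketchConfig, grabInt and SketchAlgorithm are hypothetical names invented for this example; only the pattern (record the error, return normally, keep processing the remaining parameters) is taken from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, minimal configuration source; NOT the ELKI Parameterization API.
class SketchConfig {
  private final Map<String, String> values;
  private final List<String> errors = new ArrayList<String>();

  SketchConfig(Map<String, String> values) {
    this.values = values;
  }

  // Returns the parsed value, or null after recording an error.
  Integer grabInt(String name) {
    String raw = values.get(name);
    if (raw == null) {
      reportError("Missing parameter: " + name);
      return null;
    }
    try {
      return Integer.valueOf(raw.trim());
    } catch (NumberFormatException e) {
      reportError("Not an integer for " + name + ": " + raw);
      return null;
    }
  }

  // Records the error and returns normally, so the caller keeps going.
  void reportError(String message) {
    errors.add(message);
  }

  List<String> getErrors() {
    return errors;
  }
}

class SketchAlgorithm {
  int k = -1;
  int minpts = -1;

  SketchAlgorithm(SketchConfig config) {
    // Each parameter is handled on its own, so one failure does not hide the next.
    Integer kValue = config.grabInt("k");
    if (kValue != null) {
      k = kValue;
    }
    Integer minptsValue = config.grabInt("minpts");
    if (minptsValue != null) {
      minpts = minptsValue;
    }
  }
}

class DelayedFailDemo {
  public static void main(String[] args) {
    Map<String, String> input = new HashMap<String, String>();
    input.put("k", "abc"); // not an integer; "minpts" is missing entirely
    SketchConfig config = new SketchConfig(input);
    new SketchAlgorithm(config);
    // Both problems are listed, not just the first one.
    for (String error : config.getErrors()) {
      System.out.println(error);
    }
  }
}
```

Because the constructor never throws, the caller decides when to stop: it can inspect the collected errors once, after every component has been configured.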

  4. Compound conditionals: Sometimes, more than one parameter value is required. However, config.grab(...) && config.grab(...) does not work as intended, since a negative result in the first config.grab call prevents the evaluation of the second. Instead, the following code should be used:
    // Compound parameter dependency
     config.grab(FIRST_OPTION);
     config.grab(SECOND_OPTION);
     if (FIRST_OPTION.isDefined() && SECOND_OPTION.isDefined()) {
       // Now we have validated values for both available.
     }
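
The underlying cause is the short-circuit evaluation of the && operator in Java. The following self-contained sketch makes the effect visible by counting reported errors; SketchGrabber is a hypothetical stand-in that only records missing options and is not the ELKI Parameterization API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in that records which requested options were missing;
// NOT the ELKI Parameterization API.
class SketchGrabber {
  private final Set<String> defined;
  final List<String> errors = new ArrayList<String>();

  SketchGrabber(Set<String> defined) {
    this.defined = defined;
  }

  boolean grab(String option) {
    if (defined.contains(option)) {
      return true;
    }
    errors.add("Missing: " + option);
    return false;
  }
}

class CompoundGrabDemo {
  // Short-circuit form: the second grab is skipped when the first fails.
  static int errorsWithShortCircuit() {
    SketchGrabber config = new SketchGrabber(new HashSet<String>());
    if (config.grab("first") && config.grab("second")) {
      // both values available
    }
    return config.errors.size();
  }

  // Separate calls: both options are always evaluated.
  static int errorsWithSeparateGrabs() {
    SketchGrabber config = new SketchGrabber(new HashSet<String>());
    boolean first = config.grab("first");
    boolean second = config.grab("second");
    if (first && second) {
      // both values available
    }
    return config.errors.size();
  }
}
```

With no options defined, the short-circuit form records one missing option and the separate calls record two, which is what the delayed-fail approach needs.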
     
  5. Global Constraints: Additional global constraints can be added using checkConstraint.

    Example code:

    // Global constraints
     config.grab(NORMALIZATION_PARAM);
     config.grab(NORMALIZATION_UNDO_FLAG);
     GlobalParameterConstraint gpc =
       new ParameterFlagGlobalConstraint<Class<?>, Class<? extends Normalization<O>>>(
         NORMALIZATION_PARAM, null,
         NORMALIZATION_UNDO_FLAG, true);
     if (config.checkConstraint(gpc)) {
       // Code that depends on the constraints being satisfied.
     }
     
    (This example is from KDDTask.)

    TODO: Much of the constraint functionality can be implemented more easily with direct Java code and reportError. Unless the constraints can be used by a GUI for input assistance, we should consider replacing them with direct code.

  6. Error reporting:
    // Proper dealing with errors
     try {
       // code that might fail with an IO exception
     } catch(IOException e) {
       config.reportError(new WrongParameterValueException(...));
     }
     // process remaining parameters, to report additional errors. 
     
  7. Command line parameterization: Command line parameters are handled by the class SerializedParameterization, which provides a convenient constructor from String arrays:
    // Use command line parameters
     SerializedParameterization params = new SerializedParameterization(args);
     
    (This example is from AbstractApplication.)
  8. Internal Parameterization: Often one algorithm will need to call another algorithm, with specific parameters. ListParameterization offers convenience functions for this that do not require String serialization.
    // Internal parameterization
     ListParameterization parameters = new ListParameterization();
    
     parameters.addParameter(PCAFilteredRunner.PCA_EIGENPAIR_FILTER, FirstNEigenPairFilter.class);
     parameters.addParameter(FirstNEigenPairFilter.EIGENPAIR_FILTER_N, correlationDimension);
     
    (This example is from ERiC.)
  9. Combined parameterization: Sometimes, an algorithm will pre-define some parameters, while additional parameters can be supplied by the user. This can be done using a chained parameterization as provided by ChainedParameterization.
    // predefine some parameters
     ListParameterization opticsParameters = new ListParameterization();
     opticsParameters.addParameter(OPTICS.DISTANCE_FUNCTION_ID, DiSHDistanceFunction.class);
     // ... more parameters ...
     ChainedParameterization chain = new ChainedParameterization(opticsParameters, config);
     chain.errorsTo(opticsParameters);
     optics = new OPTICS<V, PreferenceVectorBasedCorrelationDistance>(chain);
     opticsParameters.failOnErrors();
     
    (This example code is from DiSH.)

    Note how error handling is performed by explicit specification of an error target and by calling failOnErrors() at the end of parameterization.

    (Note: the current implementation of this approach may be inadequate for XML or Tree based parameterization, due to tree inconsistencies. This is an open TODO issue.)
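
The lookup order of a chain can be sketched with two plain maps. This is an illustration only: SketchChainedLookup is a hypothetical class, the option names are made up, and the sketch assumes that earlier sources in the chain take precedence over later ones, which matches the order of the constructor arguments in the example above but is not confirmed here for every ELKI version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical two-element chain of configuration sources;
// NOT the ELKI ChainedParameterization API.
class SketchChainedLookup {
  private final Map<String, String> predefined;
  private final Map<String, String> userSupplied;

  SketchChainedLookup(Map<String, String> predefined, Map<String, String> userSupplied) {
    this.predefined = predefined;
    this.userSupplied = userSupplied;
  }

  // Assumption: earlier sources take precedence over later ones.
  String get(String option) {
    String value = predefined.get(option);
    return (value != null) ? value : userSupplied.get(option);
  }
}

class ChainedLookupDemo {
  static SketchChainedLookup example() {
    Map<String, String> predefined = new HashMap<String, String>();
    predefined.put("distancefunction", "DiSHDistanceFunction");

    Map<String, String> user = new HashMap<String, String>();
    user.put("distancefunction", "EuclideanDistanceFunction"); // shadowed by the predefined value
    user.put("epsilon", "0.5");                                // only the user supplies this one

    return new SketchChainedLookup(predefined, user);
  }
}
```

Under that assumption, the algorithm fixes the values it depends on, while anything it leaves open is still answered by the user's configuration.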

  10. Tracking parameterizations: Sometimes (e.g. for help functions, re-running, configuration templates etc.) it is required to track all parameters a (sub-)algorithm consumed. This can be done using a TrackParameters wrapper around the configuration. The wrapper has no configuration items or error recording of its own; instead, everything is forwarded to the inner configuration. It does, however, keep track of consumed values, which can then be used for re-parameterization of an Algorithm.
    // config is an existing parameterization
     TrackParameters trackpar = new TrackParameters(config);
     Database<V> tmpDB = PARTITION_DB_PARAM.instantiateClass(trackpar);
     Collection<Pair<OptionID, Object>> dbpars = trackpar.getGivenParameters();
     
    (This is an example from COPAC.)
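
The forwarding-and-recording behaviour of such a wrapper can be sketched in a few lines. All names below (SketchSettings, SketchMapSettings, SketchTracker) are hypothetical and chosen for this illustration; the sketch does not reproduce the ELKI TrackParameters implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal configuration interface; NOT the ELKI Parameterization API.
interface SketchSettings {
  String get(String option); // null when the option is not set

  void reportError(String message);
}

// Simple map-backed configuration with its own error list.
class SketchMapSettings implements SketchSettings {
  final Map<String, String> values = new HashMap<String, String>();
  final List<String> errors = new ArrayList<String>();

  public String get(String option) {
    return values.get(option);
  }

  public void reportError(String message) {
    errors.add(message);
  }
}

// Wrapper without values or error list of its own: everything is forwarded
// to the inner configuration, and consumed values are remembered.
class SketchTracker implements SketchSettings {
  private final SketchSettings inner;
  private final Map<String, String> consumed = new LinkedHashMap<String, String>();

  SketchTracker(SketchSettings inner) {
    this.inner = inner;
  }

  public String get(String option) {
    String value = inner.get(option);
    if (value != null) {
      consumed.put(option, value); // remember what was actually used
    }
    return value;
  }

  public void reportError(String message) {
    inner.reportError(message); // errors are recorded by the inner configuration
  }

  Map<String, String> getGivenParameters() {
    return consumed;
  }
}
```

Only options that were requected and found end up in the recorded map, so it can later be replayed to configure a second instance the same way.
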
  11. Advanced tracking: When parameterizing a sub-algorithm, it can be useful to provide some parameters that should not be tracked (because the actual values will only be available afterwards). This is possible by using a ChainedParameterization of untracked and tracked values.

    Example:

    // config is an existing parameterization
     ListParameterization myconfig = new ListParameterization();
     // dummy values for input and output
     myconfig.addParameter(INPUT_ID, "/dev/null");
     myconfig.addParameter(OUTPUT_ID, "/dev/null");      
     TrackParameters track = new TrackParameters(config);
     ChainedParameterization chain = new ChainedParameterization(myconfig, track);
     wrapper = WRAPPER_PARAM.instantiateClass(chain);
     

For documentation, the classes should also be annotated with Title, Description and Reference (where possible).

ELKI Version 0.7.0

Copyright © 2015 ELKI Development Team, Lehr- und Forschungseinheit für Datenbanksysteme, Ludwig-Maximilians-Universität München. License information.