Aha Instance-based series (IBL)

Aha-ib is an external inducer that interfaces the IB1-4 series from 3/9/94. IB_CLASS should be set to one of the following values: ib1, ib2, ib3, or ib4. The seed and specific flags can be set in the options IBL_SEED and IBL_FLAGS. The executable "ibl" must be in the current path.
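The two setup requirements above (a valid IB_CLASS and an "ibl" executable in the path) can be checked before an experiment starts. The following Python sketch is not part of MLC++. It assumes the options are supplied as environment variables, and the function name check_ibl_setup is hypothetical.

```python
import shutil

# The four values the text lists as valid for IB_CLASS.
VALID_IB_CLASSES = ("ib1", "ib2", "ib3", "ib4")


def check_ibl_setup(env, find_executable=shutil.which):
    """Return a list of problems; an empty list means the setup looks usable.

    env is a mapping of option names to values (assumed here to be the
    process environment). find_executable is injectable so the check can
    be exercised without a real ibl binary.
    """
    problems = []
    ib_class = env.get("IB_CLASS")
    if ib_class not in VALID_IB_CLASSES:
        problems.append(
            f"IB_CLASS is {ib_class!r}; expected one of "
            + ", ".join(VALID_IB_CLASSES)
        )
    if find_executable("ibl") is None:
        problems.append("executable 'ibl' was not found in the current path")
    return problems
```

A typical call would be check_ibl_setup(os.environ), printing each returned problem and aborting if the list is not empty.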

IBL is a research system and is not very robust. It does not check when its limits are exceeded and sometimes goes out of bounds on arrays, corrupting memory and usually core dumping. If it crashes, the most probable cause is that some constant in datastructures.h is too small. In the version distributed with MLC++ we have increased the limits to 200 attributes and 10,000 instances.
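Because IBL does not check its own limits, it can help to compare a dataset against them before running. The sketch below is an illustration, not MLC++ code. It assumes a comma-separated data file with one instance per line and the label in the last field, and it treats the limits of 200 attributes and 10,000 instances as inclusive, which the text does not confirm. The name check_limits is hypothetical.

```python
# Limits stated for the version of IBL distributed with MLC++.
MAX_ATTRIBUTES = 200
MAX_INSTANCES = 10_000


def check_limits(lines, max_attributes=MAX_ATTRIBUTES,
                 max_instances=MAX_INSTANCES):
    """Return a list of limit violations for an iterable of data lines."""
    num_instances = 0
    widest = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        num_instances += 1
        # Assumption: the last comma-separated field is the label,
        # so it does not count as an attribute.
        widest = max(widest, len(line.split(",")) - 1)

    problems = []
    if widest > max_attributes:
        problems.append(f"{widest} attributes exceeds limit {max_attributes}")
    if num_instances > max_instances:
        problems.append(
            f"{num_instances} instances exceeds limit {max_instances}"
        )
    return problems
```

If the check reports a violation, the alternative is to raise the corresponding constant in datastructures.h and rebuild IBL.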

We have discovered some problems when trying to run IBL on many datafiles:

  1. If the files are too small (few instances), the test set accuracy is not reported (e.g., soybean-small).

  2. The program probably leaks memory. It required more than 150MB for the mushroom dataset.

  3. It does not handle spaces in attribute values, which can cause problems in some files (this could be taken care of in the MLC++ conversion code, but it is very rare so we do not handle it yet).
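For the third problem, one possible workaround is to rewrite the data file before handing it to IBL so that no attribute value contains a space. The sketch below shows the idea in Python. It is not the MLC++ conversion code, it assumes comma-separated fields, and the function name and the underscore filler are hypothetical choices.

```python
import re


def replace_inner_spaces(line, filler="_"):
    """Strip each comma-separated field and replace whitespace inside it.

    Runs of whitespace within a value become a single filler character,
    so "dark brown" turns into "dark_brown".
    """
    fields = [field.strip() for field in line.split(",")]
    return ",".join(re.sub(r"\s+", filler, field) for field in fields)
```

The same replacement would have to be applied to any file that lists the legal attribute values, so that the names still match.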

Contact David Aha with questions, problems, and requests for the source code.

Ronny Kohavi
Sun Oct 6 23:17:50 PDT 1996