SGI

MLC++, Color GLDs

Color GLDs (General Logic Diagrams) are now available in MLC++. Please note that they will print in greyscale on most printers and may not show well on non-color monitors.

Since the full instance space is too big to display for most datasets, the GLDs can both show a projected space and train an induction algorithm on it. For the datasets below, a good projection was found using a feature subset selection algorithm written in MLC++.
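As a rough illustration of what projecting the instance space means, the sketch below groups labelled instances by their values on a chosen feature subset, so that each distinct projected point corresponds to one box of the diagram. This is not MLC++ code: the function name `project`, the data layout, and the toy data are assumptions made for the example.

```python
from collections import defaultdict


def project(instances, feature_indices):
    """Group labelled instances by their values on the chosen features.

    instances: iterable of (feature_tuple, label) pairs (hypothetical layout).
    Returns a dict mapping each projected point (one box of the diagram)
    to the list of labels of the instances that fall into it.
    """
    boxes = defaultdict(list)
    for features, label in instances:
        key = tuple(features[i] for i in feature_indices)
        boxes[key].append(label)
    return dict(boxes)


# Toy data: the third feature is irrelevant to the label.
data = [
    ((0, 0, 0), "a"),
    ((0, 0, 1), "a"),
    ((1, 0, 0), "b"),
    ((1, 1, 1), "b"),
]

# Keep only the first two features; instances that differ only in the
# dropped feature now share a box.
projected = project(data, [0, 1])
for box, labels in sorted(projected.items()):
    print(box, labels)
```

When most boxes contain a single label, as in this toy data, the projected problem is easy for a learner.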

For each dataset, we show the training set (usually O's and X's) over the predicted space, and then the test set over the predicted space. The different colors denote the different classes. For the test set, an O marks a box in which every test instance was predicted correctly; an X marks a box in which at least one test instance was mispredicted.
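The O/X rule for the test set can be sketched as follows. The function name `mark_test_boxes`, the `predict` callback, and the toy classifier are hypothetical; the sketch only assumes that a prediction is available for every box of the projected space.

```python
def mark_test_boxes(test_instances, predict, feature_indices):
    """Return a dict mapping each box that holds test instances to "O" or "X".

    A box is marked "O" only if every test instance in it is predicted
    correctly; a single mistake turns the box into "X".
    """
    marks = {}
    for features, label in test_instances:
        box = tuple(features[i] for i in feature_indices)
        if predict(box) != label:
            marks[box] = "X"           # one mistake is enough, and it sticks
        else:
            marks.setdefault(box, "O")  # stays "O" unless a mistake appears
    return marks


# Toy classifier over a one-feature projected space (an assumption).
def toy_predict(box):
    return "pos" if box[0] == 1 else "neg"


test_set = [
    ((1, 0), "pos"),  # correct
    ((1, 1), "neg"),  # same box (1,), mispredicted
    ((0, 1), "neg"),  # correct
]
marks = mark_test_boxes(test_set, toy_predict, [0])
print(marks)
```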

  • Zoo: Predict the class of an animal. There are seven classes.

    Comments: Projecting the space almost "solves" the problem. Most instances in the projected space match in their label, so this becomes an "easy" problem. ID3's accuracy on the original space: 88.2% (hence it improves on the projected space).


  • Monk1: Predict whether the robot has jacket color red, or head-shape = body-shape.

    Comments: Three of the features are irrelevant. The problem becomes trivial when projected onto the three relevant features, a subset that the MLC++ feature subset selector easily finds. We show the full space because it is small enough to display.


  • Parity: The concept is the parity of 5 bits (Bits 1,2,3,5,7), with 5 irrelevant bits.

    Comments: The concept is considered very hard and very unnatural. However, once the space is projected onto the 5 relevant bits, all 32 possible instances appear in the training set.


  • Chess: King and rook vs. king and pawn on a7. The concept is whether white can win.

    Comments: The space has 33 boolean features and one ternary feature, for a total of 2^33 x 3 = 25,769,803,776 possibilities. The problem turns out to be reasonably easy on the projected space. The table-majority inducer attempts to find a matching instance in the training set and returns that instance's label. If no match is found, it returns the majority class.


  • Vote: Predict whether a congressman is a Republican or a Democrat (1984) based on key votes. The space is very big (3^16 = 43,046,721). The best projected spaces have 1-4 features. We chose to show a bigger space.
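The table-majority inducer described under Chess can be sketched in a few lines and tried on the Parity concept. This is a minimal sketch under stated assumptions, not the MLC++ implementation: the class name `TableMajority`, the choice of the most common label when several training instances share a projected point, the 1-based reading of "Bits 1,2,3,5,7", and the synthetic training set are all assumptions made for the example.

```python
from collections import Counter
from itertools import product


class TableMajority:
    """Lookup table over a projected space with a majority-class fallback."""

    def __init__(self, feature_indices):
        self.feature_indices = feature_indices
        self.table = {}
        self.majority = None

    def _project(self, features):
        return tuple(features[i] for i in self.feature_indices)

    def fit(self, instances):
        instances = list(instances)
        counts = {}
        for features, label in instances:
            counts.setdefault(self._project(features), Counter())[label] += 1
        # Assumption: conflicting labels at one point resolve to the most common.
        self.table = {k: c.most_common(1)[0][0] for k, c in counts.items()}
        # Fallback label for points never seen in training.
        self.majority = Counter(l for _, l in instances).most_common(1)[0][0]
        return self

    def predict(self, features):
        return self.table.get(self._project(features), self.majority)


# Parity over 10 bits; bits 1,2,3,5,7 are read as 1-based (an assumption).
RELEVANT = [0, 1, 2, 4, 6]


def parity(bits):
    return sum(bits[i] for i in RELEVANT) % 2


# Synthetic training set: one instance per setting of the relevant bits,
# with the irrelevant bits left at zero.
train = []
for values in product((0, 1), repeat=5):
    bits = [0] * 10
    for i, v in zip(RELEVANT, values):
        bits[i] = v
    train.append((tuple(bits), parity(bits)))

model = TableMajority(RELEVANT).fit(train)

# Every 10-bit instance projects onto a point already in the table.
all_correct = all(
    model.predict(b) == parity(b) for b in product((0, 1), repeat=10)
)
print(len(model.table), all_correct)
```

Because all 32 projected points are covered by the training set, the lookup never needs the majority fallback here. On a larger projected space, such as the one shown for Vote, unseen points would receive the majority class.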