COLOR GLDs (General Logic Diagrams) are now
available in MLC++. Please note that they will print in greyscale on most
printers and may not show well on non-color monitors.
Since the full instance space is too big for most datasets, the GLDs can be
shown over a projected space, AND the induction algorithm can be trained on
that same projected space. A good projection was found by using a feature
subset selection algorithm written in MLC++.
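As a rough illustration of what "projecting" means here, the Python sketch
below keeps only a chosen subset of feature positions and groups the labelled
instances that land in the same box. The function name project, the toy data,
and the chosen subset are assumptions made for this example; they are not part
of MLC++.

```python
from collections import defaultdict

def project(instances, feature_subset):
    """Group labelled instances by their values on the chosen features only."""
    boxes = defaultdict(list)
    for features, label in instances:
        # The box is identified by the values of the retained features.
        key = tuple(features[i] for i in feature_subset)
        boxes[key].append(label)
    return dict(boxes)

# Toy data (hypothetical): three features, of which only 0 and 2 are kept.
instances = [
    ((0, 1, 1), "yes"),
    ((0, 0, 1), "yes"),
    ((1, 1, 0), "no"),
]
projected = project(instances, feature_subset=[0, 2])
print(projected)  # {(0, 1): ['yes', 'yes'], (1, 0): ['no']}
```

The first two instances differ only on the dropped feature, so they share a
box in the projected space.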
For each dataset, we show the training set (usually O's and X's) over the
predicted space, and then the test set over the predicted space. The different
colors denote the different classes. For the test set, an O marks a box in
which every test instance was correctly predicted, and an X marks a box in
which at least one test instance was mispredicted.
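The O/X rule for the test set can be sketched in a few lines of Python. This
is only an illustration under assumed names: mark_boxes, the predict callable,
and the toy test set are hypothetical and do not reflect how MLC++ draws the
diagrams.

```python
def mark_boxes(test_instances, predict):
    """Return {box: 'O' or 'X'} for every box containing a test instance."""
    marks = {}
    for box, label in test_instances:
        if predict(box) != label:
            # A single mistake is enough to mark the whole box with X.
            marks[box] = "X"
        else:
            # Only record O if the box has not already been marked.
            marks.setdefault(box, "O")
    return marks

# Hypothetical predictor: class "a" when the first feature is 0, else "b".
def predict(box):
    return "a" if box[0] == 0 else "b"

test_set = [((0, 0), "a"), ((0, 0), "a"), ((1, 0), "b"), ((1, 0), "a")]
marks = mark_boxes(test_set, predict)
print(marks)  # {(0, 0): 'O', (1, 0): 'X'}
```

Box (1, 0) gets an X even though one of its two test instances was predicted
correctly, which matches the "at least one mistake" rule.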
- Zoo: Predict the class of an animal. There are seven classes.
Comments: Projecting the space almost "solves" the problem. Most instances
that fall in the same box of the projected space share a label, so this
becomes an "easy" problem. ID3's accuracy on the original space is 88.2%
(hence it improves on the projected space).
- Monk1: Predict whether the robot has jacket color red, or head-shape =
body-shape.
Comments: Three irrelevant features. The problem becomes trivial when
projected over the three relevant features, a subset that the MLC++
feature subset selector easily finds. We show the full space because we
can show it.
- Parity: The concept is the parity of 5 bits (Bits 1,2,3,5,7), with 5
irrelevant bits.
Comments: The concept is considered very hard and very unnatural. However,
once the space is projected onto the 5 relevant bits, all 32 possible
instances appear in the training set.
- Chess: King rook vs. king pawn on a7. The concept is whether white can win.
Comments: Space has 33 boolean features and one ternary feature, for a
total of 25,769,803,776 possibilities. The problem turns out to be
reasonably easy on the projected space. The table-majority inducer
attempts to find a matching instance in the training set and returns that
instance's label. If no match is found, it returns the majority class.
- Vote: Predict whether a congressman is a Republican or a Democrat (1984)
based on key votes. The space is very big (3^16 = 43,046,721). The
best projected spaces have 1-4 features. We chose to show a bigger
space.
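The table-majority inducer mentioned under Chess can be sketched in Python as
a lookup table with a majority-class fallback. The class name TableMajority
and its fit/predict methods are assumptions for this example, not the MLC++
interface. The sketch also assumes a non-empty training set and, when several
matching training instances disagree, returns the most common label among
them; the description above does not say how such conflicts are resolved.

```python
from collections import Counter, defaultdict
from itertools import product

class TableMajority:
    """Look up an exact match in the training set; else return the majority."""

    def fit(self, instances):
        per_box = defaultdict(Counter)
        overall = Counter()
        for features, label in instances:
            per_box[tuple(features)][label] += 1
            overall[label] += 1
        # Assumption: conflicting matches resolve to their most common label.
        self.table = {k: c.most_common(1)[0][0] for k, c in per_box.items()}
        # Fallback label for instances never seen in training.
        self.majority = overall.most_common(1)[0][0]
        return self

    def predict(self, features):
        return self.table.get(tuple(features), self.majority)

# Parity over the 5 relevant bits: all 32 instances are in the training set,
# so every lookup finds an exact match.
train = [(bits, sum(bits) % 2) for bits in product((0, 1), repeat=5)]
model = TableMajority().fit(train)
all_correct = all(model.predict(bits) == label for bits, label in train)
print(all_correct)  # True
```

This also shows why Parity stops being hard once the projection is right: with
every instance of the projected space present in training, the table alone
answers every query and the majority fallback is never used.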