The conv utility provides simple conversions of the data for
algorithms that do not deal well with categorical attributes or
that require a slightly different input format. Two encodings
for nominal attributes are provided:
- Local encoding: Each value of a categorical attribute is made
  into an indicator attribute. For a given value in the data file, the
  appropriate indicator attribute is set to one, and all other
  indicator attributes that represent the same categorical attribute
  are set to zero. An unknown value causes all indicator attributes
  to be zero.
- Binary encoding: A categorical variable with k possible
  values is encoded in $\lceil \log_2(k+1) \rceil$ bits. Value i is
  mapped to the binary representation of i+1, and binary
  zero is reserved for unknown values.
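
The two encodings can be illustrated with a minimal Python sketch. This is not the conv utility's code or output format. The function names `local_encode` and `binary_encode` are hypothetical, `None` stands in for an unknown value, values are numbered from 0 in the order they are listed, and bits are written most significant first. The bit count is assumed to be the smallest number that can hold the k+1 codes (k values plus the zero code for unknown).

```python
from typing import List, Optional, Sequence


def local_encode(value: Optional[str], values: Sequence[str]) -> List[int]:
    """One indicator per possible value; an unknown (None) yields all zeros."""
    return [1 if value == v else 0 for v in values]


def binary_encode(value: Optional[str], values: Sequence[str]) -> List[int]:
    """Map value i to the bits of i+1; an unknown (None) maps to all zeros."""
    k = len(values)
    # For k >= 1, k.bit_length() equals ceil(log2(k + 1)): enough bits for
    # the codes 0 (unknown) through k.
    n_bits = k.bit_length()
    code = 0 if value is None else values.index(value) + 1
    # Emit the bits most significant first (an assumption of this sketch).
    return [(code >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]


colors = ["red", "green", "blue"]        # hypothetical attribute with k = 3
print(local_encode("green", colors))     # [0, 1, 0]
print(local_encode(None, colors))        # [0, 0, 0]
print(binary_encode("red", colors))      # [0, 1]  (i = 0, code 1)
print(binary_encode("blue", colors))     # [1, 1]  (i = 2, code 3)
print(binary_encode(None, colors))       # [0, 0]  (unknown)
```

With three values, local encoding produces three indicator attributes, while binary encoding needs only two bits.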

Ronny Kohavi
Sun Oct 6 23:17:50 PDT 1996