


The conv utility provides simple conversions of the data for algorithms that do not deal well with categorical attributes or that require a slightly different input format. Two encodings for nominal attributes are provided:

Local encoding
Each value of a categorical attribute is made into an indicator attribute. For a given value in the data file, the corresponding indicator attribute is set to one, and all other indicator attributes derived from the same original attribute are set to zero. An unknown value causes all of these indicator attributes to be zero.

Binary encoding
A categorical attribute with k possible values is encoded into ceil(log2(k+1)) bits. Value i (counting from 0) is mapped to the binary representation of i+1, and binary zero is reserved for unknown values.
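
The two encodings can be illustrated with a short Python sketch. This is not the conv utility's own code. The function names, the use of "?" to mark an unknown value, the most-significant-bit-first ordering, and the choice of the smallest bit width that holds the codes 0 through k are all assumptions made for illustration.

```python
def local_encode(value, values):
    """Local encoding: one indicator per possible value.

    Any value not found in `values` (for example the assumed
    unknown marker "?") yields all zeros.
    """
    return [1 if value == v else 0 for v in values]


def binary_encode(value, values):
    """Binary encoding: value number i (from 0) becomes the bits of i+1.

    Code 0 is kept for unknown values. The bit width is assumed to be
    the smallest one that can hold the codes 0..k.
    """
    k = len(values)
    n_bits = k.bit_length()  # smallest width that can represent k
    code = values.index(value) + 1 if value in values else 0
    # Emit bits most significant first (an assumed ordering).
    return [(code >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]


if __name__ == "__main__":
    colors = ["red", "green", "blue"]  # hypothetical attribute values
    for v in colors + ["?"]:
        print(v, local_encode(v, colors), binary_encode(v, colors))
```

With three possible values, the local encoding produces three indicator attributes per original attribute, while the binary encoding produces two bits. The binary form is more compact, but it imposes an arbitrary numeric structure on the values.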

Ronny Kohavi
Sun Oct 6 23:17:50 PDT 1996