SGI MLC++ Utilities (NT Edition) README file.    November 1997

Machine Learning Library in C++.
http://www.sgi.com/tech/mlc

Keywords: machine learning, accuracy estimation, cross-validation,
   bootstrap, ID3, decision trees, decision graphs, naive-bayes,
   decision tables, majority, induction algorithms, classifiers,
   categorizers, general logic diagrams, instance-based algorithms,
   discretization, lazy learning, bagging, MineSet.

MLC++ Team:

The MLC++ team is continuing to work on research and development of
machine learning techniques for data mining. The team is now working
at Silicon Graphics, as part of the MineSet project for data mining
and visualization: http://mineset.sgi.com.

Core team members include:
   Ronny Kohavi (manager)
   Cliff Brunk
   Alex Kozlov
   Clay Kunz
   Dan Sommerfield
   Eric Eros (contractor)

Questions or help requests related to the utilities should be
addressed to
   mlc@postofc.corp.sgi.com
Please see the MLC++ home page first:
   http://www.sgi.com/tech/mlc

______________________________________________________________________

Quick starter:

The MLC++ utilities are accessible through our web page:
   http://www.sgi.com/tech/mlc

The NT version of the MLC++ utilities is stored in a .zip archive,
which can be uncompressed by pkunzip as well as a number of other
common utilities. Assuming pkunzip, follow this procedure to unpack
the archive:

   1. Make a directory to hold the utilities.
   2. Place the archive file in that directory.
   3. Execute pkunzip within the directory, or use the Windows
      version of pkunzip to unpack the files.

The documentation is in utils.ps and it is currently OUT OF DATE.
Expect an update in the next month.

Environment variables:

The MLC++ utilities make heavy use of environment variables while
running.
Environment variables may be set from the NT command prompt as
follows:

   set VARIABLE=VALUE

You can check the value of a variable as follows:

   echo %VARIABLE%

The following environment variables need to be set to get MLC++ to
run:

   MLCDIR   must be set to the directory where the utilities are
            installed.
   MLCPATH  must be set to the directory where the databases are
            stored. Multiple directories may be specified in this
            variable, separated by colons.

All directories specified in MLCDIR or MLCPATH must use FORWARD
(UNIX-style) slashes. Drive letters should be specified as follows:

   //c/

where c can be replaced by the drive letter of your choice.

Example: If you installed the utilities in C:\MLC, you would execute
the following to set MLCDIR and MLCPATH:

   set MLCDIR=//c/MLC
   set MLCPATH=.://c/MLC/db

Note: This method for specifying path names is identical to that
supported by the GNU-Win32 project (see http://www.cygnus.com for
details), except that we do not support volume mounting.

File Formats:

MLC++ can read ASCII files in both UNIX and DOS formats seamlessly.
The sample data files provided with the kit are in UNIX format.
MLC++ will always produce DOS format text files when running under
NT.

If you have not "registered" through the web page, please do so at
   http://www.sgi.com/tech/mlc/mail.html

Databases in the MLC++ format, which is very similar to the C4.5
format, can be found at
   http://www.sgi.com/tech/mlc/db/
Most data files were converted from the repository at UC Irvine.

______________________________________________________________________
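
Path conversion sketch:

The path convention described under "Environment variables" above
(forward slashes, drive letters written as //c/, colon-separated
lists) can be sketched in a few lines of Python. This is only an
illustration and is not part of the MLC++ kit: the function names
to_mlc_path and make_mlcpath are hypothetical, and lower-casing the
drive letter is an assumption that simply mirrors the //c/ example.

```python
import ntpath


def to_mlc_path(windows_path):
    """Rewrite a Windows path (C:\\MLC) in the //c/MLC style.

    Hypothetical helper, not an MLC++ utility. Lower-casing the drive
    letter is an assumption taken from the README's //c/ example.
    """
    # ntpath applies Windows path rules on any platform.
    drive, rest = ntpath.splitdrive(windows_path)
    rest = rest.replace("\\", "/")
    if not drive:
        # Relative paths such as "." only need their slashes flipped.
        return rest
    if len(drive) != 2 or drive[1] != ":":
        # UNC shares and similar forms are outside this sketch.
        raise ValueError("expected a drive letter path: " + windows_path)
    return "//" + drive[0].lower() + "/" + rest.lstrip("/")


def make_mlcpath(directories):
    """Join already-converted directories with colons for MLCPATH."""
    return ":".join(directories)


if __name__ == "__main__":
    db_dir = to_mlc_path(r"C:\MLC\db")
    # Prints: set MLCPATH=.://c/MLC/db
    print("set MLCPATH=" + make_mlcpath([".", db_dir]))
```

The printed line can be pasted at the NT command prompt. Converting
each directory before joining keeps the drive-letter colon from being
mistaken for a list separator.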