MLC++: A Machine Learning Library in C++
11/21/97
Cliff Brunk

How to compile the MLC++ Sources under g++
-------------------------------------------

These notes are designed so MLC++ can be compiled on SUN using GNU GCC
2.7.2.3 or on Silicon Graphics using GCC 2.7.2.2.

** Please note that if you only want to work with MLC++, you can get
   the MLC++ utilities off our home page.  There are precompiled
   versions for SGI, SUN, and NT; these are much easier to install
   and run.

If you have any questions, send mail to mlc@postofc.corp.sgi.com.

Compiling MLC++ under GNU
-------------------------

*. Install the GNU gcc compiler.
   A precompiled executable version of GCC 2.7.2.3 for Solaris 2.5.1
   can be found at:
   ftp://sunsite.unc.edu/pub/solaris/sparc/GNUgcc.2.7.2.3.SPARC.Solaris.2.5.1.pkg.tgz

*. Get the MLCX.Y-src.tar.gz file from the MLC++ home page, where X.Y
   is the version number.

*. Create a top level MLC++ directory and make that the current
   working directory.  (e.g. cd; mkdir mlc; cd mlc)

*. Unpack the sources in that directory:
      gzcat MLCX.Y-src.tar.gz | tar xf -

*. Set MLCDIR to the top level MLC++ directory
   (e.g. setenv MLCDIR ~/mlc)

*. Set up the MLC++ environment variables for the SUN
   (source setup.SUN.GNU)
   It is recommended that you put this in your .login

*. Optional: Install dot/dotty in subdirectory graphviz
   Go to http://www.research.att.com/sw/tools/reuse/
   click binary software license agreement and mark graphviz.

*. Set your PATH to access:
   ar, ld, strip, as, gcc, g++, dot, dotty
   and all the external inducers that you will be using.

   The external inducers supplied with MLC++ (T2, cn2, ibl, oc1, pebls)
   can be built using the command:
      bin/buildexternal T2 cn2 ibl oc1 pebls
   The corresponding executables are stored in
      ${MLCDIR}/external/${MLC_EXTERNAL_TYPE}

   The following commands will modify your PATH allowing access to
   dot, dotty and lefty in the graphviz directory and the external
   inducer products stored in ${MLCDIR}/external/${MLC_EXTERNAL_TYPE}:
      setenv PATH "${MLCDIR}/graphviz/bin:${PATH}"
      setenv PATH "${MLCDIR}/external/${MLC_EXTERNAL_TYPE}:${PATH}:."

*. Some testers use c4.5, c5.0, cart and dotty.  These testers will
   fail if you do not have the products installed.  If this is the
   case, simply do "touch t_XXXX" where t_XXXX is the name of the
   tester that failed (no suffix).  The make will then ignore that
   tester.  Note that these files will be removed by a "make clean"
   or "make scratch".

   The verifiers for utilities (mlc/util dir) will also warn you about
   diff failures.  If these are related to C4.5 and the other tools,
   this is expected.

   Note: MLC++ provides an interface to cart; however, it requires
   that you rename the ascii conversion program from ascii to
   cart.ascii.

*. Build MLC++
      make          - in the top level MLC++ directory, builds all of
                      the MLC++ libraries, utilities and testers
      make notests  - builds the MLC++ libraries and utilities without
                      testing
      make tests    - builds the tests after the libraries have been
                      built

   If something fails, check the appropriate execution log, which is
   stored in a file named after the tester with a ".out" suffix.
   For example, the first file to fail if you don't have C4.5 is
   $MLCDIR/src/MTree/tests/t_C45Inducer.  The t_C45Inducer.out file
   will contain the error "c4.5: not found."

   To continue after you fix a problem, type "make" in $MLCDIR/src
   or in any subdirectory.  Use "make -r" when doing a make in the
   tests directories.  The -r disables make's built-in implicit rules,
   which would otherwise do the wrong thing.

   The default is to compile all the libraries and run all the testers.
   This is a long process.

   If you have to clean something, do it in a subdirectory.  There is
   no need to do it at the src/ level.  For example, if the
   compilation failed in src/MGraph/tests, do a "make clean" in MGraph.

*. If you develop code, we highly recommend working in DEBUGLEVEL 1
   until the code works well.

Known MLC++ problems on SUN
---------------------------

All of the known problems are related to testers and we are currently
working on solving them.  Most involve differences in real number
precision.  The following is a list of the known problems:

src/MTree/tests/t_RDGCat
   binary files differ (probably a SUN binary file issue)
src/MTree/tests/t_treeviz
   precision difference in t_treeviz.exp14
src/MInd/tests/t_NaiveBayesCat
   precision difference in t_NaiveBayesCat.exp3
src/MInd/tests/t_TableInducerMTrans/tests/t_PeblsInducer.tmp
   problem matching instances because of Real ==
src/MTrans/tests/t_CN2Inducer
   expected: ERROR : 0.2
   observed: ERROR : 0.21
src/MTrans/tests/t_MultiSplitCat2
   doesn't generate output files .out1, .out2, .out3
src/MTrans/tests/t_SGIDTInducer
   t_SGIDTInducer.exp7 generates a different set of models under GNU
src/MFSS/tests/t_TableCasInd
   expected: ERROR : 0.633191489362
   observed: ERROR : 0.632553191489

Cliff Brunk (brunk@sgi.com)
====================