Document Type Master's Dissertation Author Potgieter, Gavin URN etd-07222005-104751 Document Title Mining continuous classes using evolutionary computing Degree MSc (Computer Science) Department Computer Science Supervisor
Advisor Name Title Prof A P Engelbrecht Committee Chair Keywords
- data mining
Date 2003-04-01 Availability unrestricted AbstractData mining is the term given to knowledge discovery paradigms that attempt to infer knowledge, in the form of rules, from structured data using machine learning algorithms. Specifically, data mining attempts to infer rules that are accurate, crisp, comprehensible and interesting. There are not many data mining algorithms for mining continuous classes. This thesis develops a new approach for mining continuous classes. The approach is based on a genetic program, which utilises an efficient genetic algorithm approach to evolve the non-linear regressions described by the leaf nodes of individuals in the genetic program's population. The approach also optimises the learning process by using an efficient, fast data clustering algo¬rithm to reduce the training pattern search space. Experimental results from both algorithms are compared with results obtained from a neural network. The experimental results of the genetic program is also compared against a commercial data mining package (Cubist). These results indicate that the genetic algorithm technique is substantially faster than the neural network, and produces comparable accuracy. The genetic program produces substantially less complex rules than that of both the neural network and Cubist.
© 2002, University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
Please cite as follows:
Potgieter, G 2002, Mining continuous classes using evolutionary computing, MEng dissertation, University of Pretoria, Pretoria, viewed yymmdd < http://upetd.up.ac.za/thesis/available/etd-07222005-104751/ >
Filename Size Approximate Download Time (Hours:Minutes:Seconds)
28.8 Modem 56K Modem ISDN (64 Kb) ISDN (128 Kb) Higher-speed Access dissertation.pdf 2.19 Mb 00:10:07 00:05:12 00:04:33 00:02:16 00:00:11