The Bayesian classifier is working, and now I'm on to the decision tree classifier. What I noticed immediately is that these two classifiers don't appear to be useful for the same types of data. Dr. Hung gave me several sets of data, and I think they would all be classified more accurately with the Bayesian classifier.
Before I started this project, I hadn't really thought about how different data categorizes in different ways. The book I've been using for my algorithms is Toby Segaran's excellent Programming Collective Intelligence. In the chapter where he compares these classifiers, he points out that data can masquerade as meaningful, but if it's manipulated in the wrong way, the results will be meaningless. It's very important to pick the right type of classifier for the right type of data.
My challenge in coding the decision tree is to write it so that I can use both the iris measurement data and the data in the Collective Intelligence book. My time is starting to run very short, but so far the decision tree classifier's logic is much easier to code and test.
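As a rough sketch of how one splitting step could serve both kinds of data: numeric columns (like the iris measurements) can be split with a threshold test, and everything else with an equality test, which is similar in spirit to the approach in the book. The function name divide_set, the tuple row layout with the label in the last position, and the sample rows are all assumptions for illustration, not the actual project code.

```python
from numbers import Number


def divide_set(rows, column, value):
    """Split rows into (matching, non_matching) on a single column.

    Numeric values are compared with >=, anything else with ==.
    Hypothetical helper; the name and row layout are assumptions.
    """
    # bool is a subclass of int in Python, so exclude it from the numeric path.
    if isinstance(value, Number) and not isinstance(value, bool):
        matches = lambda row: row[column] >= value
    else:
        matches = lambda row: row[column] == value

    matching = [row for row in rows if matches(row)]
    non_matching = [row for row in rows if not matches(row)]
    return matching, non_matching


# Illustrative rows only: all-numeric features with a species label last.
iris_like = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
]

# Illustrative rows only: mixed text and numeric features with a label last.
visitor_like = [
    ("google", "UK", "no", 21, "Premium"),
    ("slashdot", "USA", "yes", 18, "None"),
    ("google", "France", "yes", 23, "Basic"),
]

# Numeric split: petal length (column 2) at or above 2.5.
long_petals, short_petals = divide_set(iris_like, 2, 2.5)

# Categorical split: referrer (column 0) equal to "google".
from_google, from_elsewhere = divide_set(visitor_like, 0, "google")
```

Because the comparison is chosen from the type of the split value, the same function works on both data sets without any per-data-set branching in the tree-building code.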
The categories for both of these data sets have been pretty simple. I'm wondering what would happen if my categories got more complex.