Project 4 - Neural Network, Decision Tree, and Performance Evaluation - Due 04/23/19
Basic requirements (80)
All of the following tasks are based on the "fglass" data set from
Ripley's book. The data set is divided into 10 subsets so that it is
easier for the TA to grade performance. Use 10-fold cross validation
for all required tasks unless otherwise specified.
- Task 1 (30 pts): Using kNN as an example, implement a procedure
general enough to perform m-fold cross validation with any learning
algorithm. Try out different values of k and, based on the
cross-validation performance, determine the best k for this data set.
Sample enough values of k between 1 and sqrt(n), e.g., 1, 5...15,
and sqrt(n).
- Task 2 (25 pts): Use existing packages to build a decision tree
classifier and evaluate it using 10-fold cross validation.
- Task 3 (25 pts): Use existing packages to implement a 3-layer
neural network. Split the data 80-10-10 into training, validation,
and test sets. Try out different numbers of hidden nodes, e.g., 2,
..., 15, and plot the performance curve.
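
For Task 1, the idea of a cross-validation procedure that works with
any learning algorithm can be sketched as below. This is only a
minimal illustration under stated assumptions, not a required design:
the names cross_validate and knn_predict, the "learner" callable
interface, and the toy data are all hypothetical, and the toy data
merely stands in for the 10 fglass subsets.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, test_X, k):
    """Label each test point by majority vote among its k nearest training points."""
    preds = []
    for x in test_X:
        # Training indices ordered by Euclidean distance to x (nearest first).
        order = sorted(range(len(train_X)), key=lambda i: math.dist(x, train_X[i]))
        votes = Counter(train_y[i] for i in order[:k])
        # On a tied vote, most_common keeps first-seen order, so the label
        # of the nearer neighbor wins.
        preds.append(votes.most_common(1)[0][0])
    return preds


def cross_validate(folds, learner):
    """Return the mean accuracy of `learner` over m folds.

    folds:   list of (X, y) pairs, one per subset, so m = len(folds).
    learner: any callable (train_X, train_y, test_X) -> predicted labels.
             This interface is an assumption made for this sketch.
    """
    accuracies = []
    for i, (test_X, test_y) in enumerate(folds):
        # Train on every fold except the i-th, which is held out for testing.
        train_X = [x for j, (X, _) in enumerate(folds) if j != i for x in X]
        train_y = [y for j, (_, Y) in enumerate(folds) if j != i for y in Y]
        preds = learner(train_X, train_y, test_X)
        correct = sum(p == t for p, t in zip(preds, test_y))
        accuracies.append(correct / len(test_y))
    return sum(accuracies) / len(accuracies)


# Hypothetical toy data (NOT fglass): two well-separated clusters in 4 folds.
folds = [
    ([(0.0, 0.0), (5.0, 5.0)], ["a", "b"]),
    ([(0.1, 0.2), (5.1, 4.9)], ["a", "b"]),
    ([(0.2, 0.1), (4.9, 5.2)], ["a", "b"]),
    ([(0.3, 0.3), (5.2, 5.1)], ["a", "b"]),
]

n = sum(len(X) for X, _ in folds)
results = {}
for k in (1, 3, int(math.sqrt(n))):
    # Bind k as a default argument so each learner keeps its own value.
    results[k] = cross_validate(
        folds, lambda trX, trY, teX, k=k: knn_predict(trX, trY, teX, k)
    )

for k, acc in sorted(results.items()):
    print(f"k={k}: mean accuracy {acc:.3f}")
```

Because cross_validate only sees a callable, a decision tree or a
neural network from an existing package could be wrapped in the same
way and passed in without changing the cross-validation code.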
Report (20)