Project 1: Supervised Learning Using Bayesian Decision Rule - Two Category Classification (Due 09/19)

Objective:

The first objective of this project is to learn how to implement supervised learning algorithms based on Bayesian decision theory. The second is to familiarize you with the design flow for applying machine learning algorithms to real-world problems. Practical considerations include, for example, 1) selecting the right pdf model to characterize the data distribution in the training set, 2) selecting the right ratio of prior probabilities, 3) the different ways to evaluate the performance of a learning algorithm, and 4) how differently the same ML algorithm performs when applied to different datasets.

Data Sets:

The synthetic dataset: synth.tr (the training set) and synth.te (the test set) from Ripley's Pattern Recognition and Neural Networks.
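As a starting point, the two files can be read into feature and label arrays. The sketch below assumes each file is whitespace-delimited with one header row and the class label in the last column; check your copy of the data before relying on this. The function name load_synth is hypothetical.

```python
import numpy as np

def load_synth(path):
    """Load a whitespace-delimited file into (features, labels).

    Assumptions (verify against your copy of the data):
    one header row, numeric columns, class label in the last column.
    """
    data = np.loadtxt(path, skiprows=1, ndmin=2)
    X = data[:, :-1]              # all columns but the last are features
    y = data[:, -1].astype(int)   # last column is the class label
    return X, y

# Example usage (paths are placeholders):
# X_train, y_train = load_synth("synth.tr")
# X_test, y_test = load_synth("synth.te")
```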

Algorithm:

You need to implement four classifiers based on Bayesian decision theory: the three cases of the discriminant function (parametric learning) and kNN (non-parametric learning). They are 1) the minimum Euclidean distance classifier (linear machine), 2) the minimum Mahalanobis distance classifier (linear machine), 3) the generic form of the Bayesian decision rule (quadratic machine), where a Gaussian pdf is assumed, and 4) kNN.
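The four classifiers above can be sketched in NumPy as follows. This is a minimal illustration, not a reference solution. All function names are hypothetical. How the single variance (case 1) and the shared covariance (case 2) are estimated is an assumption made here: both are simple averages of the per-class estimates, and other choices are possible. The kNN sketch uses a plain majority vote, which amounts to taking the priors from the training class frequencies.

```python
import numpy as np

def fit_gaussians(X, y):
    """Estimate a mean vector and covariance matrix for each class."""
    classes = np.unique(y)
    means = [X[y == c].mean(axis=0) for c in classes]
    covs = [np.cov(X[y == c], rowvar=False) for c in classes]
    return classes, means, covs

def discriminant_scores(X, means, covs, priors, case):
    """Return an (n_samples, n_classes) array of g_i(x) for case 1, 2, or 3."""
    d = X.shape[1]
    if case == 1:
        # Assumption: one variance for all classes and features,
        # taken as the average of the per-class diagonal entries.
        sigma2 = np.mean([np.trace(S) / d for S in covs])
        covs = [sigma2 * np.eye(d)] * len(covs)
    elif case == 2:
        # Assumption: shared covariance is the unweighted class average.
        shared = np.mean(covs, axis=0)
        covs = [shared] * len(covs)
    # Case 3 keeps one covariance matrix per class.
    scores = []
    for mu, S, p in zip(means, covs, priors):
        diff = X - mu
        # Squared Mahalanobis distance of every sample to this class mean.
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(S), diff)
        # The log-determinant term is constant across classes in cases 1
        # and 2, so it only affects the decision in case 3.
        scores.append(-0.5 * maha - 0.5 * np.log(np.linalg.det(S)) + np.log(p))
    return np.column_stack(scores)

def predict(X, classes, means, covs, priors, case):
    """Assign each sample to the class with the largest discriminant."""
    scores = discriminant_scores(X, means, covs, priors, case)
    return classes[np.argmax(scores, axis=1)]

def knn_predict(X_train, y_train, X_test, k=3):
    """Classify by majority vote among the k nearest training samples."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]
    classes = np.unique(y_train)
    counts = np.stack([(votes == c).sum(axis=1) for c in classes], axis=1)
    return classes[np.argmax(counts, axis=1)]
```

Passing the priors as an explicit argument makes it easy to vary the prior ratio and observe its effect on the decision boundary.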

Performance Metrics:

Three metrics are used to evaluate the performance of the ML algorithms: 1) overall classification accuracy, 2) classwise classification accuracy, and 3) run time.
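One possible way to compute these three metrics is sketched below. The helper names are hypothetical, and run time is measured here as wall-clock time with time.perf_counter, which is only one reasonable choice.

```python
import time
import numpy as np

def overall_accuracy(y_true, y_pred):
    """Fraction of all samples that are classified correctly."""
    return float(np.mean(y_true == y_pred))

def classwise_accuracy(y_true, y_pred):
    """Fraction of correctly classified samples within each true class."""
    return {c: float(np.mean(y_pred[y_true == c] == c))
            for c in np.unique(y_true)}

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Reporting classwise accuracy alongside overall accuracy shows whether a change in the prior ratio improves one class at the expense of the other.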

Tasks: