Final Project
The objective of the final project is to integrate various machine learning techniques to achieve the best performance on the chosen dataset. The final project is a group effort; each group can have 2-4 members. You are required to apply ALL techniques learned this semester, including those for feature extraction/selection, dimensionality reduction, classification/regression, and fusion. Your design needs to be well justified, and its pros and cons thoroughly analyzed through comprehensive experimental design and performance evaluation.
Schedule
- (5) Milestone 1 (Due 11/28): Group formation and topic selection. Submit through Canvas. Approval and comments will be returned within one day. A topic cannot be chosen by more than one group; topics are assigned on a first-come, first-served basis, so pick a topic as soon as possible.
- (5) Milestone 2 - Literature Survey (Due 12/05): Background study, including references and state-of-the-art performance on the dataset. A 2-page report needs to be submitted on Canvas.
- (5) Milestone 3 - Prototype 1 (Due 12/07):
Prototype, preliminary results and task allocation among group members. Apply at least one learned technique successfully for each component in the pipeline on the chosen dataset and submit a 1-page report.
- (100) Final presentation (Due 12/13). Presentation slides are due after the presentation on 12/13; submit through Canvas.
- (85) Final report (Due 12/14). Submit through Canvas.
Implement at least two solutions to each component of the pipeline. Determine what metrics to use. Provide performance evaluation results.
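As an illustration of choosing metrics and comparing two solutions, here is a minimal sketch for a binary classification task. The labels and predictions (`y_true`, `pred_a`, `pred_b`) are made-up toy values, and the function names are hypothetical; the appropriate metrics for your project depend on the dataset and on whether the task is classification or regression.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary problem with the given positive label."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy ground truth and the outputs of two hypothetical classifiers.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
pred_a = [1, 0, 0, 1, 0, 1, 1, 0]
pred_b = [1, 1, 1, 1, 0, 1, 0, 0]

for name, pred in (("solution A", pred_a), ("solution B", pred_b)):
    print(name, evaluate(y_true, pred))
```

Reporting several metrics side by side, rather than accuracy alone, makes it easier to justify one solution over another when the classes are imbalanced.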
Potential Topics
Each group can choose one topic from the Kaggle
Competitions site or a topic from other sources. All selections need to be approved by the instructor.
Requirement
General steps involved in a machine learning problem include:
- Data collection (raw data)
- Feature extraction (how to extract features from the raw data)
- Feature reduction (dimensionality reduction: Fisher's linear discriminant, PCA, or t-SNE)
- Classification/Regression (the following methods need to be included)
  - Supervised learning and unsupervised learning
  - Bayesian approaches and non-Bayesian approaches
  - Parametric and non-parametric density estimation in supervised learning
- Fusion
- Performance evaluation
- Feedback system
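A few of the steps above can be sketched end to end on toy data. The example below is only a minimal illustration, assuming a two-class problem with two numeric features: it uses standardization as a stand-in for preprocessing, a nearest-mean rule (which coincides with a Gaussian Bayes classifier only under equal priors and a shared isotropic covariance), k-nearest neighbors as a non-parametric method, and majority voting as one simple form of fusion. Feature extraction and dimensionality reduction are omitted, and all data values and function names are hypothetical.

```python
import math
from collections import Counter


def standardize(train, test):
    """Scale features to zero mean and unit variance using training statistics only."""
    n_feat = len(train[0])
    means = [sum(r[j] for r in train) / len(train) for j in range(n_feat)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in train) / len(train)) or 1.0
            for j in range(n_feat)]

    def scale(rows):
        return [[(r[j] - means[j]) / stds[j] for j in range(n_feat)] for r in rows]

    return scale(train), scale(test)


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_mean_predict(X_train, y_train, X_test):
    """Summarize each class by its mean vector and assign the closest class."""
    centroids = {}
    for label in set(y_train):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return [min(centroids, key=lambda c: dist(x, centroids[c])) for x in X_test]


def knn_predict(X_train, y_train, X_test, k=3):
    """Non-parametric rule: vote among the k closest training samples."""
    preds = []
    for x in X_test:
        neighbors = sorted(zip(X_train, y_train), key=lambda pair: dist(x, pair[0]))[:k]
        preds.append(Counter(y for _, y in neighbors).most_common(1)[0][0])
    return preds


def majority_vote(*prediction_lists):
    """Fusion: per-sample majority vote over the classifiers' predictions."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*prediction_lists)]


# Toy data: two well-separated clusters (made-up values).
X_train = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [6.0, 7.0], [6.5, 6.8], [5.8, 7.2]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[1.1, 2.1], [6.2, 7.1]]

X_train_s, X_test_s = standardize(X_train, X_test)
p_mean = nearest_mean_predict(X_train_s, y_train, X_test_s)
p_1nn = knn_predict(X_train_s, y_train, X_test_s, k=1)
p_3nn = knn_predict(X_train_s, y_train, X_test_s, k=3)
fused = majority_vote(p_mean, p_1nn, p_3nn)
print(fused)
```

Using an odd number of classifiers avoids ties in the vote for a two-class problem; in a real project the fused output would then be compared against each individual classifier in the performance evaluation step.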