Project 3 - Training VGGNet (Due 10/09)
Objectives:
The objective of this project is to use TensorFlow to implement a
benchmark CNN structure, VGGNet, and apply it to a larger-scale problem.
Through this project you will also learn how to take a pretrained model
and fine-tune it to solve your own problem.
Data set used:
CIFAR-10.
- This site gives you an overview of what CIFAR-10 is and how to
download the dataset.
- This site tracks the current best results on CIFAR-10.
Requirements:
- Task 1: Train a VGGNet from scratch.
- Study the sample code from Yang Song in Lecture 7 (VGGFace) on how
to train a VGGNet from scratch for face recognition. Adapt the design
to CIFAR-10.
- Tune the hyperparameters to reach the best performance you can
achieve.
- Task 2: Use a pretrained VGG model.
- Download a pretrained VGG model.
- Fine-tune the pretrained model (i.e., retrain it on the CIFAR-10
dataset), adjusting the hyperparameters for better performance.
Note that this is an open-ended problem. The
leaderboard will be frequently updated to report the best in class.
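When adapting the design to CIFAR-10 in Task 1, it may help to first check how a 32x32 input shrinks through the pooling stages. The sketch below is only an illustration and makes several assumptions: the VGG-16 layout (configuration D in [VGGNet:2014]), 3x3 convolutions with "same" padding and stride 1, and 2x2 max-pooling with stride 2. The names `VGG16_CFG` and `trace_vgg` are hypothetical and are not taken from the Lecture 7 sample code.

```python
# Assumed VGG-16 (configuration D) layer plan: integers are the output
# channels of a 3x3 convolution, "M" marks a 2x2 max-pool with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def trace_vgg(input_size, in_channels=3, cfg=VGG16_CFG):
    """Return ([(spatial_size, channels) after each pool], conv parameter count)."""
    size, channels, conv_params = input_size, in_channels, 0
    stages = []
    for layer in cfg:
        if layer == "M":
            # 2x2 max-pool with stride 2 halves the spatial size.
            size //= 2
            stages.append((size, channels))
        else:
            # 3x3 conv with "same" padding keeps the spatial size;
            # parameters = kernel weights + one bias per output channel.
            conv_params += 3 * 3 * channels * layer + layer
            channels = layer
    return stages, conv_params


cifar_stages, cifar_conv_params = trace_vgg(32)    # CIFAR-10 images are 32x32
imagenet_stages, _ = trace_vgg(224)                # original VGG input size

print("CIFAR-10 stages:", cifar_stages)
print("224x224 final stage:", imagenet_stages[-1])
print("conv parameters:", cifar_conv_params)
```

Under these assumptions a 32x32 image reaches 1x1x512 after the fifth pooling stage, so the flattened feature vector has 512 entries instead of the 7x7x512 = 25088 produced by a 224x224 input. The fully connected layers that follow can be sized with that in mind.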
Report:
Please submit the following through Canvas before midnight on the due date.
- The evaluation graphs (e.g., convergence graphs) that you generated for both Tasks 1 and
2. Discuss the effect of different hyperparameter settings.
- Discuss various design options regarding the cost function and
activation function. Discuss your experience with training a CNN
structure from scratch vs. using a pretrained model. This part of the report should not exceed 2
pages (double-spaced).
- (For 692 students only) Read [LeCun:1998], [AlexNet:2012], [GoogLeNet:2014],
[VGGNet:2014], [ResNet:2015], and [SENet:cvpr:2018]. Discuss the core ideas that make
each structure such a success. This part of the
report should not exceed 3 pages (double-spaced).
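For the discussion of cost functions, a small numeric comparison can be a useful reference point when reading a convergence graph. The sketch below is a minimal illustration in plain Python, not the course's implementation: it assumes a 10-class softmax output and compares cross-entropy with squared error on the softmax probabilities. The function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])


def squared_error(logits, label):
    """Sum of squared differences between probabilities and the one-hot target."""
    probs = softmax(logits)
    return sum((p - (1.0 if i == label else 0.0)) ** 2
               for i, p in enumerate(probs))


# An untrained 10-class model that scores every class equally.
uniform_logits = [0.0] * 10
print("cross-entropy at chance:", cross_entropy(uniform_logits, label=3))
print("squared error at chance:", squared_error(uniform_logits, label=3))
```

For equal scores over 10 classes the cross-entropy equals ln(10), roughly 2.303, and the squared error equals 0.9. A training curve that starts well above the chance-level value of whichever cost you use, or never drops below it, is worth investigating before tuning other hyperparameters.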