
Transfer learning and Efficient Training of deep learning models, sample assignment

Updated: Oct 5, 2021

Assignment objective

This assignment gives you feedback on your learning of deep learning theory and its application to data analytics or artificial intelligence problems.


It builds on Assignment 1 but requires a higher level of mastery of deep learning theory and programming/engineering skills. In particular, you will experience training a much deeper network on a large-scale dataset. You will encounter practical issues that help you consolidate your textbook learning.


Task 1

Solving MNIST with Convolutional Neural Networks


In Assignment 1, you tackled the image classification problem in MNIST. There, you used a Densely Connected Neural Network. You should now know that this is not an optimal model architecture for the problem. In Assignment 2, you will apply the best practices of deep-learning computer vision to achieve better image classification performance.


Task 1.1

Revisit MNIST classification with DNN


Review your Assignment 1 solution, and reproduce the experiment here. Try to improve the model without changing the model architecture.


Task 1.2

Train a ConvNet from scratch


Build a ConvNet to replace the densely connected network in Task 1.1. Report the classification accuracy on the test set. Aim to achieve higher accuracy.


Task 1.3

Build an input pipeline for data augmentation


Build a data preprocessing pipeline to perform data augmentation. (You may use the Keras ImageDataGenerator or write your own transformations.)

  • Report the new classification accuracy. Make sure that you use the same number of training epochs as in Task 1.2.

  • (Optional) Profile your input pipeline to identify the most time-consuming operation. What actions have you taken to address that slow operation? (Hint: You may use the TensorFlow Profiler.)
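If you choose to write your own transformations, a minimal, dependency-free sketch of one augmentation (a small random translation) might look like the following. The function names `shift_image` and `random_shift` and the default `max_shift=2` are assumptions made for illustration, not part of the assignment. Small shifts are used here because they keep a digit's label unchanged, whereas flips can turn a digit into something that no longer matches its label.

```python
import random


def shift_image(image, dx, dy, fill=0):
    """Translate a 2D image (a list of rows) by dx columns and dy rows.

    Pixels shifted in from outside the image are set to `fill`.
    """
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted


def random_shift(image, max_shift=2, rng=random):
    """Apply a random translation of at most `max_shift` pixels per axis."""
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return shift_image(image, dx, dy)
```

In a real pipeline the same idea would normally be applied to whole batches of arrays at once. Because the augmentation is applied on the fly to the training data only, the test set stays untouched.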


Task 1.4

MNIST with transfer learning


Use a pretrained model as the convolutional base to improve the classification performance. (Hint: You may use models in Keras Applications or those in TensorFlow Hub.)

  • Try both with fine-tuning and without fine-tuning

  • Report the model performance as before

Task 1.5

Performance comparison

How many parameters are trainable in each of the two settings (with and without fine-tuning)? How does the difference impact the training time?

Which setting achieved higher accuracy? Why did it work better for this problem? Have we benefitted from using the pretrained model?
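To reason about the first question, it can help to count parameters by hand. The sketch below uses the standard formulas for convolutional and dense layers; the helper names and the toy layer sizes in the usage note are hypothetical and do not correspond to any particular pretrained model.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, use_bias=True):
    """Parameters of a 2D convolution: one kernel per (input, output) channel pair."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)


def dense_params(in_features, out_features, use_bias=True):
    """Parameters of a fully connected layer."""
    return in_features * out_features + (out_features if use_bias else 0)


def count_parameters(layers):
    """Split a list of (param_count, is_trainable) pairs into two totals.

    Returns (trainable_total, frozen_total).
    """
    trainable = sum(n for n, is_trainable in layers if is_trainable)
    frozen = sum(n for n, is_trainable in layers if not is_trainable)
    return trainable, frozen
```

For a toy model with one 3x3 convolution (3 to 16 channels) as the base and a 64-to-10 dense head, `count_parameters([(conv2d_params(3, 3, 3, 16), False), (dense_params(64, 10), True)])` describes the frozen-base setting, and flipping the first flag to `True` describes fine-tuning. In Keras, `model.summary()` reports the same trainable and non-trainable totals for a real model.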


Task 2

Fast training of deep networks



Task 2.1

Train a highly accurate network for CIFAR10


In this task, you will train deep neural networks on the CIFAR10 dataset. Compared with the datasets that you have worked on so far, CIFAR10 represents a relatively larger multi-class classification problem and presents a great opportunity for you to solve a "harder" problem.


Task 2.1.1
Document the hardware used

Before you start, write down your hardware specifications, including

  • the GPU model, the number of GPUs, and the GPU memory

  • the CPU model, the number of CPUs, and the CPU clock speed

(Hint: you may find commands like nvidia-smi, lscpu or psutil useful.)
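One possible way to collect part of this information from inside a notebook is sketched below, using only the Python standard library plus `nvidia-smi` when it is installed. The function name `describe_hardware` and the dictionary keys are assumptions for illustration. The sketch does not report the CPU clock speed; `lscpu` or `psutil.cpu_freq()` can be used for that.

```python
import os
import platform
import shutil
import subprocess


def describe_hardware():
    """Return a small dictionary describing the CPU and any NVIDIA GPUs."""
    info = {
        # platform.processor() can be empty on some systems, so fall back
        # to the machine architecture string.
        "cpu_model": platform.processor() or platform.machine(),
        "logical_cpus": os.cpu_count(),
        "gpus": [],
    }
    if shutil.which("nvidia-smi"):
        try:
            result = subprocess.run(
                ["nvidia-smi", "--query-gpu=name,memory.total",
                 "--format=csv,noheader"],
                capture_output=True, text=True, check=False, timeout=10,
            )
        except (OSError, subprocess.TimeoutExpired):
            return info
        if result.returncode == 0:
            # One line per GPU, e.g. "<model name>, <memory> MiB".
            info["gpus"] = [line.strip()
                            for line in result.stdout.splitlines()
                            if line.strip()]
    return info
```

Calling `describe_hardware()` at the top of the notebook and printing the result keeps the hardware record next to the timing results it explains.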


Task 2.1.2
Train a "shallow" ConvNet

Build a ConvNet with fewer than 10 layers. Train the network until it converges. You will use this network as a baseline for the later experiments.

  • Plot the training and validation history

  • Report the testing accuracy


Task 2.1.3
Train a ResNet

Train a residual neural network (ResNet) on the CIFAR10 training data and report the test accuracy and the training time.

The ResNet is a popular network architecture for image classification. You may find more information about how ResNet works by reading this paper.

(You may implement a ResNet model or use an existing implementation. In either case, you should not use pretrained network weights.)
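The core idea of a residual block, adding the block's input back onto the output of its layers through a shortcut connection, can be illustrated with a deliberately simplified sketch. This version is an assumption-laden toy: it uses small dense layers on plain Python lists instead of convolutions, and it omits batch normalisation, so it is not a usable ResNet building block.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def linear(values, weights, bias):
    """Dense layer: weights is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """Compute relu(x + F(x)) where F is two dense layers with a ReLU between.

    The output of the second layer must have the same length as x so that
    the shortcut addition is well defined.
    """
    hidden = relu(linear(x, w1, b1))
    branch = linear(hidden, w2, b2)
    return relu([xi + fi for xi, fi in zip(x, branch)])
```

When the branch weights are all zero, the block reduces to `relu(x)`: the shortcut carries the input through unchanged, so the layers only have to learn a correction to it.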


Task 2.2

Fast training of ResNet


In this task, you will experiment with different ways to reduce the time for training your ResNet on CIFAR10. There are different ways to speed up neural network training; below are two ideas. Please select at least one idea to implement. Explain the experiment steps and report the final performance and training time.


Option 1. Learning rate schedule

Use a learning rate schedule for the training. Some popular learning rate schedules include:

  • The step decay learning rate (e.g., see here)

  • Cyclical learning rates

  • The exponential learning rate

Also, Keras provides some convenient functions that you can use.
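As a rough illustration of the three schedules listed above, the sketch below writes each one as a plain Python function. The function names and all default values (`drop=0.5`, `epochs_per_drop=10`, `decay_rate=0.1`) are assumptions chosen for the example, not recommended settings. The cyclical schedule follows the common "triangular" form, which is only one of several cyclical policies.

```python
import math


def step_decay(epoch, initial_lr, drop=0.5, epochs_per_drop=10):
    """Multiply the learning rate by `drop` every `epochs_per_drop` epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)


def exponential_decay(epoch, initial_lr, decay_rate=0.1):
    """Decay the learning rate smoothly as initial_lr * exp(-decay_rate * epoch)."""
    return initial_lr * math.exp(-decay_rate * epoch)


def triangular_cyclical(iteration, base_lr, max_lr, step_size):
    """Rise linearly from base_lr to max_lr over `step_size` iterations,
    fall back over the next `step_size` iterations, then repeat."""
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)
```

An epoch-based function of this kind can be wrapped and passed to the Keras `LearningRateScheduler` callback, which calls the schedule with the epoch index and the current learning rate. The cyclical schedule changes per iteration rather than per epoch, so it would need a per-batch mechanism instead.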


Option 2. Lookahead optimiser

Read this paper and implement the Lookahead optimiser.
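The update rule can be sketched on a toy problem before wiring it into a deep learning framework. The example below keeps a set of slow weights, runs `k` steps of an inner optimiser on a copy (the fast weights), and then moves the slow weights a fraction `alpha` of the way towards the fast weights. Plain gradient descent is used as the inner optimiser purely for simplicity; the function name `lookahead_sgd` and all default hyperparameters are assumptions for illustration.

```python
def lookahead_sgd(grad_fn, init_weights, inner_lr=0.1, k=5, alpha=0.5,
                  outer_steps=20):
    """Minimise a function with Lookahead wrapped around plain gradient descent.

    grad_fn maps a list of weights to a list of gradients of the same length.
    """
    slow = list(init_weights)
    for _ in range(outer_steps):
        # Fast weights start each outer step from the current slow weights.
        fast = list(slow)
        for _ in range(k):
            grads = grad_fn(fast)
            fast = [w - inner_lr * g for w, g in zip(fast, grads)]
        # Slow weights interpolate towards where the fast weights ended up.
        slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return slow


def quadratic_grad(weights):
    """Gradient of f(w) = sum(w_i ** 2), used here as a toy objective."""
    return [2.0 * w for w in weights]
```

Running `lookahead_sgd(quadratic_grad, [1.0, -2.0])` drives both weights towards zero, the minimum of the toy objective. With a stateful inner optimiser such as Adam or SGD with momentum, the inner optimiser's state would also need to be handled across outer steps, which this sketch does not cover.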


Task 2.3

Performance comparison


Based on the above experiments, which method or combination of methods results in the best accuracy for the same training time?
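One way to make the comparison fair is to fix a time budget and read off the best validation accuracy each method reached within it. The helper below is a sketch under the assumption that you logged, for every epoch, the elapsed wall-clock time and the validation accuracy; the function name and the log format are hypothetical.

```python
def best_accuracy_within_budget(history, budget_seconds):
    """Return the best accuracy reached no later than `budget_seconds`.

    history is a list of (elapsed_seconds, accuracy) pairs, one per epoch.
    Returns None if no epoch finished within the budget.
    """
    reached = [acc for elapsed, acc in history if elapsed <= budget_seconds]
    return max(reached) if reached else None


def rank_methods(histories, budget_seconds):
    """Rank methods by best accuracy within the budget, best first.

    histories maps a method name to its (elapsed_seconds, accuracy) log.
    Methods with no epoch inside the budget are left out.
    """
    scores = {name: best_accuracy_within_budget(log, budget_seconds)
              for name, log in histories.items()}
    ranked = [(name, acc) for name, acc in scores.items() if acc is not None]
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

The elapsed times can be recorded with `time.perf_counter()` at the end of each epoch, measured from the start of training.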


Task 3

(HD level task) Research on new models

Today, ResNet has become a very mature ConvNet architecture. In this task, you will research one recent ConvNet architecture. You may choose an architecture from the reference list below.

Write a short report for your research, covering these points:

  • Identify the main issues that your chosen architecture aims to address. (For example, does it try to reduce the number of parameters or to speed up the training?)

  • What measures does the architecture use to reduce the number of parameters, reduce the training cost, or improve the model performance?

Implement the architecture and compare its performance on CIFAR10 with ResNet. You may include your implementation, experiments, and analyses here in this notebook.


