Date |
Topic (with linked notes / slides) |
Additional reading |
Aug 22 |
Introduction |
|
Aug 24 |
Image Formation - Geometry |
|
Aug 29 |
All about rotations | Image formation - color |
|
Aug 31 |
Reconstruction - I |
|
Sep 5 |
Reconstruction - II (Epipolar Geometry) |
|
Sep 7 |
The correspondence problem |
|
Sep 12 |
Optical flow |
Szeliski 8.4 |
Sep 14 |
Grouping |
Contour detection
Graph-based segmentation (Szeliski 5.4, 5.5)
Segmentation for object proposals (Selective search)
|
Sep 19 |
Introduction to machine learning
Example case: logistic regression
Empirical risk minimization
Classical (pre-convnet) recognition
|
Bag-of-words, Spatial pyramids
|
Sep 21 |
Non-linear classifiers and Neural networks
Convolutional networks
|
Deformable part models
MNIST (Sections I, II and III. Also read the rest and contemplate the cyclical nature of research)
|
Sep 26 |
Backpropagation and computation graphs
Image classification
|
ImageNet
|
Sep 28 |
Transfer learning
Convolutional network architectures |
Transfer learning (Many examples)
VGG16, VGG19, 3x3 convolutions
Batch normalization
Highway networks
Residual networks
|
Oct 3 |
Object detection
Datasets and metrics
|
R-CNN
Fast R-CNN
Faster R-CNN
SSD
|
Oct 5 |
Semantic segmentation
Datasets and metrics
|
FCN, skip connections
Dilated convolutions, CRFs
|
Oct 12 |
Instance segmentation
Pose estimation
Datasets and metrics
|
Dataset, metrics, segmentation as region classification
Hypercolumns / skip connections, segmentation as detection refinement
Instance segmentation using FCNs
Heatmap representations, graphical model based refinement
Sequential prediction, autocontext and inference machines
Hourglass architectures
|
Oct 17 |
Learning for 3D
Datasets and metrics
|
Rigid body pose estimation
Deep stereo
Learning to correspond for stereo
Depth estimation from a single image
Normal estimation from a single image
|
Oct 19 |
Learning correspondence
|
Learning optical flow from simulated data
Learning from hallucinated data
Learning from constraints
|
Oct 31 |
Detour: Writing
Video recognition
Datasets and metrics
|
Video classification as frame+flow classification
CNN+LSTM
3D convolution
I3D
|
Nov 2 |
Vision and language
|
Captioning
Visual question answering
Attention-based systems
Problems with VQA
|
Nov 7 |
Reducing supervision
One- and few-shot learning
|
Classic unsupervised learning (See Chapter 2)
Self-supervised learning
Learning from noisy labels
|
Nov 9 |
Vision and action
Active perception
|
Learning from ego-motion
Learning tasks in robotics
|
Nov 14 |
GANs
|
Generative Adversarial Networks
CycleGANs
|
Nov 16 |
Adversarial examples and interpreting convnets
|
Adversarial examples
|
Nov 30 |
Taking inspiration from biology
|
Invariance in biological vision
Comparing classical computer vision with the brain
Comparing deep networks with the brain
The development of embodied cognition
|