Special Topics in Computer Vision
CS7670, Fall 2011, Cornell University
Time: Tu/Th 2:55pm - 4:10pm
Place: Upson 315
Instructor: Noah Snavely (snavely@cs.cornell.edu)
Office: Upson 4157
Office Hours: TBA
In the past decade computer vision has made incredible progress across
the board, in geometry, recognition, image processing, and other areas.
In this graduate seminar in computer vision, we will survey and discuss
state-of-the-art research papers in this quickly moving field, with a
focus on 3D geometry estimation, image matching and retrieval, use of
the Internet to gather and annotate data, and scene understanding. The
readings will draw on papers from both computer vision and computer graphics
venues.
Prerequisites
Students are expected to have a working knowledge of computer vision at
the level of CS6670 (Computer Vision) or equivalent, and should be
willing and able to understand and analyze recent conference papers in
this area. If you are unsure whether this course is right for you, please
come talk to me or send me email. Perusing a few papers on the syllabus is
a good way to gauge what kind of background is necessary. This course is
expected to be interactive, relevant to the latest research, and, most of
all, fun.
Preliminary Schedule
Date |
Topics |
Papers and links |
Presenters |
Items due |
Aug 26 |
Course intro |
handout |
|
Topic preferences due via CMS
by Tuesday August 30 |
Aug 30 |
No class -- instructor out of
town |
|
|
|
Sep 1 |
No class -- instructor out of
town |
|
|
|
Sep 6 |
Object Detection and Exemplars
|
- *
Ensemble of Exemplar-SVMs for Object Detection and
Beyond. Malisiewicz, Gupta, Efros, ICCV 2011. [pdf,code,www]
- Recognition by
association via learning per-exemplar distances.
Malisiewicz and Efros, CVPR 2008. [pdf,www]
- Beyond Categories: The
Visual Memex Model for Reasoning About Object
Relationships. Malisiewicz and Efros, NIPS 2009. [pdf,
www]
- An exemplar model for learning
object classes. Chum and Zisserman, CVPR 2007. [pdf]
|
Noah
[ppt,pdf]
|
|
Sep 8 |
Saliency
|
- *
Learning to Predict Where Humans Look. T. Judd, K.
Ehinger, F. Durand, A. Torralba. ICCV 2009. [pdf,
www]
|
Noah
[ppt,pdf]
|
|
I. 3D Geometry
|
Sep 13 |
Multi-view
Stereo
|
- * Reconstructing
Building Interiors from Images. Furukawa, Curless,
Seitz, Szeliski [pdf,
www]
- * Piecewise Planar and
Non-Planar Stereo for Urban Scene Reconstruction.
Gallup, Frahm, Pollefeys. CVPR 2010. [pdf,
www,
wmv]
- Manhattan-World
Stereo. Furukawa, Curless, Seitz, Szeliski, CVPR 2009.
[pdf,
www]
- Piecewise planar
stereo for image-based rendering. Sinha, Steedly, Szeliski, ICCV
2009. [pdf,
www]
|
Ivo
[pdf]
|
|
Sep 15 |
User-Assisted 3D Reconstruction
|
- * Interactive 3D
Architectural Modeling from Unordered Photo
Collections. Sinha, Steedly, Szeliski, Agrawala,
Pollefeys, SIGGRAPH Asia 2008. [pdf,
www]
- Active Learning for
Piecewise Planar 3D Reconstruction. Kowdle, Chang,
Gallagher, Chen, CVPR 2011. [pdf,
www]
- 3D Modeling with
Silhouettes. Rivers, Durand, Igarashi. SIGGRAPH
2010. [pdf,
www]
|
Adarsh
[pptx, pdf]
|
|
Sep 20 |
New 3D Sensors
|
- * Real-Time Human Pose
Recognition in Parts from Single Depth Images. Shotton
et al., CVPR 2011. [pdf]
- RGB-D Mapping:
Using depth cameras for dense 3D modeling of indoor
environments. Henry, Krainin, Herbst, Ren, Fox.
ISER 2010. [pdf,
www]
- Autonomous
Generation of Complete 3D Object Models Using Next Best
View Manipulation Planning. Krainin, Curless, Fox,
ICRA 2011. [pdf]
- Kernel Descriptors for
Visual Recognition, Bo et al., NIPS 2010. [pdf]
- A Large-Scale
Hierarchical Multi-View RGB-D Object Dataset, Lai et
al., ICRA 2011. [pdf,www]
|
Zhaoyin
[pdf]
|
Project proposals due |
Sep 22 |
Structure from
motion
|
- * Semantic Structure from
Motion. Bao and Savarese, CVPR 2011. [pdf,
www]
- Building Rome in a
Day. Agarwal, Snavely, Simon, Seitz, Szeliski. ICCV
2009. [pdf,
www,
code]
- Building Rome on a
Cloudless Day. Frahm, Georgel, Gallup, Johnson,
Raguram, Wu, Jen, Dunn, Clipp, Lazebnik, Pollefeys [pdf,
www,
code]
- Disambiguating
Visual Relations Using Loop Constraints. Zach,
Klopschitz, Pollefeys, CVPR 2010. [pdf]
|
Ian
[ppt, pdf]
|
|
II. Computational Photography
|
Sep 27 |
Computational Photography
Intrinsic Images and White Balance
|
- * Light Mixture Estimation for Spatially Varying White
Balance. Hsu, Mertens, Paris, Avidan, Durand. SIGGRAPH
2008. [pdf,
www]
- * User Assisted Intrinsic Images. Bousseau, Paris,
Durand, SIGGRAPH Asia 2009. [pdf,
www]
|
Daniel, Ivo
[pdf
(Daniel), pdf
(Ivo)]
|
|
Sep 29 |
Computational Photography
Fun with Light Transport
|
- * Dual Photography. Sen, Chen, Garg, Marschner,
Horowitz, Levoy, Lensch. SIGGRAPH 2005. [pdf,
www]
- Optical Computing for Fast Light Transport
Analysis. O'Toole and Kutulakos, SIGGRAPH Asia 2010.
[pdf,
www]
- Compressive Light Transport Sensing. Peers,
Mahajan, Lamond, Ghosh, Matusik, Ramamoorthi, Debevec, TOG
2009. [pdf,
www]
- Wavelet Environment Matting. Peers, Dutre. EGSR
2003. [pdf,
www]
- Symmetric Photography: Exploiting Data-sparseness in
Reflectance Fields. Garg, Talvala, Levoy, Lensch, EGSR
2006. [pdf,
www]
|
Kevin [pdf] |
|
Oct 4 |
Illumination
|
- * Estimating Natural Illumination from a Single
Outdoor Image. Lalonde, Efros, Narasimhan, ICCV 2009.
[pdf,
www]
- Detecting Ground Shadows in Outdoor Consumer
Photographs. Lalonde, Efros, Narasimhan. ECCV 2010.
[pdf,
www]
- Single-Image Shadow Detection and Removal using Paired
Regions. Guo, Dai, Hoiem, CVPR 2011. [pdf,
www]
|
Chun-Po
[ppt, pdf]
|
|
III. Image Matching and
Retrieval |
Oct 6 |
Large-Scale Image Collections
|
- * Small codes and large
databases for recognition. Torralba, Fergus, Weiss,
CVPR 2008. [pdf,
www]
- * ImageNet: A
Large-Scale Hierarchical Image Database. J. Deng, W.
Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, CVPR 2009.
[pdf,
www]
- Nonparametric scene
parsing: Label transfer via dense scene alignment. C.
Liu, J. Yuen and A. Torralba. CVPR, 2009. [pdf,
www]
- 80 million tiny images:
a large dataset for non-parametric object and scene
recognition. Torralba, Fergus, Freeman. PAMI 2008. [pdf]
- Attribute Learning in
Large-scale Datasets. O. Russakovsky and L. Fei-Fei,
Proc. ECCV Workshop on Parts and Attributes, 2010. [pdf]
- What does classifying
more than 10,000 image categories tell us? J. Deng, A.
Berg, K. Li and L. Fei-Fei, ECCV 2010. [pdf]
|
Henry and Yimeng
[pdf]
|
|
Oct 11 |
Fall break -- no classes |
- |
- |
- |
Oct 13 |
Image Representations
|
- * What You Saw is Not
What You Get: Domain Adaptation Using Asymmetric Kernel
Transforms. Kulis, Saenko, Darrell, CVPR 2011. [pdf]
- * Informative Feature
Selection for Object Recognition via Sparse PCA.
Naikal, Yang, Sastry, ICCV 2011. [pdf,
www]
- Image Retrieval with
Geometry Preserving Visual Phrases. Zhang, Jia, Chen, CVPR
2011. [pdf]
- Beyond Bags of Features:
Spatial Pyramid Matching for Recognizing Natural Scene
Categories. Lazebnik, Schmid, Ponce, CVPR 2006. [pdf,
code,
slides]
- Video Google: A Text
Retrieval Approach to Object Matching in Videos. Sivic and Zisserman,
ICCV 2003.
[pdf,
demo]
- Scalable Recognition
with a Vocabulary Tree. Nister and Stewenius, CVPR
2006. [pdf,
slides]
|
Song
[pptx]
|
|
Oct 18 |
Instructor out of town -- no class |
- |
- |
- |
Oct 20 |
Guest Lecture -- Andy Gallagher (Kodak), Dhruv Batra
(TTI) |
|
|
|
Oct 25 |
Image Representations (Sparse Coding)
|
- * Linear Spatial Pyramid
Matching Using Sparse Coding for Image Classification.
Yang, Yu, Gong, Huang, CVPR 2009. [pdf,
www]
- * Locality-constrained
Linear Coding for Image Classification. Wang, Yang, Yu,
Lv, Huang, Gong. CVPR 2010. [pdf,
www]
|
Ruogu |
|
Oct 27 |
Feature Detection and Matching
|
- * Edge Foci Interest
Points. Zitnick and Ramnath, ICCV 2011. [pdf,
www]
- Boundary-Preserving
Dense Local Regions. Kim and Grauman, CVPR 2011. [pdf,
www]
- LDAHash: Improved
Matching with Smaller Descriptors. Strecha, Bronstein,
Bronstein, Fua, PAMI Submission. [pdf,
code,
www]
- Distinctive Image Features from
Scale-Invariant Keypoints. Lowe, IJCV 2004. [pdf,
code, other implementations of SIFT]
- Local Invariant Feature
Detectors: A Survey. Tuytelaars and Mikolajczyk.
Foundations and Trends in Computer Graphics and Vision,
2008. [pdf] [Oxford code] [Read pp. 178-188, 216-220,
254-255]
- SURF: Speeded Up Robust
Features. Bay, Ess, Tuytelaars, and Van Gool, CVIU
2008. [pdf] [code]
- Robust Wide Baseline Stereo
from Maximally Stable Extremal Regions. J. Matas, O.
Chum, M. Urban, and T. Pajdla, BMVC 2002. [pdf]
- A Performance Evaluation of
Local Descriptors. Mikolajczyk and Schmid, CVPR 2003.
[pdf]
- Oxford group interest point
software
- Andrea Vedaldi's code, including
SIFT, MSER, hierarchical k-means.
- INRIA LEAR team's software,
including interest points, shape features
|
Daniel |
Project updates due
Friday |
Nov 1 |
Machine Learning for Image Matching
|
- * Fast Keypoint
Recognition using Random Ferns. Özuysal, Calonder,
Lepetit, Fua, PAMI, March 2010. [pdf,
www]
- * Decision Tree Fields.
Nowozin, Rother, Bagon, Yao, Sharp, Kohli, ICCV 2011. [pdf]
- Descriptor Learning for
Efficient Retrieval. Philbin, Isard, Sivic,
Zisserman. ECCV 2010. [pdf]
- Learning a Fine
Vocabulary. Mikulík, Perdoch, Chum, Matas. ECCV
2010. [pdf]
|
Ian, Song |
|
IV. Object Recognition and Scene
Understanding
|
Nov 3 |
Geometric Context
|
- * Closing the Loop on Scene Interpretation. Hoiem,
Efros, and Hebert, CVPR 2008.
- * Recovering Occlusion Boundaries from a Single
Image. Hoiem, Stein, Efros, and Hebert. [pdf,
www,
code]
- Recovering Surface Layout from a Single Image.
Hoiem, Efros, and Hebert. [pdf,
code]
- Thinking Inside the Box: Using Appearance Models and
Context Based on Room Geometry. Hedau, Hoiem, Forsyth,
ECCV 2010. [pdf]
- Recovering the Spatial Layout of Cluttered Rooms.
Hedau, Hoiem, Forsyth, ICCV 2009. [pdf,
code,
www]
- Segmenting Scenes by Matching Image Composites.
Russell, Efros, Sivic, Freeman, Zisserman, NIPS 2009.
[pdf,
www]
- Learning a dense multi-view representation for
detection, viewpoint classification and synthesis of object
categories. Su, Sun, Li, Savarese, ICCV 2009. [pdf]
|
Zhaoyin, Adarsh |
|
Nov 8 |
Attributes
|
- * Describing Objects by their Attributes. Farhadi,
Endres, Hoiem, Forsyth, CVPR 2009. [pdf,
www]
- * Relative Attributes. Parikh and Grauman, ICCV
2011.
[pdf,
www]
- Attribute-Centric Recognition for Cross-Category
Generalization. Farhadi, Endres, Hoiem, CVPR 2010. [pdf]
|
Amir, Ruogu |
|
Nov 10 |
Materials
|
- * Inferring Reflectance under Real-world
Illumination. Romeiro, Zickler, IJCV. [pdf]
- * Exploring features in a Bayesian framework for
material recognition. Liu, Sharan, Adelson, Rosenholtz,
CVPR 2010. [pdf,
www]
- What An Image Reveals About Material Reflectance.
Chandraker, Ramamoorthi, ICCV 2011. [pdf]
|
Kevin, Chun-Po |
|
Nov 15 |
No class |
- |
- |
- |
Nov 17 |
No class |
- |
- |
- |
Nov 22 |
No class |
- |
- |
- |
Nov 24 |
Thanksgiving -- no classes |
- |
- |
- |
Nov 29 |
Event Recognition from Videos
|
- * Learning realistic human actions from movies.
I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. In
CVPR 2008. [pdf,www]
- Activity recognition using the velocity histories of
tracked keypoints. R. Messing, C. Pal, and H. A. Kautz.
ICCV 2009. [pdf,www]
- Behavior recognition via sparse spatio-temporal
features. P. Dollar, V. Rabaud, G. Cottrell, and S. J.
Belongie. PETS Workshop, 2005. [pdf]
- A “string of feature graphs” model for recognition of
complex activities in natural videos. U. Gaur, Y. Zhu,
B. Song, and A. Roy-Chowdhury. ICCV 2011. [pdf]
|
Yimeng
[pdf]
|
|
Dec 1 |
Image-to-text and Recognition in social context
|
- * Seeing People in Social Context: Recognizing People
and Social Relationships. Wang, Gallagher, Luo, and
Forsyth. ECCV 2010. [pdf]
- * Baby Talk: Understanding and Generating Simple Image
Descriptions. Kulkarni, Premraj, Dhar, Li, Choi, Berg,
and Berg. CVPR 2011. [pdf]
- Autotagging Facebook: Social Network Context Improves
Photo Annotation. Stone, Zickler, Darrell. Workshop on
Internet Vision. [pdf]
- Understanding Images of Groups of People.
Gallagher and Chen. CVPR 2009. [pdf]
- Estimating Age, Gender and Identity using First Name
Priors. Gallagher and Chen, CVPR 2008. [www]
|
Amir and Henry
|
- |
Dec 8
|
|
|
|
Final presentations |
Course Resources
TBA.
Academic Integrity
Each student in this course is expected to abide by the Cornell
University Code of Academic Integrity. Any work submitted by a student
in this course for academic credit must be the student's own work.
Violations of the rules will not be tolerated.
|