Paul Upchurch
PhD Candidate, Computer Science Department, Cornell University
345 Gates Hall
paulu at cs.cornell.edu
I am a PhD candidate at Cornell University working in the Graphics and Vision Group. My advisors are Kavita Bala, Noah Snavely, and Kilian Q. Weinberger. My research interests are computer vision, machine learning, computer graphics, and human computation. My work focuses on data-driven models for understanding and editing appearance in photographs, and on new approaches to labeling images with crowdsourced workers.
[ Google Scholar ]
Deep Feature Interpolation For Image Content Changes
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)
*Authors contributed equally
[ Webpage, PDF, Code ]
Abstract
We propose Deep Feature Interpolation (DFI), a new data-driven baseline for automatic high-resolution image transformation. As the name suggests, it relies only on simple linear interpolation of deep convolutional features from pre-trained convnets. We show that despite its simplicity, DFI can perform high-level semantic transformations like "make older/younger", "make bespectacled", "add smile", among others, surprisingly well—sometimes even matching or outperforming the state-of-the-art. This is particularly unexpected as DFI requires no specialized network architecture or even any deep network to be trained for these tasks. DFI therefore can be used as a new baseline to evaluate more complex algorithms and provides a practical answer to the question of which image transformation tasks are still challenging with the rise of deep learning.
BibTeX
@InProceedings{upchurch2017deep,
author = {Paul Upchurch and Jacob Gardner and Geoff Pleiss and Robert Pless and Noah Snavely and Kavita Bala and Kilian Q. Weinberger},
title = {Deep Feature Interpolation For Image Content Changes},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017}
}
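The core DFI operation is simple enough to sketch. The snippet below is a minimal, hedged illustration of linear interpolation in deep feature space, assuming torchvision's pre-trained VGG-19 as the feature extractor (a choice made here purely for illustration); recovering an edited image from the interpolated features, which DFI must also do, is only noted and not implemented.

# Hedged sketch of the DFI idea: linear interpolation of deep convnet features.
# Assumes torchvision is available; the final feature-to-image reconstruction
# is only indicated, not implemented.
import torch
import torchvision.models as models
import torchvision.transforms as T

device = "cuda" if torch.cuda.is_available() else "cpu"
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.to(device).eval()

preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor(),
                        T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

def phi(img):
    """Deep features of a PIL image (here: the full VGG-19 conv feature map, flattened)."""
    with torch.no_grad():
        return vgg(preprocess(img).unsqueeze(0).to(device)).flatten()

def attribute_vector(source_imgs, target_imgs):
    """w = mean(phi(target)) - mean(phi(source)), e.g., 'no smile' -> 'smile'."""
    f_src = torch.stack([phi(x) for x in source_imgs]).mean(0)
    f_tgt = torch.stack([phi(x) for x in target_imgs]).mean(0)
    return f_tgt - f_src

def interpolate(test_img, w, alpha=1.0):
    """Move the test image's features along w; mapping these features back to
    an image (solved by optimization in the paper) is not shown here."""
    return phi(test_img) + alpha * w

The attribute vector w is estimated from two example sets (e.g., faces without and with a smile); adding alpha * w to a test image's features moves it toward the target attribute.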
Interactive Consensus Agreement Games For Labeling Images
The AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2016)
[ PDF, Slides ]
Abstract
Scene understanding algorithms in computer vision are improving dramatically by training deep convolutional neural networks on millions of accurately annotated images. Collecting large-scale datasets for this kind of training is challenging, and the learning algorithms are only as good as the data they train on. Training annotations are often obtained by taking the majority label from independent crowdsourced workers using platforms such as Amazon Mechanical Turk. However, the accuracy of the resulting annotations can vary, with the hardest-to-annotate samples having prohibitively low accuracy. Our insight is that in cases where independent worker annotations are poor, more accurate results can be obtained by having workers collaborate. This paper introduces consensus agreement games, a novel method for assigning annotations to images by the agreement of multiple consensuses of small cliques of workers. We demonstrate that this approach reduces error by 37.8% on two different datasets at a cost of $0.10 or $0.17 per annotation. The higher cost is justified because our method does not need to be run on the entire dataset. Ultimately, our method enables us to more accurately annotate images and build more challenging training datasets for learning algorithms.
BibTeX
@InProceedings{upchurch2016interactive,
author = {Paul Upchurch and Daniel Sedra and Andrew Mullen and Haym Hirsh and Kavita Bala},
title = {Interactive Consensus Agreement Games For Labeling Images},
booktitle = {The AAAI Conference on Human Computation and Crowdsourcing (HCOMP)},
month = {October},
year = {2016}
}
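The acceptance rule described in the abstract (assign a label only when multiple small cliques of workers reach agreeing consensuses) can be sketched in a few lines. This is a hedged illustration of that aggregation logic only, not the interactive game mechanics from the paper; the helper names and the unanimity threshold are illustrative.

from collections import Counter

def clique_consensus(labels, min_agreement=1.0):
    """Consensus of one clique of workers: the label a required fraction of
    members agree on (default: all of them), else None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= min_agreement else None

def consensus_agreement(cliques):
    """Accept an annotation only when every clique reaches a consensus and all
    consensuses agree; otherwise the image is left unlabeled (or re-queued)."""
    consensuses = [clique_consensus(c) for c in cliques]
    if None in consensuses or len(set(consensuses)) != 1:
        return None
    return consensuses[0]

# Example: two cliques of three workers labeling one image.
print(consensus_agreement([["wood", "wood", "wood"], ["wood", "wood", "wood"]]))   # "wood"
print(consensus_agreement([["wood", "wood", "wood"], ["stone", "wood", "wood"]]))  # None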
Material Recognition in the Wild with the Materials in Context Database
The IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015)
*Authors contributed equally
[ Webpage, PDF, Code & Data ]
Abstract
Recognizing materials in real-world images is a challenging task. Real-world materials have rich surface texture, geometry, lighting conditions, and clutter, which combine to make the problem particularly difficult. In this paper, we introduce a new, large-scale, open dataset of materials in the wild, the Materials in Context Database (MINC), and combine this dataset with deep learning to achieve material recognition and segmentation of images in the wild. MINC is an order of magnitude larger than previous material databases, while being more diverse and well-sampled across its 23 categories. Using MINC, we train convolutional neural networks (CNNs) for two tasks: classifying materials from patches, and simultaneous material recognition and segmentation in full images. For patch-based classification on MINC we found that the best performing CNN architectures can achieve 85.2% mean class accuracy. We convert these trained CNN classifiers into an efficient fully convolutional framework combined with a fully connected conditional random field (CRF) to predict the material at every pixel in an image, achieving 73.1% mean class accuracy. Our experiments demonstrate that having a large, well-sampled dataset such as MINC is crucial for real-world material recognition and segmentation.
BibTeX
@article{bell15minc,
author = "Sean Bell and Paul Upchurch and Noah Snavely and Kavita Bala",
title = "Material Recognition in the Wild with the Materials in Context Database",
journal = "Computer Vision and Pattern Recognition (CVPR)",
year = "2015",
}
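As a rough illustration of how a patch classifier becomes a dense predictor, the sketch below slides a material patch classifier over an image to produce a per-pixel probability map. It is a hedged simplification: the network, patch size, and stride are placeholders, and the paper's fully convolutional evaluation and dense-CRF refinement are not reproduced here.

import torch
import torch.nn.functional as F

NUM_MATERIALS = 23  # MINC has 23 material categories

def dense_material_probabilities(patch_cnn, image, patch=224, stride=32):
    """Apply a patch classifier at every stride to get a coarse probability map.
    patch_cnn: any network mapping (B,3,patch,patch) -> (B,NUM_MATERIALS) logits.
    image:     (3,H,W) tensor, already normalized for the classifier."""
    _, H, W = image.shape
    probs = []
    for y in range(0, H - patch + 1, stride):
        row = []
        for x in range(0, W - patch + 1, stride):
            logits = patch_cnn(image[:, y:y+patch, x:x+patch].unsqueeze(0))
            row.append(F.softmax(logits, dim=1))
        probs.append(torch.cat(row, dim=0))
    prob_map = torch.stack(probs)            # (rows, cols, NUM_MATERIALS)
    # Upsample the coarse map back to full resolution; the paper refines this
    # with a fully connected CRF, which is not shown here.
    return F.interpolate(prob_map.permute(2, 0, 1)[None], size=(H, W),
                         mode="bilinear", align_corners=False)[0]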
Reasoning about Photo Collections using Models of Outdoor Illumination
British Machine Vision Conference (BMVC 2014)
[ Webpage, PDF ]
Abstract
Natural illumination from the sun and sky plays a significant role in the appearance of outdoor scenes. We propose the use of sophisticated outdoor illumination models, developed in the computer graphics community, for estimating appearance and timestamps from a large set of uncalibrated images of an outdoor scene. We first present an analysis of the relationship between these illumination models and the geolocation, time, surface orientation, and local visibility at a scene point. We then use this relationship to devise a data-driven method for estimating per-point albedo and local visibility information from a set of Internet photos taken under varying, unknown illuminations. Our approach significantly extends prior work on appearance estimation to work with sun-sky models, and enables new applications, such as computing timestamps for individual photos using shading information.
BibTeX
@inproceedings{hauagge_bmvc2014_outdoor,
author = "Daniel Hauagge and Scott Wehrwein and Paul Upchurch and Kavita Bala and Noah Snavely",
title = "Reasoning about Photo Collections using Models of Outdoor Illumination",
booktitle = "Proceedings of BMVC",
year = "2014",
}
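The sketch below is a toy stand-in for the kind of per-point estimation the abstract describes: given one scene point observed in several photos with known sun directions, fit an albedo and a crude sun-visibility factor by least squares under a Lambertian sun-plus-constant-sky approximation. The model, variable names, and the fixed sky term are assumptions for illustration; the paper uses full sun-sky illumination models.

import numpy as np

def estimate_albedo_and_visibility(intensities, sun_dirs, normal, sky=0.2):
    """Fit observations I_i ~= albedo * (v * max(0, n.s_i) + sky), where v in
    [0,1] is a crude sun-visibility factor.  A toy stand-in for the model-based
    estimation in the paper.
    intensities: (N,) observed brightness of one scene point across photos
    sun_dirs:    (N,3) unit sun direction per photo
    normal:      (3,) unit surface normal at the point"""
    shading = np.maximum(0.0, sun_dirs @ normal)             # (N,)
    # Columns: [albedo*v (sun term), albedo (sky term scaled by the constant sky)]
    A = np.stack([shading, np.full_like(shading, sky)], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, intensities, rcond=None)
    albedo = coeffs[1]
    visibility = np.clip(coeffs[0] / max(albedo, 1e-8), 0.0, 1.0)
    return albedo, visibility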
OpenSurfaces: A Richly Annotated Catalog of Surface Appearance
ACM Transactions on Graphics (SIGGRAPH 2013)
[ Webpage, PDF, Code & Data ]
Abstract
The appearance of surfaces in real-world scenes is determined by the materials, textures, and context in which the surfaces appear. However, the datasets we have for visualizing and modeling rich surface appearance in context, in applications such as home remodeling, are quite limited. To help address this need, we present OpenSurfaces, a rich, labeled database consisting of thousands of examples of surfaces segmented from consumer photographs of interiors, and annotated with material parameters (reflectance, material names), texture information (surface normals, rectified textures), and contextual information (scene category, and object names). Retrieving usable surface information from uncalibrated Internet photo collections is challenging. We use human annotations and present a new methodology for segmenting and annotating materials in Internet photo collections suitable for crowdsourcing (e.g., through Amazon’s Mechanical Turk). Because of the noise and variability inherent in Internet photos and novice annotators, designing this annotation engine was a key challenge; we present a multi-stage set of annotation tasks with quality checks and validation. We demonstrate the use of this database in proof-of-concept applications including surface retexturing and material and image browsing, and discuss future uses. OpenSurfaces is a public resource available at http://opensurfaces.cs.cornell.edu/.
BibTeX
@article{bell13opensurfaces,
author = "Sean Bell and Paul Upchurch and Noah Snavely and Kavita Bala",
title = "Open{S}urfaces: A Richly Annotated Catalog of Surface Appearance",
journal = "ACM Trans. on Graphics (SIGGRAPH)",
volume = "32",
number = "4",
year = "2013",
}
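Below is a hedged sketch of a multi-stage, quality-checked annotation pipeline of the kind the abstract describes. The stage names, record fields, and vote thresholds are illustrative placeholders, not the actual OpenSurfaces implementation.

from dataclasses import dataclass, field

@dataclass
class SurfaceRecord:
    """One annotated surface as it moves through the pipeline (illustrative fields)."""
    photo_id: int
    polygon: list = field(default_factory=list)   # segmentation vertices
    material_name: str | None = None
    reflectance: dict | None = None               # e.g., fitted reflectance parameters
    scene_category: str | None = None

# Each stage is (task_name, validation_rule); an item only advances once enough
# independent workers have passed the validation vote for the previous stage.
PIPELINE = [
    ("segment_surface", lambda r: len(r.polygon) >= 3),
    ("name_material",   lambda r: r.material_name is not None),
    ("fit_reflectance", lambda r: r.reflectance is not None),
    ("label_scene",     lambda r: r.scene_category is not None),
]

def next_task(record: SurfaceRecord, votes: dict[str, int], min_votes: int = 3) -> str:
    """Return the first stage whose output is still missing or under-voted."""
    for task, ok in PIPELINE:
        if not ok(record) or votes.get(task, 0) < min_votes:
            return task
    return "done"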
Tightening the Precision of Perspective Rendering
Journal of Graphics Tools (JGT 2012)
[ PDF ]
Abstract
Precise depth calculation is of crucial importance in graphics rendering. Improving precision raises the quality of all downstream graphical techniques that rely on computed depth (e.g., depth buffers, soft and hard shadow maps, screen space ambient occlusion, and 3D stereo projection). In addition, the domain of correctly renderable scenes is expanded by allowing larger far-to-near plane ratios and smaller depth separation between mesh elements. Depth precision is an ongoing problem as visible artifacts continue to plague applications from interactive games to scientific visualizations despite advances in graphics hardware. In this paper we present and analyze two methods that greatly impact visual quality by automatically improving the precision of depth values calculated in a standard perspective divide rendering system such as OpenGL or DirectX. The methods are easy to implement and compatible with 1/Z depth value calculations. The analysis can be applied to any depth projection based on the method of homogeneous coordinates.
BibTeX
@article{upchurch2012tightening,
author = {Paul Upchurch and Mathieu Desbrun},
title = {Tightening the Precision of Perspective Rendering},
journal = {Journal of Graphics Tools},
volume = {16},
number = {1},
pages = {40--56},
year = {2012}
}
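For context, the snippet below demonstrates the precision problem the paper addresses (not its solutions): with the standard perspective 1/Z depth mapping and a large far-to-near ratio, two distinct distant surfaces can map to the same float32 depth value, which is the source of z-fighting artifacts.

import numpy as np

def ndc_depth(z_eye, near, far):
    """Standard OpenGL-style perspective depth:
    z_ndc = (f+n)/(f-n) + 2fn/((f-n)*z_eye),
    with z_eye negative in front of the camera; maps z_eye in [-near, -far] to [-1, 1]."""
    return (far + near) / (far - near) + 2.0 * far * near / ((far - near) * z_eye)

near, far = 0.1, 10000.0      # large far-to-near ratio, as discussed in the paper
z_a, z_b = -5000.0, -5000.5   # two surfaces 0.5 units apart, far from the camera

d_a = np.float32(ndc_depth(z_a, near, far))
d_b = np.float32(ndc_depth(z_b, near, far))
print(d_a, d_b, d_a == d_b)   # at float32 precision the two depths can collide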
Deep Manifold Traversal: Changing Labels with Convolutional Features
arXiv 2016
*Authors contributed equally
[ PDF ]
Abstract
Many tasks in computer vision can be cast as a "label changing" problem, where the goal is to make a semantic change to the appearance of an image or some subject in an image in order to alter the class membership. Although successful task-specific methods have been developed for some label changing applications, to date no general purpose method exists. Motivated by this, we propose deep manifold traversal, a method that addresses the problem in its most general form: it first approximates the manifold of natural images, then morphs a test image along a traversal path away from a source class and towards a target class while staying near the manifold throughout. The resulting algorithm is surprisingly effective and versatile. It is completely data-driven, requiring only an example set of images from the desired source and target domains. We demonstrate deep manifold traversal on highly diverse label changing tasks: changing an individual's appearance (age and hair color), changing the season of an outdoor image, and transforming a city skyline towards nighttime.
BibTeX
@article{gardner2015deep,
title={Deep manifold traversal: Changing labels with convolutional features},
author={Gardner, Jacob R and Upchurch, Paul and Kusner, Matt J and Li, Yixuan and Weinberger, Kilian Q and Bala, Kavita and Hopcroft, John E},
journal={arXiv preprint arXiv:1511.06421},
year={2015}
}
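A toy, hedged sketch of the general idea stated in the abstract: optimize a test image so its deep features move toward the target domain and away from the source domain while staying close to the original image. The losses below are simplified placeholders and do not reproduce the paper's formulation (which keeps the traversal near the natural-image manifold); phi is any differentiable feature extractor supplied by the caller.

import torch

def traverse(x0, phi, source_feats, target_feats,
             steps=200, lr=0.05, lam=0.5, anchor=1e-3):
    """Toy feature-space traversal (NOT the paper's formulation): push the image's
    deep features toward the target-set mean and away from the source-set mean,
    with an L2 anchor to the original pixels as a crude stay-near-the-data term.
    x0: (1,3,H,W) image tensor; source_feats/target_feats: (N,D) feature tensors."""
    mu_src = source_feats.mean(0).detach()
    mu_tgt = target_feats.mean(0).detach()
    x = x0.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        f = phi(x)
        loss = ((f - mu_tgt) ** 2).sum() - lam * ((f - mu_src) ** 2).sum() \
               + anchor * ((x - x0) ** 2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return x.detach()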
From A to Z: Supervised Transfer of Style and Content Using Deep Neural Network Generators
arXiv 2016
[ PDF ]
Abstract
We propose a new neural network architecture for solving single-image analogies - the generation of an entire set of stylistically similar images from just a single input image. Solving this problem requires separating image style from content. Our network is a modified variational autoencoder (VAE) that supports supervised training of single-image analogies and in-network evaluation of outputs with a structured similarity objective that captures pixel covariances. On the challenging task of generating a 62-letter font from a single example letter we produce images with 22.4% lower dissimilarity to the ground truth than state-of-the-art.
BibTeX
@article{upchurch2016z,
title={From A to Z: supervised transfer of style and content using deep neural network generators},
author={Upchurch, Paul and Snavely, Noah and Bala, Kavita},
journal={arXiv preprint arXiv:1603.02003},
year={2016}
}
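A hedged sketch of a conditional VAE that separates a style latent from a one-hot content code (the letter identity), in the spirit of the abstract. The architecture and sizes are placeholders, and plain MSE stands in for the paper's structured-similarity objective.

import torch
import torch.nn as nn

class StyleContentVAE(nn.Module):
    """Toy conditional VAE: encode style z from an image, decode from (z, content),
    where content is a one-hot letter label.  Sizes are placeholders."""
    def __init__(self, img_dim=64*64, content_dim=62, z_dim=32, hidden=512):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(img_dim, hidden), nn.ReLU())
        self.to_mu = nn.Linear(hidden, z_dim)
        self.to_logvar = nn.Linear(hidden, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim + content_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, img_dim), nn.Sigmoid())

    def forward(self, x, content_onehot):
        h = self.enc(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization
        recon = self.dec(torch.cat([z, content_onehot], dim=1))
        return recon, mu, logvar

def vae_loss(recon, x, mu, logvar):
    # MSE stands in for the structured-similarity objective used in the paper.
    rec = ((recon - x) ** 2).sum(dim=1).mean()
    kld = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
    return rec + kld

# Single-image analogy use (sketch): encode the style of one example letter, then
# decode that style with each of the 62 content one-hots to generate the full font.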