UprightNet: Geometry-Aware Camera Orientation Estimation from Single Images
Wenqi Xian*1, Zhengqi Li*1, Matthew Fisher2, Jonathan Eisenmann2, Eli Shechtman2, Noah Snavely1
In ICCV, 2019
Abstract
We introduce UprightNet, a learning-based approach for estimating 2DoF camera orientation from a single RGB image of an indoor scene. Unlike recent methods that leverage deep learning to perform black-box regression from image to orientation parameters, we propose an end-to-end framework that incorporates explicit geometric reasoning. In particular, we design a network that predicts two representations of scene geometry, in both the local camera and global reference coordinate systems, and solves for the camera orientation as the rotation that best aligns these two predictions via a differentiable least squares module. This network can be trained end-to-end, and can be supervised with both ground truth camera poses and intermediate representations of surface geometry. We evaluate UprightNet on the single-image camera orientation task on synthetic and real datasets, and show significant improvements over prior state-of-the-art approaches.
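The alignment step described above is an instance of the weighted orthogonal Procrustes problem: given per-pixel geometry vectors predicted in camera coordinates and the same quantities in the global (upright) frame, find the rotation that best maps one set onto the other in a least-squares sense. Below is a minimal NumPy sketch of the classical SVD-based (Kabsch) solution; the function name and uniform-weight default are illustrative, and this is not the authors' differentiable PyTorch implementation.

```python
import numpy as np

def best_fit_rotation(cam_vecs, global_vecs, weights=None):
    """Solve the weighted orthogonal Procrustes problem:
    find rotation R minimizing sum_i w_i * ||R a_i - b_i||^2,
    where a_i are camera-frame vectors and b_i global-frame vectors."""
    A = np.asarray(cam_vecs, dtype=float)     # (N, 3) camera-frame predictions
    B = np.asarray(global_vecs, dtype=float)  # (N, 3) global-frame predictions
    if weights is None:
        weights = np.ones(len(A))
    W = np.asarray(weights, dtype=float)[:, None]
    # Weighted cross-covariance between the two point sets
    H = (W * A).T @ B                         # (3, 3)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) in the optimal orthogonal map
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T                     # R such that R @ a_i ≈ b_i
```

In the paper this solve is wrapped in a differentiable module so gradients flow back into the geometry predictions; a plain SVD call like the one above is the non-differentiable analogue of that step.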
[Paper] [Supplemental]
@inproceedings{xian2019uprightnet,
  title={UprightNet: Geometry-Aware Camera Orientation Estimation from Single Images},
  author={Xian, Wenqi and Li, Zhengqi and Fisher, Matthew and Eisenmann, Jonathan and Shechtman, Eli and Snavely, Noah},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={9974--9983},
  year={2019}
}
Dataset and Training/Evaluation Code
Acknowledgements
We thank Geoffrey Oxholm, Qianqian Wang, and Kai Zhang for helpful discussion and comments. This work was funded in part by the National Science Foundation (grant IIS-1149393), and by the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program.