Chun-Po Wang, Kyle Wilson, Noah Snavely
Cornell University
The Internet contains a wealth of rich geographic information about our world, including 3D models, street maps, and many other data sources. This information is potentially useful for computer vision applications, such as scene understanding for outdoor Internet photos. However, leveraging this data for vision applications requires precisely aligning input photographs, taken in the wild, within a geographic coordinate frame by estimating each photo's camera position, orientation, and focal length. To address this problem, we propose a system for aligning 3D structure-from-motion point clouds, produced from Internet imagery, to existing geographic information sources, including Google Street View photos and Google Earth 3D models. We show that our method produces accurate alignments between these data sources, so that geographic data can be accurately projected into images gathered from the Internet, in effect “Googling” a depth map for an image using sources such as Google Earth.
Paper (PDF, 6.7MB)
Supplementary material (PDF, 21MB)
This work was supported by the NSF under grants IIS-1149393 and IIS-1111534, and by Intel Corporation, Google, and Microsoft. We thank Kevin Matzen for his assistance with the experiments in this paper.