-------------------------------------------------------------------------------
  Bundler v0.4 User's Manual
  copyright 2008-2009 Noah Snavely (snavely@cs.cornell.edu)

  based on the Photo Tourism work of Noah Snavely, Steven M. Seitz
  (University of Washington) and Richard Szeliski (Microsoft Research)

  For more technical details, visit http://phototour.cs.washington.edu
-------------------------------------------------------------------------------

Table of Contents

  I.    What is Bundler?
  II.   Conditions of use
  III.  What's included in this distribution
  IV.   Before you begin
  V.    Running bundler
  VI.   Output format and scene representation
  VII.  Command-line options
  VIII. Acknowledgements
  IX.   Contact information
I. What is Bundler?

Bundler is a structure-from-motion system for unordered image collections (for instance, images from the Internet). Bundler takes a set of images, image features, and image matches as input, and produces a 3D reconstruction of the camera and (sparse) scene geometry as output. The system, described in [1] and [2], reconstructs the scene incrementally, a few images at a time, using a modified version of the Sparse Bundle Adjustment package of Lourakis and Argyros [3] as the underlying optimization engine.

Currently, Bundler has primarily been compiled and tested under Linux (though Windows versions for cygwin and Visual Studio 2005 have also been released).
II. Conditions of use

Bundler is distributed under the GNU General Public License. For information on commercial licensing of this software, please contact the authors at the address given below.
III. What's included in this distribution

Included with the binary distribution is the Bundler executable (bin/bundler), as well as a number of other utility scripts and executables (in the bin/ directory). In addition, there are a number of example image sets (and example results) under the examples/ directory. A version of the approximate nearest neighbors (ANN) library of David M. Mount and Sunil Arya, customized for searching vectors of unsigned bytes, is also included.

A utility program called Bundle2PMVS, for converting bundle files (.out) to the input required by Dr. Yasutaka Furukawa's PMVS multi-view stereo system, is also included. This distribution also includes a program called RadialUndistort for generating undistorted images (based on the undistortion parameters estimated by Bundler). Finally, included in the bin directory is the 'jhead' program for reading Exif tags from JPEG images. Very special thanks to Matthias Wandel for putting this useful program in the public domain.
IV. Before you begin

You'll first need to download the Bundler distribution from:

    http://phototour.cs.washington.edu/bundler/

(the binary distribution is highly recommended) and extract it into a directory (to be referred to as BASE_PATH). You'll also need ImageMagick installed on your system (for converting jpg files to pgm format, required for David Lowe's SIFT binary). In addition, you'll need a feature detector to get the system working. Assuming you will be using SIFT features generated by David Lowe's SIFT binary, you'll need to get SIFT from http://www.cs.ubc.ca/~lowe/keypoints/ and copy it to BASE_PATH/bin (making sure it is called 'sift', or 'siftWin32.exe' under Windows). The RunBundler.sh script relies on bash and perl being installed; the easiest way to run this script in Windows is through cygwin. Finally, copy the approximate nearest neighbors (ANN) shared library at BASE_PATH/lib/libANN_char.so to a location in your LD_LIBRARY_PATH (or add BASE_PATH/lib to LD_LIBRARY_PATH).
V. Running bundler

The easiest way to start using Bundler is to use the included bash shell script, RunBundler.sh. Simply execute this script in a directory with a set of images in JPEG format, and it will automatically run all the steps needed to run structure from motion on the images (assuming everything goes well). You'll first need to edit this script to set the BASE_PATH variable appropriately (you'll also need to set the BASE_PATH variable in the Perl script BASE_PATH/bin/extract_focal.pl and the bash script BASE_PATH/bin/ToSift.sh). To test this script, try running it from one of the example directories (e.g., examples/ET/before).

The 'bundler' executable is actually the last in a sequence of steps that need to be run to reconstruct a scene. RunBundler.sh takes care of all these steps for you, but it's useful to know what's going on. The main initial steps are to generate features and pairwise feature matches for the image set. Any type of image features can be used, but Bundler assumes the features are in the SIFT format, and so David Lowe's SIFT detector (available at http://www.cs.ubc.ca/~lowe/keypoints/) is the easiest to get working with Bundler (RunBundler.sh assumes that SIFT is used). A list of images containing estimated focal length information also must be created. The four steps to creating a reconstruction are therefore:

    1. Create a list of images (with estimated focal length information),
       e.g. using the 'extract_focal.pl' utility.
    2. Generate features for each image (RunBundler.sh uses David Lowe's
       SIFT binary).
    3. Match features between each pair of images.
    4. Run 'bundler' on the resulting image list and matches.
Bundler itself is typically invoked as follows:

    > bundler list.txt --options_file options.txt

The first argument is the list of images to be reconstructed (created with the 'extract_focal.pl' utility). Next, an options file containing settings to be used for the current run is given. RunBundler.sh creates an options file that will work in many situations. Common options are described later in this document.
VI. Output format and scene representation

Bundler produces files typically called 'bundle_*.out' (we'll call these "bundle files"). With the default commands, Bundler outputs a bundle file called 'bundle_<n>.out' containing the current state of the scene after each set of images has been registered (n = the number of currently registered cameras). After all possible images have been registered, Bundler outputs a final file named 'bundle.out'. In addition, a "ply" file containing the reconstructed cameras and points is written after each round. These ply files can be viewed with the "scanalyze" mesh viewer, available at http://graphics.stanford.edu/software/scanalyze/. There are several other viewers that can also read ply files (as scanalyze can sometimes be difficult to compile under Linux), including Meshlab and Blender (in Blender, use File->Import->PLY to open a ply file---thanks to Ricardo Fabbri for the tip).

The bundle files contain the estimated scene and camera geometry, and have the following format:
    # Bundle file v0.3
    <num_cameras> <num_points>          [two integers]
    <camera1> <camera2> ... <cameraN>
    <point1> <point2> ... <pointM>

Each camera entry <cameraI> contains the estimated camera intrinsics and extrinsics, and has the form:

    <f> <k1> <k2>   [the focal length, followed by two radial distortion coeffs]
    <R>             [a 3x3 matrix representing the camera rotation]
    <t>             [a 3-vector describing the camera translation]

The cameras are specified in the order they appear in the list of images.
Each point entry <pointI> has the form:

    <position>      [a 3-vector describing the 3D position of the point]
    <color>         [a 3-vector describing the RGB color of the point]
    <view list>     [a list of views the point is visible in]

The view list begins with the length of the list (i.e., the number of cameras the point is visible in). The list is then given as a list of quadruplets <camera> <key> <x> <y>, where <camera> is a camera index, <key> the index of the SIFT keypoint where the point was detected in that camera, and <x> and <y> are the detected positions of that keypoint. Both indices are 0-based (e.g., if camera 0 appears in the list, this corresponds to the first camera in the scene file and the first image in "list.txt"). The pixel positions are floating point numbers in a coordinate system where the origin is the center of the image, the x-axis increases to the right, and the y-axis increases towards the top of the image. Thus, (-w/2, -h/2) is the lower-left corner of the image, and (w/2, h/2) is the top-right corner (where w and h are the width and height of the image).
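As an illustration of the layout above, here is a minimal Python sketch of a bundle-file reader. This helper is not part of the Bundler distribution, and it assumes each point entry occupies exactly three lines (position, color, view list):

```python
# Minimal bundle (.out) reader -- an illustrative sketch, not part of
# the Bundler distribution.  Assumes the v0.3 layout described above:
# a comment line, the camera/point counts, the camera entries, then the
# point entries (position line, color line, view-list line).

def read_bundle_file(path):
    with open(path) as fh:
        fh.readline()                                  # "# Bundle file v0.3"
        num_cameras, num_points = map(int, fh.readline().split())

        cameras = []
        for _ in range(num_cameras):
            f, k1, k2 = map(float, fh.readline().split())
            R = [list(map(float, fh.readline().split())) for _ in range(3)]
            t = list(map(float, fh.readline().split()))
            cameras.append({"f": f, "k1": k1, "k2": k2, "R": R, "t": t})

        points = []
        for _ in range(num_points):
            position = list(map(float, fh.readline().split()))
            color = list(map(int, fh.readline().split()))
            view = fh.readline().split()
            n = int(view[0])                           # length of the view list
            views = [(int(view[1 + 4 * i]),            # camera index
                      int(view[2 + 4 * i]),            # SIFT keypoint index
                      float(view[3 + 4 * i]),          # x (origin = image center)
                      float(view[4 + 4 * i]))          # y
                     for i in range(n)]
            points.append({"position": position, "color": color, "views": views})

    return cameras, points
```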
We use a pinhole camera model; the parameters we estimate for each camera are a focal length (f), two radial distortion parameters (k1 and k2), a rotation (R), and a translation (t), as described in the file specification above. The formula for projecting a 3D point X into a camera (R, t, f) is:

    P = R * X + t       (conversion from world to camera coordinates)
    p = -P / P.z        (perspective division)
    p' = f * r(p) * p   (conversion to pixel coordinates)

where P.z is the third (z) coordinate of P. In the last equation, r(p) is a function that computes a scaling factor to undo the radial distortion:

    r(p) = 1.0 + k1 * ||p||^2 + k2 * ||p||^4
This gives a projection in pixels, where the origin of the image is the center of the image, the positive x-axis points right, and the positive y-axis points up (in addition, in the camera coordinate system, the positive z-axis points backwards, so the camera is looking down the negative z-axis, as in OpenGL).
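The projection can be written out in a few lines of Python using NumPy. This is an illustrative sketch, not code from the Bundler distribution; the function name `project` is ours:

```python
import numpy as np

def project(X, R, t, f, k1, k2):
    """Project a 3D point X into a camera (R, t, f) with radial
    distortion coefficients k1, k2 (illustrative sketch)."""
    P = R @ X + t                 # world -> camera coordinates
    p = -P[:2] / P[2]             # perspective division; the minus sign is
                                  # because the camera looks down the -z axis
    rho2 = p[0] ** 2 + p[1] ** 2  # ||p||^2
    r = 1.0 + k1 * rho2 + k2 * rho2 ** 2   # radial distortion factor r(p)
    return f * r * p              # pixel coords, origin at the image center
```

For example, with R the identity, t = 0, f = 100, and no distortion, the point X = (1, 2, -5) projects to (20, 40).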
Finally, the equations above imply that the camera viewing direction is:

    -R' * [0 0 1]'

and the 3D position of a camera is

    -R' * t

(where ' indicates the transpose of a matrix or vector).
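These two quantities can be computed directly from a camera entry in NumPy (again an illustrative sketch, not part of the distribution):

```python
import numpy as np

def camera_pose(R, t):
    """Viewing direction and 3D position of a camera (R, t)."""
    view_dir = -R.T @ np.array([0.0, 0.0, 1.0])   # -R' * [0 0 1]'
    center = -R.T @ t                             # -R' * t
    return view_dir, center
```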
VII. Command-line options

Other options. There are a number of other useful options in addition to the default ones listed above, including:
VIII. Acknowledgements

Thanks to Manolis Lourakis and Antonis Argyros for their sparse bundle adjustment package, to David Lowe for SIFT, to David M. Mount and Sunil Arya for their approximate nearest neighbors library, and to Matthias Wandel for his excellent 'jhead' program. Special thanks as well to Kathleen Tuite and Sebastian Koch for testing this distribution.
IX. Contact information

Please direct any questions or comments about Bundler to Noah Snavely (snavely@cs.cornell.edu).