Bruce Randall Donald
Associate Professor
brd@cs.cornell.edu
Graphics/Multimedia Research and Physical Geometric Algorithms
This document outlines a vision for Graphics/Multimedia research, and
in particular tries to define what the science base for this research
would be. The latter, I believe, is necessary in order to prescribe
the role that Multimedia should have in a Computer Science department.
In particular, the impact of Multimedia on communication and society
in the future is, by itself, not sufficient justification for
organizing this activity in CS. To see this, consider the analogy of
television. While television reaches an enormous number of people, we
do not have professors of television. Instead, we must argue that
there is something about Multimedia which especially suits our
algorithmic and software design methodologies. In particular, I claim
that Multimedia research provides a wealth of geometric and
algorithmic problems in the domain of physical geometric algorithms, which is
our primary research area.
This research agenda, while broad, is targeted to a particular subset
of Multimedia, and naturally many interesting topics exist outside
this framework.
What is to be Done?
Our research in Multimedia concentrates in several areas:
- Authoring tools. Multimedia content can
currently be played by millions but authored by few. We are
working to develop authoring tools, particularly for animation, that
will greatly expand the authoring population.
- Direct manipulation. Multimedia content
should be editable and extensible through intuitive user interfaces,
such as directly tugging on a graphic or animation through a
graphical input device.
- Haptics. It should be possible to provide
haptic interfaces for direct manipulation of computer graphics, and
to manipulate/navigate through animations, sound spaces, the Web, and
authoring environments. Haptics should be explored both as a novel
input device, and as a method for expressing content (shape, texture,
and design choices).
- Interoperability. Multimedia content should be
interoperable: it should be possible, for example, to define
clip-characters and clip-motions, and paste one onto another.
Characters and motions should be reusable and editable.
- Seamless interfaces. Multimedia systems should provide a
seamless, orthogonal interface to graphics, animation, audio,
haptics, and virtual reality.
- Active objects. The user should experience Multimedia
content using active objects, including servoable cameras and
sensors, which can also capture data from the user to influence
the Multimedia presentation (e.g., to control an avatar). Multimedia
playback should be an immersive, interactive experience, in which a
virtual reality system can manipulate the user's physical environment
to generate realistic effects.
Previous Work: Progress So Far
At Cornell University, our initial efforts (joint work with Jed
Lengyel, Mark Reichert, and Don Greenberg) in this area have concerned
model-based geometric algorithms for animation and Multimedia. We
first investigated methods for generating animations from very
high-level scripts. The method is applicable whenever the script has
a geometric encoding. We developed a method for
automatically rendering animation sequences of moving objects without
keyframes. The user specifies the start and goal for each object and
the motions (which may of necessity be quite complex) are synthesized
automatically. We use configuration-space (motion-planning)
algorithms to quickly synthesize object motions subject to kinematic
constraints. Motion synthesis for multiple moving objects is
possible, and the action may be synchronized to music. This work,
together with a video entitled "Enchanted Furniture," was presented
at SIGGRAPH.
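As a minimal sketch of the flavor of this computation (assuming a
single object translating on a rasterized 2D configuration-space grid;
the real system handles richer kinematic constraints and exploits
rasterizing graphics hardware, as in the SIGGRAPH '90 paper below), a
breadth-first search connects a start and goal configuration around
obstacles:

    from collections import deque

    def plan_path(cspace, start, goal):
        """Breadth-first search over a rasterized configuration space.

        cspace -- 2D list of booleans; True marks a forbidden (colliding) cell
        start, goal -- (row, col) configurations
        Returns a list of configurations from start to goal, or None.
        """
        rows, cols = len(cspace), len(cspace[0])
        parent = {start: None}
        frontier = deque([start])
        while frontier:
            cell = frontier.popleft()
            if cell == goal:
                path = []                      # walk parent pointers back
                while cell is not None:
                    path.append(cell)
                    cell = parent[cell]
                return path[::-1]
            r, c = cell
            for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                        and not cspace[nxt[0]][nxt[1]] and nxt not in parent):
                    parent[nxt] = cell
                    frontier.append(nxt)
        return None                            # goal unreachable

    # A 5x5 space with an obstacle column that has a gap at the bottom.
    space = [[False] * 5 for _ in range(5)]
    for r in range(4):
        space[r][2] = True
    print(plan_path(space, (0, 0), (4, 4)))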
This research represents the first attempt to employ motion-planning
algorithms for animation. Since our paper, this has become a fairly
visible area in animation and computer graphics. Several startup
companies, including Jean-Claude Latombe's The Motion Factory, Inc.,
base their core technologies on these concepts; there is also
significant activity in this area at some of the larger research labs
(most likely Microsoft Research).
In joint work with Amy
Briggs, we have also pursued algorithms for automatic camera
control, with applications to the generation of Multimedia content,
and to teleconferencing. We focused on the problem of controlling a
group of cameras that record an event (e.g., a lecture or a meeting) so
as to obtain a video stream documenting the event. One problem in
automatic generation of video for teleconferencing is camera placement
and control. We have developed some tools for task-directed
and visually-cued control of camera motions.
Task-directed camera control is specified as "Given a geometric
region we wish to monitor, move the camera to observe all activity in
the region." The system takes into account occlusion relationships to
guarantee partial or total visibility of the surveillance area.
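A toy version of the underlying geometric test, as a sketch only
(assuming the surveillance region is summarized by a few sample points
and the occluders by line segments; the algorithms in the papers below
reason about occlusion exactly rather than by sampling):

    def segments_cross(p1, p2, q1, q2):
        """True if segments p1-p2 and q1-q2 properly intersect (2D)."""
        def orient(a, b, c):
            return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
        d1, d2 = orient(q1, q2, p1), orient(q1, q2, p2)
        d3, d4 = orient(p1, p2, q1), orient(p1, p2, q2)
        return d1 * d2 < 0 and d3 * d4 < 0

    def sees_region(camera, region_pts, occluder_edges):
        """A camera sees the region if no occluder blocks any sightline."""
        return all(not any(segments_cross(camera, pt, a, b)
                           for a, b in occluder_edges)
                   for pt in region_pts)

    def place_camera(candidates, region_pts, occluder_edges):
        """Return the first candidate position with total visibility."""
        for cam in candidates:
            if sees_region(cam, region_pts, occluder_edges):
                return cam
        return None

    wall = [((2.0, 0.0), (2.0, 3.0))]           # one occluding wall
    region = [(4.0, 1.0), (4.0, 2.0)]           # points to monitor
    print(place_camera([(0.0, 1.5), (3.0, 5.0)], region, wall))  # -> (3.0, 5.0)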
Visually-cued camera control waits until a particular visual
event is observed before switching camera control strategies. For
example, one can program a camera to wait until a particular speaker
comes through a doorway, and then to intercept and follow (track) the
target. In particular, the event may cue a non-visual
control strategy (for example, a physical motion, virtual motion, or
a pure computation).
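A minimal sketch of this switching structure (the cues and strategies
below are hypothetical stand-ins; the actual system couples such cues
to live vision and physical camera control):

    class CuedController:
        """Run one camera strategy until the next stage's visual cue fires,
        then switch (e.g., 'watch the doorway' -> 'track the speaker')."""

        def __init__(self, stages):
            # stages: list of (entry_cue, strategy) pairs; stage 0's cue
            # is ignored since it is active from the start.
            self.stages = stages
            self.current = 0

        def step(self, obs):
            nxt = self.current + 1
            if nxt < len(self.stages) and self.stages[nxt][0](obs):
                self.current = nxt              # cue observed: switch strategy
            return self.stages[self.current][1](obs)

    # Hypothetical cues and strategies for the doorway example.
    watch_door = lambda obs: ("aim_at", obs["doorway"])
    track_speaker = lambda obs: ("track", obs["speaker"])
    at_doorway = lambda obs: obs["speaker"] == obs["doorway"]

    ctrl = CuedController([(lambda obs: True, watch_door),
                           (at_doorway, track_speaker)])
    print(ctrl.step({"doorway": (5, 0), "speaker": (9, 9)}))  # still watching
    print(ctrl.step({"doorway": (5, 0), "speaker": (5, 0)}))  # cue: now tracking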
We have developed algorithms that take steps towards solving these
problems, and that provide a framework for posing and analyzing such
control strategies. Our three papers with Amy Briggs describe a
system we built to demonstrate these concepts, together with a video
of the system in operation. This work invites discussion of online
vs. offline approaches to automatic video editing and control. That
distinction arises when considering the difference between online
transmission of a lecture and offline editing for an archival copy.
I spent 1994-97 in Silicon Valley at Interval Research Corporation
working with Tom Ngo to develop advanced pre-competitive technologies
in graphics, animation and multimedia. We hope to
collaborate with Interval in the future as well, on some of the
projects below.
Current Projects: Research Topics
This section describes a cluster of Multimedia research topics I'd
like to explore in the future, building on the work described
above.
Broadly, the goal of this research is to build reusable, recyclable,
interoperable, editable, and extensible multimedia authoring and
playback tools, specifically for computer graphics and image content.
The work would focus on the following topics:
- Converting capture data from humans (e.g., still images, video)
into computer graphics format (splines, control points, texture maps)
with auto-correspondence, to permit direct manipulation (editing),
morphing, and interpolation (see the first sketch following this
list). We are working in collaboration with Professor Ramin Zabih in
the Cornell Robotics and Vision Laboratory.
- Human capture data can be used to drive animations with
phenomenally good results. The resulting motion is often far more
realistic than that obtained by key-framing or using physically-based
simulation. In one example, not only is a Kalman filter used to
recover the motion, but the motion estimation is used to refine the
biometric model (see the second sketch following this list).
(Biometrics is a company based in Santa Clara, CA.)
- We are also exploring the use of machine vision techniques for
obtaining a model of human body motion; a student who worked with me
and Professor Ramin Zabih in the Cornell Robotics and Vision
Laboratory has written a paper on this topic.
- Haptic interfaces to animation and video. Novel methods for
direct manipulation of animations, using haptic interfaces. Using
haptics to manipulate, edit, and author animations. Editing/morphing
of bundles of trajectories.
- Real-time, parametric X, where X = graphics, 3D graphics,
texture, morphing, ...
- Design of high-dimensional splines, shapes, volumes, surfaces,
and kinematic maps.
- Topological data structures for polyhedral, simplicial, and CW
complexes, particularly in high dimensions. Tools for authoring,
editing, visualization, topological verification, and computation of
geometric and topological properties (see the last sketch following
this list).
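First, a sketch of why auto-correspondence matters for morphing: once
control points are in correspondence, interpolation between shapes is
straightforward (the shapes here are illustrative; computing the
correspondence automatically is the research problem):

    import numpy as np

    def morph(src_pts, dst_pts, t):
        """Interpolate between two shapes whose control points are already
        in correspondence (point i of src matches point i of dst)."""
        src = np.asarray(src_pts, dtype=float)
        dst = np.asarray(dst_pts, dtype=float)
        return (1.0 - t) * src + t * dst

    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    diamond = [(0.5, -0.2), (1.2, 0.5), (0.5, 1.2), (-0.2, 0.5)]
    for t in (0.0, 0.5, 1.0):                   # sweep source -> target
        print(t, morph(square, diamond, t).round(2).tolist())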
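Second, a generic sketch of motion recovery with a Kalman filter (a 1D
constant-velocity model with illustrative noise parameters, not the
biometric system itself):

    import numpy as np

    def kalman_track(measurements, dt=1.0, meas_var=1.0, accel_var=0.01):
        """Estimate position and velocity from noisy 1D position
        measurements with a constant-velocity Kalman filter."""
        F = np.array([[1.0, dt], [0.0, 1.0]])           # state transition
        H = np.array([[1.0, 0.0]])                      # observe position only
        Q = accel_var * np.array([[dt**4 / 4, dt**3 / 2],
                                  [dt**3 / 2, dt**2]])  # process noise
        R = np.array([[meas_var]])                      # measurement noise
        x = np.zeros((2, 1))                            # [position, velocity]
        P = np.eye(2)                                   # state covariance
        out = []
        for z in measurements:
            x = F @ x                                   # predict
            P = F @ P @ F.T + Q
            y = np.array([[z]]) - H @ x                 # innovation
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)              # Kalman gain
            x = x + K @ y                               # update
            P = (np.eye(2) - K @ H) @ P
            out.append((float(x[0, 0]), float(x[1, 0])))
        return out

    # Noisy samples of a point moving at roughly unit speed.
    for pos, vel in kalman_track([0.1, 0.9, 2.2, 2.8, 4.1, 5.0]):
        print(f"position {pos:5.2f}   velocity {vel:5.2f}")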
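Last, a small sketch of one such data structure and invariant (a
simplicial complex stored as face-closed sets of vertex ids, with its
Euler characteristic; polyhedral and CW complexes need richer
representations):

    from itertools import combinations

    class SimplicialComplex:
        """Store a complex as a set of simplices (frozensets of vertex
        ids), closed under taking faces."""

        def __init__(self, top_simplices):
            self.simplices = set()
            for s in top_simplices:
                s = frozenset(s)
                # Insert every nonempty face of each top-level simplex.
                for k in range(1, len(s) + 1):
                    for face in combinations(sorted(s), k):
                        self.simplices.add(frozenset(face))

        def k_simplices(self, k):
            return [s for s in self.simplices if len(s) == k + 1]

        def euler_characteristic(self):
            # chi = sum over k of (-1)^k * (number of k-simplices)
            chi, k = 0, 0
            while self.k_simplices(k):
                chi += (-1) ** k * len(self.k_simplices(k))
                k += 1
            return chi

    # The boundary of a triangle (a topological circle): 3 - 3 = 0.
    circle = SimplicialComplex([(0, 1), (1, 2), (0, 2)])
    print(circle.euler_characteristic())   # -> 0
    # A solid triangle (a disk): 3 - 3 + 1 = 1.
    disk = SimplicialComplex([(0, 1, 2)])
    print(disk.euler_characteristic())     # -> 1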
References
- Real-Time Robot Motion Planning Using Rasterizing Computer
Graphics Hardware, (with J. Lengyel, M. Reichert, and D. Greenberg),
Proc. SIGGRAPH '90, Dallas, TX (Aug 1990), pp. 327-336.
- Visibility-Based Planning of Sensor Control Strategies, (with
A. J. Briggs), submitted to Algorithmica, Special Issue on Algorithmic
Foundations of Robotics (1996).
- Robust Geometric Algorithms for Sensor Planning, (with
A. J. Briggs), Proc. International Workshop on the Algorithmic
Foundations of Robotics, Toulouse, France (1996).
- Automatic Sensor Configuration for Task-Directed Planning, (with
A. J. Briggs), Proc. 1994 IEEE International Conference on Robotics
and Automation, San Diego, CA (May 1994).
- System for Image Manipulation and Animation Using Embedded
Constraint Graphics, (with J. T. Ngo), patent application filed
August 5, 1996.