Multimedia Systems
Optimal Parallel MPEG Encoding


Jeffrey M. Moore
William L. Lee
Scott P. Dawson



MPEG encoding is currently a very computation-intensive operation. The goal of Optimal Parallel MPEG Encoding is to derive a protocol that handles this computation efficiently while taking advantage of all available resources.

Overview


We will be implementing a RIVL command called seq_write_parallel, which will encode Moving Picture Experts Group (MPEG) videos using multiple machines, in parallel, over an Ethernet network. To accomplish this, the video to be encoded must be divided at logical breaks in the video stream. These breaks will be determined by an existing scene detector that parses a video stream into smaller portions based on scene changes. Since the videos that we will be using (at first) will be videos of classroom lectures with slides, a slide change will most likely be detected as a scene change. Using the scene detector should be straightforward, as it is slated for inclusion in the next release of RIVL. The advantages of this implementation are threefold: increased processing speed, improved quality, and improved compression.
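As a rough illustration of dividing the stream at scene breaks, the sketch below turns a list of cut positions into per-scene frame ranges. The function name and the assumption that cuts arrive as frame indices are ours, for illustration only; the actual RIVL scene detector interface may differ.

```python
def split_at_cuts(total_frames, cut_frames):
    """Return (start, end) frame ranges, end exclusive, one per scene.

    cut_frames holds the index of the first frame of each new scene.
    This input format is hypothetical; the real detector may report
    scene changes differently.
    """
    # Drop cuts that fall outside the stream, remove duplicates, and sort.
    cuts = sorted({c for c in cut_frames if 0 < c < total_frames})
    # Scene boundaries are the stream start, every cut, and the stream end.
    bounds = [0] + cuts + [total_frames]
    return list(zip(bounds[:-1], bounds[1:]))
```

For a 100-frame lecture with slide changes at frames 30 and 70, this yields three scenes covering frames 0-29, 30-69, and 70-99.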

Problems


The current implementation of the seq_write_parallel function in RIVL has a few problems. First, it assumes that a shared file system is being used, which limits the function call to machines that have the file system mounted. Second, the master (controller) workstation handles none of the processing during the sequence compression. Lastly, load balancing is not implemented. Each of these shortcomings will be addressed by our implementation. This is by no means an exhaustive list of problems; we will no doubt encounter others throughout the course of the project.

Architecture


The encoding process will be handled by two kinds of programs: a daemon and a client. Both will consist of several modules, some of which have corresponding modules in the other program. Those modules that correspond to one another are asterisked in the ensuing outline.

Daemon

The daemon program is intended to be executed on all machines that will be candidates for processing.

Client

The client is intended to be executed on the controller machine. Its function will be to control and monitor the individual stages of the encoding process.

Processing Distribution


The task of selecting machines for processing is a complex one. The scene detector will parse the video into a set of scenes, each of which will need to be scheduled for processing on a different machine. Ideally, large scenes should be scheduled on machines that are either faster or lightly loaded. Smaller scenes should be scheduled on machines where the existing load is already substantial or where processing speed is slower than on other machines. Once the data is returned from the scene detector, we can then use the data in the `Build Hosts' file to pair scenes and machines together.
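One way to pair scenes and machines is a greedy heuristic: take the longest scene first and give it to whichever machine would finish it soonest. The sketch below is only an illustration of that idea, not the project's chosen algorithm. The relative speed figures are a hypothetical input (they might be derived from the `Build Hosts' data), and the sketch lets a machine receive more than one scene when scenes outnumber machines.

```python
def assign_scenes(scene_lengths, host_speeds):
    """Greedily assign scenes to hosts, longest scene first.

    scene_lengths: {scene_id: number_of_frames}
    host_speeds:   {host_name: relative_speed}, hypothetical values where
                   a larger number means a faster or less loaded machine.
    Returns {host_name: [scene_id, ...]} in assignment order.
    """
    finish_time = {host: 0.0 for host in host_speeds}
    plan = {host: [] for host in host_speeds}

    # Visit scenes from longest to shortest.
    by_length = sorted(scene_lengths.items(), key=lambda kv: kv[1], reverse=True)
    for scene, frames in by_length:
        # Pick the host whose estimated finish time would be earliest
        # if it also took this scene.
        best = min(
            host_speeds,
            key=lambda h: finish_time[h] + frames / host_speeds[h],
        )
        plan[best].append(scene)
        finish_time[best] += frames / host_speeds[best]
    return plan
```

With scenes of 900, 300, and 300 frames and two hosts of relative speed 3.0 and 1.0, the long scene lands on the fast host, and the short scenes are spread so that both hosts finish at roughly the same time.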

Mechanisms

We will be using Tcl/DP to initiate processes on client machines. We will be programming a daemon to be run on the console machine, whose function will be to make remote procedure calls to establish client/server relationships.

UNIX Sockets

One possible solution is to program UNIX sockets directly, since they are available on all platforms. Programmed carefully, this could yield substantial performance gains. However, due to the complexity of programming UNIX sockets, this solution will be placed at the bottom of the project agenda.
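To give a sense of what the socket-level approach involves, here is a minimal sketch of moving one encoded segment across a stream socket. It is written in Python rather than C, all function names are hypothetical, and a local socket pair stands in for a real network connection between two machines. The sender closes its write side when done, so the receiver simply reads until end of stream.

```python
import socket
import threading

def send_segment(sock, data):
    """Send one encoded segment, then close the write side to mark the end."""
    sock.sendall(data)
    sock.shutdown(socket.SHUT_WR)

def receive_segment(sock, chunk_size=65536):
    """Read until the peer closes its write side; return all bytes received."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)

def demo():
    """Round-trip placeholder bytes through a local socket pair."""
    worker_end, controller_end = socket.socketpair()
    payload = b"placeholder-encoded-data" * 1000  # not real MPEG data
    # Send from a separate thread so a full socket buffer cannot stall us.
    sender = threading.Thread(target=send_segment, args=(worker_end, payload))
    sender.start()
    received = receive_segment(controller_end)
    sender.join()
    worker_end.close()
    controller_end.close()
    return received == payload
```

Even this small example has to deal with partial reads, end-of-stream signalling, and buffer limits, which is the kind of complexity that a higher-level layer would otherwise hide.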

Tcl/DP

Tcl/DP offers an easy method for writing the client and server portions of this application, and the result is portable to any platform on which Tcl/DP runs. In essence, Tcl/DP allows a script on a console machine to allocate a TCP socket for client I/O. Through this socket, a client can interact with the server it is working for.

Reconstruction

Reconstruction of the encoded scenes will be handled by the controller machine. The controller will maintain a table listing each scene and the machine on which it is being processed. Reconstruction will therefore involve consulting this table when video is returned, to determine its location within the final MPEG video. On the client side, the name of the machine to which the processed video is sent back is already determined by the RPC server/client relationships established in Tcl/DP. As an example:


FIGURE 1 - Client/Server Relationship

When CLIENT X has concluded processing, the information sent back to the controller will be the encoded MPEG video. No header is needed, as the controller will be able to piece the encoded MPEG back together based on the knowledge of which machine each scene was delivered from.
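The controller's bookkeeping can be sketched as follows. This is a minimal illustration under our own assumptions: the class and method names are hypothetical, each machine is assumed to hold exactly one scene at a time, and plain byte concatenation stands in for whatever joining step the encoder output actually requires.

```python
class Reassembler:
    """Track which host holds which scene and rebuild the stream in order."""

    def __init__(self, assignments):
        # assignments: iterable of (scene_index, host_name) pairs.
        # Assumes one scene per host, so the host name identifies the scene.
        self.scene_for_host = {host: index for index, host in assignments}
        self.segments = {}

    def deliver(self, host, data):
        """Record encoded bytes returned by a host; no header is consulted."""
        self.segments[self.scene_for_host[host]] = data

    def complete(self):
        """True once every assigned scene has been returned."""
        return len(self.segments) == len(self.scene_for_host)

    def assemble(self):
        """Join the returned segments in scene order."""
        if not self.complete():
            raise ValueError("some scenes have not been returned yet")
        return b"".join(self.segments[i] for i in sorted(self.segments))
```

Segments may arrive in any order; because the table maps each host back to a scene index, the final ordering does not depend on arrival time.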

Compression


The data compression will be handled by the University of California, Berkeley MPEG video encoder. This is a freely distributable software MPEG encoder available on the Internet. Code modifications to the encoder are not necessary because our program deals only with pre- and post-encode processing.

Load Balancing


Another portion of the project will deal with balancing the workloads produced by the encoding computation. As mentioned previously, a `Build Hosts' file will be available that lists machines and their load thresholds. During encoding, if the load on any machine rises above its specified threshold, that machine will be removed from the list of available machines after its current video segment has run to completion. Lowering the priority of the encoding process is another option for reducing the amount of processing on that machine. This can all be handled by the daemon running on each machine.
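The threshold check described above might look like the following sketch. The file format shown (one host name and one load threshold per line, with `#` comments) is purely an assumption, since the layout of the `Build Hosts' file is not specified here, and the function names are hypothetical. Each daemon is assumed to report its machine's current load to the controller.

```python
def parse_build_hosts(text):
    """Parse 'hostname threshold' lines into {hostname: threshold}.

    The line format is an assumption made for this sketch.
    """
    thresholds = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and blanks
        if not line:
            continue
        host, limit = line.split()
        thresholds[host] = float(limit)
    return thresholds

def prune_overloaded(available, thresholds, current_loads):
    """Return the hosts that stay available for the next segment.

    A host is dropped when its reported load exceeds its threshold.
    Hosts with no reported load are treated as idle.
    """
    return [
        host for host in available
        if current_loads.get(host, 0.0) <= thresholds[host]
    ]
```

The controller would call prune_overloaded only between segments, so a machine that crosses its threshold still finishes the segment it is currently encoding.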

Milestones & Timeline


There are several clearly defined modules which make up this project. The completion of each of these modules constitutes the completion of a milestone. Each module is small enough to be easily handled by one or two people and will add another layer of useful functionality to the command.

Our goal is to complete one module per week per person. One extra week will be left for any clean-up, catching up, etc. This will enable us to complete the project well ahead of schedule.

Integration & Conclusion


Integration of these individual procedures should prove to be trivial. Spending a good deal of time in the design phase of the project will enable us to evaluate our implementation ideas before we actually spend time coding them. With good design, integration of the aforementioned ideas will be easily accomplished, and we can spend more time developing extensions and improvements for this project.

File last modified - 15 October 1995