REAL 5.0 User Manual S. Keshav Cornell University skeshav@cs.cornell.edu August 13th 1997
This manual provides a tutorial introduction to setting up a simulation in REAL. If you want to modify REAL, you should also read the REAL Programmers Manual.
Contents
1. Introduction
REAL is a simulator for studying the dynamic behavior of flow and congestion control schemes in packet-switched data networks. It provides users with a way of specifying such networks and observing their behavior. Source code is provided so that interested users can modify the simulator for their own purposes. In my experience, anyone who uses a simulator ends up modifying it anyway, so I expect that users are capable of reading and understanding source code. This manual and the programmers manual are only a brief sketch of what is available, and what is possible, with REAL.
The simulator takes as input a scenario which is a description of network topology, protocols, workload and control parameters. It produces as output statistics such as the number of packets sent by each source of data, the queueing delay at each queueing point, the number of dropped and retransmitted packets and other similar information.
Scenarios are described using NetLanguage, a simple ASCII representation of the network. The network is modeled as a graph, where nodes (vertices) represent either sources or sinks of data, or gateways (the term gateway is used synonymously with router, bridge and switch to mean an element that routes, buffers and schedules packets).
The interconnection between the nodes is the topology. You have to specify some network wide parameters, the transport protocol (in particular, the flow control) and the workload at each source. Finally, you must specify control parameters such as the latency and bandwidth of each communication line, the size of trunk board buffers, packet sizes etc.
The simulator files are in a directory called sim. This has four subdirectories - docs, sim, src and results. sim/docs contains online versions of the manuals. sim/sim has source code for REAL. sim/src has source code for NEST, the simulation package underlying REAL. sim/results stores the results of simulation (it is initially empty).
sim/sim in turn contains some subdirectories described by sim/MANIFEST.
3. Sources, Gateways and Sinks
In REAL, each user is modeled as a source of data regulated by a flow control protocol.
The combination of a workload and flow control protocol is implemented by a single C function. Each such function is executed in parallel by the underlying thread-based simulation package, and can be thought of as being an independently scheduled and non-preemptable entity (see NEST manuals for more details).
3.1 Source types
There are nearly 30 source types, corresponding to 30 or so transport protocol and workload types. The sources can be categorized into one of two types: flow-controlled and non-flow-controlled data sources. Flow-controlled sources use acknowledgements and timeouts to implement a reliable transport layer functionality over a lossy network. Non-flow-controlled sources generate data either from a known distribution or from a trace and do not provide a reliable transport functionality.
Except for the telnet source, most flow controlled sources assume that the users have a finite number of data packets ready to transfer, and that this data is available for transmission without any delay. So, whenever the flow control protocol allows data transmission, a packet to transmit would always be available. The telnet source assumes that users generate single packets with an exponential inter-packet sleep time.
This implements window flow control with a fixed size window. This is similar to TCP before the dynamic window modifications by Jacobson and Karels.
A telnet workload is similar to a Poisson workload, but differs from it in two ways
3.1.3 jk_reno and jk_tahoe
These sources implement TCP flow control with the window adjustment and other modifications described by V. Jacobson, M. Karels, P. Karn and C. Partridge (V. Jacobson, "Congestion Avoidance and Control", Proc. ACM SigComm '88; P. Karn, C. Partridge,"Improving Round-Trip Time Estimates in Reliable Transport Protocols", ACM Trans. on Computer Systems, V9 No. 4, November 1991).
jk_tahoe implements the flow control in 4.3BSD Tahoe. The window size can increase or decrease; a decrease shuts the window down to 1. Duplicate acks cause retransmission.
jk_reno incorporates further modifications made to TCP by V. Jacobson (as of Aug '90). When duplicate acks are seen, the window shuts down to half its previous value, not to 1. Moreover, during fast retransmission, each incoming ack increases the flow control window by one.
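The difference between the two reactions to duplicate acks can be sketched in C. This is only an illustration and not the code in jk_tahoe.c or jk_reno.c: the names `tcp_variant`, `window_after_dupacks` and `window_on_ack_during_fast_retx` are hypothetical, and the lower bound of one packet on the halved window is my assumption.

```c
#include <assert.h>

/* Hypothetical names; not taken from jk_tahoe.c or jk_reno.c. */
enum tcp_variant { VARIANT_TAHOE, VARIANT_RENO };

/* Window size (in packets) after duplicate acks have been seen. */
int window_after_dupacks(enum tcp_variant v, int cur_window)
{
    assert(cur_window >= 1);
    if (v == VARIANT_TAHOE)
        return 1;                     /* Tahoe: window shuts down to 1 */
    int half = cur_window / 2;        /* Reno: half the previous value */
    return half < 1 ? 1 : half;       /* assumption: never below 1 packet */
}

/* Reno, during fast retransmission: each incoming ack adds one. */
int window_on_ack_during_fast_retx(int cur_window)
{
    assert(cur_window >= 1);
    return cur_window + 1;
}
```

With a window of 16 packets, the sketch gives 1 for the Tahoe case and 8 for the Reno case.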
3.1.4 dec
This implements the DEC congestion-bit-based window adjustment scheme [RJ 90] (K.K. Ramakrishnan and R. Jain, "A Binary Feedback Scheme for Congestion Avoidance in Computer Networks", ACM Trans. Comp. Sys. May 1990, V8 N2:158-181).
3.1.5 pp
The simulator now has code for packet-pair flow control. The work is described partly in my Sigcomm'91 paper ("A Control theoretic approach to flow control"), and the code implements all the details described in my unpublished (and very long) paper "Packet pair flow control" as well as my recent work on retransmission described in "SMART: Retransmission: Performance with Random Losses and Overload". The pp.c source provided implements a transport layer that does both flow control and error control. Error control uses SMART retransmission and can be used without flow control. The pp.c source can also act like a 'transport layer', expecting another node to provide it with packets that it sends to the destination. When this option is set, each source requires a pair of nodes: one generates traffic from an arbitrary distribution (or reads it from a trace file), and the pp node performs error control and, optionally, flow control. This provides a very flexible environment for testing out packet-pair flow control in a wide range of scenarios.
The non-flow-controlled sources send data according to some parameterized distribution. The sources in this category are:
3.1.6 poisson
This is a Poisson source that sends packets no faster than the bandwidth of the output line. Thus, the source is clamped to that speed. This source is useful in simulating cross traffic, and in validating queueing models.
3.1.7 background
This node is used to create and remove bottlenecks. The background source sends data at a chosen rate for some period of time, then is idle for some time, and repeats this cycle. The fraction of the bottleneck to use is governed by the average bandwidth parameter in the scenario file. The source can start at any phase in the cycle, and can send data with some randomness in the sending rate. Some parameters for this source are set in the file background.c.
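The send/idle cycle with an arbitrary starting phase can be sketched as follows. This is not the code in background.c; the function name `background_is_sending` and its parameters are hypothetical, and I assume all times are given in one integer unit such as microseconds.

```c
#include <assert.h>

/* Hypothetical helper, not from background.c.
 * Returns 1 if the source is in the sending part of its cycle at time t.
 * phase is the offset into the cycle at t = 0; all times share one unit. */
int background_is_sending(long t, long on_time, long off_time, long phase)
{
    assert(t >= 0 && on_time > 0 && off_time >= 0 && phase >= 0);
    long period = on_time + off_time;
    long pos = (t + phase) % period;   /* position within the current cycle */
    return pos < on_time;
}
```

For example, with 3 units of sending and 2 units of idling, a source that starts at phase 0 sends during times 0 to 2, is idle during times 3 and 4, and starts sending again at time 5.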
3.1.7 controlled_rate
This is an on_off source that requests resources using a SETUP packet. The parameters for the source are on_time, off_time and peak_rate, described below in the `REAL parameters' section.
3.1.8 random_rate
This source sends packets so that the sending rate is no faster than some peak bandwidth, and the average sending rate is some specified average. These parameters are described through the scenario file. Random_rate sources also do call setup.
3.1.9 mmpp
This tries to emulate a Markov-modulated Poisson process data source. It is not an accurate model, but it is adequate for creating bursty cross traffic.
The table below summarizes these, and some other sources available in REAL v5 but not described above.
FILE | DESCRIPTION |
background.c | sends background traffic on-off at a nominal rate |
coder.c | MPEG coder that adapts its rate to the currently available rate (needs pp.c as transport layer) |
controlled_rate.c | sends on-off traffic and also does call setup |
dec.c | obeys DECbit flow control |
ecn_error.c | template for error control assignment |
ecn_flow.c | template for flow control assignment |
ecn_master.c | simulates shared channel for CSMA/CD (Ethernet) |
ecn_receiver.c | generic receiver for assignment 3 |
ecn_router.c | really tiny router, but it works |
ecn_sender.c | sources to test the router |
ecn_simple.c | template for simple source assignment |
ecn_slave.c | simulates Ethernet card for CSMA/CD |
ftp_vegas.c | untested code for FTP with TCP vegas flow control |
generic.c | generic transport layer with windowing, timeouts, acks |
jk_reno.c | implements flow control scheme of 4.3BSD-Reno |
jk_tahoe.c | implements flow control in 4.3BSD Tahoe |
mmpp.c | something like a Markov-modulated Poisson source |
mpeg.c | uncontrolled MPEG coder (reads values from file) |
onoff.c | ON OFF source |
onoff_closed.c | ON OFF source, off time starts when all packets generated during on time have been acked |
onoff_closed_gbn.c | same as above, but does go-back-n retransmission |
playout.c | plays MPEG video; complains if a frame misses its end-to-end delay bound |
poisson.c | Poisson source |
pp.c | packet pair flow control |
random_rate.c | sends data at random intervals, but conforms to peak/ave description |
send.c | implements packet transmission functions |
sink.c | universal receiver function |
telnet.c | flow-controlled Poisson source (with windowing) |
trace.c | reads a trace from a file and sends packets according to that description |
3.2 Gateway
The router implements several scheduling disciplines including:
3.2.1 First come first served
3.2.2 Fair queueing
3.2.3 fqbit
3.2.4 hrr
3.2.5 decbit
3.3 Sink
sink.c implements a sink that receives data from any source. If the source is a guaranteed service host, the sink only collects statistics, and does not send an acknowledgment. Otherwise, it keeps track of the highest in-sequence packet received thus far, and sends an acknowledgment with that sequence number.
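The acknowledgment rule for the sink can be sketched as below. This is not the code in sink.c: the names `sink_state` and `sink_receive` are hypothetical, and I assume that sequence numbers start at 1 and that out-of-order packets are not buffered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state, not from sink.c. */
struct sink_state {
    int highest_in_seq;   /* highest in-sequence packet seen; 0 = none yet */
};

/* Processes one arriving packet and returns the sequence number to ack.
 * Assumption: a packet that arrives out of order is not buffered, so it
 * does not advance the in-sequence point until it is received again. */
int sink_receive(struct sink_state *s, int seq)
{
    assert(s != NULL && seq >= 1);
    if (seq == s->highest_in_seq + 1)
        s->highest_in_seq = seq;
    return s->highest_in_seq;
}
```

If packets arrive in the order 1, 3, 2, 3, the acks returned are 1, 1, 2, 3. The repeated ack for packet 1 is the kind of duplicate ack that the TCP sources react to.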
player.c implements a video player. It keeps track of received frames, and complains if a frame is received after the delay bound.
I now describe NetLanguage, a user-friendly way of describing network scenarios. Refer to sim/lang/example or to the Appendix for an example.
A NetLanguage file starts with a header. The header contains identification information for your own use. It does not affect simulation in any way. This is followed by Nest parameters.
passtime
maxnodes
monitor
You now have to set REAL parameters. Here is a brief explanation of each field. Grep for the parameter in the source files to see how it is used.
ack_size
random_seed
buffer_size
telnet_pkt_size
ftp_pkt_size
ftp_window, telnet_window
decongestion_mechanism
policy
router, real_number
Used in distributed simulation. See the description in distrib/README.
end_simulation
print_interval
hrr_levels
scale_factor
Next, node and edge parameters are defined. To make it easier to declare the nodes, the first declaration in the block is the default declaration. The default defines the parameters of each node unless they are overridden by an explicit redeclaration inside some node. You can skip any entries in the defaults declaration if the simulation does not use them, and you may skip as many entries in the node declaration as you wish.
The function field is the name of the C function that should run on that node. `dest' is the ID of the sink to which data should be sent. `start_time' is the simulation time at which the node becomes active. You should set plot to true if you want the node function at that node to generate plots (described later). The other values are self explanatory, and are commented in sim/lang/example.
Edges are defined by source, destination, bandwidth, the latency in communication, and the loss rate. Losses are specified by the probability of bit corruption, probability of a loss burst, and the mean size of a loss burst. If the loss burst size isn't 1, then the length of a loss burst is chosen from an exponential distribution with mean set to the parameter 'loss burst size'. If the loss burst size is set to 1 (special case) then all losses happen as singletons.
The endpoints of an edge are specified as source->destination, and all edges are bidirectional. The bandwidth is specified in bits per second, and the delay in microseconds.
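To make the units concrete, here is a small sketch of how long one packet takes to cross an edge. The function `edge_transfer_time` is hypothetical and is not part of REAL; it assumes the packet size is given in bytes and ignores queueing and losses.

```c
#include <assert.h>

/* Hypothetical helper, not part of REAL.
 * Time in seconds for one packet to cross an edge:
 * transmission time (size / bandwidth) plus the edge latency. */
double edge_transfer_time(int pkt_bytes, double bandwidth_bps, double latency_us)
{
    assert(pkt_bytes > 0 && bandwidth_bps > 0.0 && latency_us >= 0.0);
    double transmission = (pkt_bytes * 8.0) / bandwidth_bps;  /* seconds */
    double latency = latency_us / 1e6;                        /* us -> s */
    return transmission + latency;
}
```

For a 1000-byte packet on a 1,000,000 bits per second edge with a delay of 10,000 microseconds, the transmission time is 0.008 seconds and the total is 0.018 seconds.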
4.2 A note on usage
Usually, you will not need to change the Nest parameter and function declaration sections of the language file. In most cases, you should just copy these from the file sim/lang/example.l. It is necessary to give a .l extension to a language file. It is often useful to declare default values in a file, and to include them in the language file, particularly when several simulation runs are batched. One trick is to create a .s (for small) file that #includes some default files. The `asm' script in sim/sim/scenarios will assemble a .l file from a .s file, and may be of use. Finally, the results of the simulation will be placed in a subdirectory of the current working directory.
If you plan to use the GUI, read the instructions in the RealEdit home page. Otherwise, make sure that FUNC_TABLE exists in the current directory. Then type
This runs the simulation on the language file, and optionally connects to a gui waiting on the specified port.
5.1 Simulation output
The simulator produces two kinds of output: a) time ticks and b) a simulation report. Time ticks are printed out every `print_interval' seconds of simulated time. They are just to show you that something is going on. Reports are also generated every `print_interval' seconds, and are appended to a result file. The result file resides in a directory whose name is the current working directory concatenated with the input filename without the .l. For example, if you are in directory home/sim, and you run the simulator on file input.l, then the results will be in directory home/sim/input. There are a couple of options: you can type in the input file name without the .l if you wish, and REAL will complete the file name for you (of course, the file itself must exist with the .l). The report file is named `dump'. For example, if your input is input.l, then the report will be in home/sim/input/dump. Also, if the input file is supplied through redirection, then the output file defaults to `../results/dump'.
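The naming rule for the report file can be sketched as a small C function. This is an illustration only, not the code REAL uses: the function `report_path` is hypothetical, and it covers only the case where the input file is named on the command line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of REAL.
 * Writes "<cwd>/<input without .l>/dump" into out.
 * Returns 0 on success, -1 if out is too small. */
int report_path(const char *cwd, const char *input, char *out, size_t out_size)
{
    assert(cwd != NULL && input != NULL && out != NULL);
    size_t len = strlen(input);
    if (len >= 2 && strcmp(input + len - 2, ".l") == 0)
        len -= 2;                     /* drop the .l extension if present */
    int n = snprintf(out, out_size, "%s/%.*s/dump", cwd, (int)len, input);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Called with "home/sim" and either "input.l" or "input", the sketch produces home/sim/input/dump.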
The report is in the form of a table and will look like this:
--------------------------------------------------------------------------------------
Time  #  Type     G/w    Xmit  Q'ing(min,ave,max)[#]  Drops  Retx  RTT(min,ave,max)
--------------------------------------------------------------------------------------
300   1  Generic  5(FC)  1434  0.29 0.93 1.19 [4]     0      1     0.00 1.04 1.30
      2  Generic  5(FC)  1436  0.10 0.93 1.19 [5]     0      0     0.00 1.04 1.30
      3  Telnet   5(FC)  63    0.20 0.99 1.25 [0]     0      1     0.00 1.10 1.36
      4  Telnet   5(FC)  67    0.00 0.97 1.16 [0]     0      1     0.00 1.08 1.27
600   1  Generic  5(FC)  1443  0.89 0.93 1.19 [5]     0      0     1.00 1.04 1.30
      2  Generic  5(FC)  1441  0.89 0.93 1.19 [4]     0      0     1.00 1.04 1.30
      3  Telnet   5(FC)  55    0.89 0.97 1.12 [0]     0      0     1.00 1.08 1.23
      4  Telnet   5(FC)  61    0.89 0.98 1.18 [0]     0      0     1.00 1.09 1.29
Summary
Node  G/w  T'put(mean,var)   Q'ing(mean,var)  RTT(mean,var)  Drops(mean,var)  Retxs(mean, var)
1     5    (1437.17 4.32 )   (0.93 0.00 )     (1.04 0.00 )   (0.00 0.00 )     (0.00 0.00 )
2     5    (1437.33 4.37 )   (0.93 0.00 )     (1.04 0.00 )   (0.00 0.00 )     (0.00 0.00 )
3     5    (58.17 3.62 )     (0.98 0.01 )     (1.09 0.01 )   (0.00 0.00 )     (0.00 0.00 )
4     5    (67.33 8.42 )     (0.98 0.00 )     (1.10 0.02 )   (0.00 0.00 )     (0.17 0.37 )
Every print_interval, a report of simulation statistics is printed. The first entry is the time (in this example, 300 seconds). For each source, the following data is printed:
#
Type
G/w
Xmit
Q'ing
Drops
Retx
Rtt
In order to avoid corrupting simulation results with data from transients that happen at start up, simulation statistics are flushed after each dump. Thus, the report printed each print interval presents statistics only for the last print_interval seconds. Note that to compute effective throughput you must subtract the number of retransmissions from the number of packets sent by the gateway. The effective load is the number of packets transmitted by the gateway plus the number of packets it dropped.
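The two derived quantities can be computed from one row of the report as in the sketch below. The struct and function names are hypothetical; the fields correspond to the Xmit, Drops and Retx columns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for one report row (one source, one interval). */
struct interval_stats {
    int xmit;    /* packets transmitted by the gateway (Xmit column) */
    int drops;   /* packets dropped (Drops column) */
    int retx;    /* retransmissions (Retx column) */
};

/* Effective throughput: transmitted packets minus retransmissions. */
int effective_throughput(const struct interval_stats *s)
{
    assert(s != NULL);
    return s->xmit - s->retx;
}

/* Effective load: transmitted packets plus dropped packets. */
int effective_load(const struct interval_stats *s)
{
    assert(s != NULL);
    return s->xmit + s->drops;
}
```

For source 1 at time 300 in the sample report (Xmit 1434, Drops 0, Retx 1), the effective throughput is 1433 packets and the effective load is 1434 packets.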
If print_interval is long enough, the mean value printed out at the end of the interval will be a `large enough' sample of the path of a variable, and so can be thought of as a batch mean. From elementary statistics, we know that the batch means will be approximately normally distributed. REAL computes both the mean of batch means and the standard deviation of these means. Using the normal assumption, about 99% confidence is achieved at 3 SD on either side of the mean of means. At the end of the simulation, the mean of batch means and the SD of batch means are printed. This is headed by a line marked `Summary'. Then, for each source, the mean of means and SD of means are printed for the throughput, the queueing delay at each gateway, the number of packet losses in each interval and the number of retransmissions in each interval. To avoid start-up transients from affecting the variance value, the parameter NSKIP, #defined in table.c, selects how many of the initial time periods to skip. The default value is 1.
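The batch-means computation can be sketched as follows. This is not the code in table.c: the names `batch_summary`, `summarize_batches` and `within_3sd` are hypothetical, and dividing by the number of batches used (rather than that number minus one) is my assumption. The sketch reports the variance, whose square root is the SD, and does the 3 SD check in squared form so that no square root is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary type, not from table.c. */
struct batch_summary {
    double mean;   /* mean of the batch means */
    double var;    /* variance of the batch means (SD is its square root) */
};

/* Summarizes n batch means, skipping the first nskip (start-up transients). */
struct batch_summary summarize_batches(const double *means, size_t n, size_t nskip)
{
    assert(means != NULL && n > nskip);
    size_t used = n - nskip;
    double sum = 0.0, sumsq = 0.0;
    for (size_t i = nskip; i < n; i++)
        sum += means[i];
    double mean = sum / (double)used;
    for (size_t i = nskip; i < n; i++) {
        double d = means[i] - mean;
        sumsq += d * d;
    }
    struct batch_summary r = { mean, sumsq / (double)used };
    return r;
}

/* Returns 1 if x lies within 3 SD of the mean of means. */
int within_3sd(const struct batch_summary *s, double x)
{
    assert(s != NULL);
    double d = x - s->mean;
    return d * d <= 9.0 * s->var;   /* (x - mean)^2 <= (3 SD)^2 */
}
```

With batch means 100, 2, 4 and 6 and one batch skipped, the first (transient) value is ignored, so the mean of means is 4 and the variance is 8/3.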
5.2 Plotting simulation variables
Node functions in REAL have been instrumented to produce traces of certain variables as the simulation proceeds. For example, TCP sources print the size of the congestion control window whenever the congestion window could change. Plot values generated in this way are stored in a buffer, and when the buffer reaches a certain size (512 bytes by default), the buffer is either dumped to a file called 'plot' or sent to the GUI. If you aren't using the GUI, extract the separate files from the single plot file with the sim/kernel/demux utility. (time, value) tuples in plot files can then be post processed to derive other statistics, such as average values.
The files that demux creates are of the form CCCxx, where CCC is an identification tag and xx is the node ID of the node which is producing the plot. The files appear in the same directory as the report file (see the description above). The currently available files are
Rtt
tao
win
e2e
seq
In a plot file, the X axis (first element of the tuple) is time in seconds, and the Y axis is the data value. You can view plots with graphing packages such as ggraph or xgraph.
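As an example of post-processing the (time, value) tuples, the sketch below computes a time-weighted average. It is not part of REAL: the names `plot_point` and `time_weighted_mean` are hypothetical, the tuples are assumed to have been read into an array already, and treating each value as holding until the next sample is my assumption (it suits a variable such as the window size, which changes in steps).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of one (time, value) tuple. */
struct plot_point {
    double time;    /* X axis: time in seconds */
    double value;   /* Y axis: the data value */
};

/* Time-weighted average over [p[0].time, p[n-1].time].
 * Assumption: each value holds until the time of the next sample. */
double time_weighted_mean(const struct plot_point *p, size_t n)
{
    assert(p != NULL && n >= 2);
    double area = 0.0;
    for (size_t i = 0; i + 1 < n; i++) {
        double dt = p[i + 1].time - p[i].time;
        assert(dt >= 0.0);            /* samples must be in time order */
        area += p[i].value * dt;
    }
    double span = p[n - 1].time - p[0].time;
    assert(span > 0.0);
    return area / span;
}
```

For the tuples (0, 1), (1, 3), (3, 3), the value is 1 for one second and 3 for two seconds, so the average is 7/3.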
To add your own instrumentation, you have to edit the node functions in sim/sources. Choose the variable that you want to plot, and at the appropriate points, insert the function call `make_plot(filename, variable);'. For example, to plot the window size, the command I use is `make_plot("win", cur_window);'. To plot real numbers, use `make_flt_plot()'. Recompile the simulator and the plot files will be produced automatically for each node that has the plot option set in the NetLanguage input. Use `make clean' to remove old plot files.
6. A Note on REAL Configuration
7. Acknowledgements
S. McCanne at Lawrence Berkeley Lab contributed some code for jk_reno.c and plotting.c, and reorganized the code in jk_tahoe.c. I have reimplemented some features based on his additions to REAL. S. Jamin at USC added the ability to specify result files. K.K. Ramakrishnan at DEC helped with the testing of DEC sources. R. Sethi at Bell Labs suggested the use of defaults in NetLanguage. D. Ferrari at UCB provided advice on statistics and other features. Last but not least, S. Shenker at Xerox PARC was the main instigator of REAL, and has provided invaluable advice over the years.