A classic concurrent programming design pattern is producer-consumer, where processes are designated as either producers or consumers. The producers are responsible for adding to some shared data structure and the consumers are responsible for removing from that structure. Only one party, either a single producer or a single consumer, can access the structure at any given time.
Here we consider an example with a shared queue, using a mutex (introduced previously) to protect the queue:
let f = Queue.create() and m = Mutex.create ()
We divide the work of a producer into two parts: produce, which simulates the work of creating a product, and store, which adds the product to the shared queue. produce increments the counter p that it is passed, sleeps for d seconds to simulate the time taken to produce, and outputs a status message:
let produce i p d =
  incr p;
  Thread.delay d;
  print_string("Producer " ^ string_of_int(i) ^
               " has produced " ^ string_of_int(!p) ^ "\n");
  flush stdout
store acquires the mutex m, adds to the shared queue, outputs a status message, and releases the mutex:
let store i p =
  Mutex.lock m;
  Queue.add (i,!p) f;
  print_string("Producer " ^ string_of_int(i) ^
               " has added its " ^ string_of_int(!p) ^ "-th product\n");
  flush stdout;
  Mutex.unlock m
The producer loops n times, calling produce and then store, and then sleeping for a random amount of time up to 2.5 seconds. When done looping it outputs a status message.
let producer (n,i) =
  let p = ref 0 and d = Random.float 2. in
  for j = 1 to n do
    produce i p d;
    store i p;
    Thread.delay (Random.float 2.5)
  done;
  print_string("Producer " ^ string_of_int(i) ^ " is exiting.\n");
  flush stdout
The consumer loops n times, acquiring the mutex m and then attempting to take an item from the shared queue. If it succeeds it prints out the item; if not, it prints out that it failed to get an item. In either event it unlocks the mutex and then waits a random amount of time up to 2.5 seconds. When done looping it outputs a status message.
let consumer (n,i) =
  for j = 1 to n do
    Mutex.lock m;
    (try
       let ip, p = Queue.take f in
       print_string("Consumer " ^ string_of_int(i) ^
                    " has taken product (" ^ string_of_int(ip) ^
                    "," ^ string_of_int(p) ^ ")\n");
       flush stdout
     with Queue.Empty ->
       print_string("Consumer " ^ string_of_int(i) ^
                    " has returned empty-handed\n");
       flush stdout);
    Mutex.unlock m;
    Thread.delay (Random.float 2.5)
  done;
  print_string("Consumer " ^ string_of_int(i) ^ " is exiting.\n");
  flush stdout
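The code above does not show how the producers and consumers are launched. A minimal driver sketch (not part of the original, and assuming the definitions above are in scope) might spawn one producer and one consumer with Thread.create and wait for both with Thread.join:

(* Hypothetical driver: one producer and one consumer, each doing 5
 * iterations. Thread.join waits for each thread to exit. *)
let () =
  Random.self_init ();
  let pt = Thread.create producer (5, 1) in
  let ct = Thread.create consumer (5, 1) in
  Thread.join pt;
  Thread.join ct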
This use of mutual exclusion is very coarse grained. For instance, it would be better to have a consumer wait until something is in the queue, rather than returning empty-handed. For this we can make use of condition variables, which were introduced previously. Now the store function, store2, signals the condition c to indicate that something is in the queue.
let c = Condition.create ()

let store2 i p =
  Mutex.lock m;
  Queue.add (i,!p) f;
  print_string("Producer " ^ string_of_int(i) ^
               " has added its " ^ string_of_int(!p) ^ "-th product\n");
  flush stdout;
  Condition.signal c;
  Mutex.unlock m
Now we split the consumer function, consumer2, into two parts, wait and take.
let wait i =
  Mutex.lock m;
  while Queue.length f = 0 do
    print_string("Consumer " ^ string_of_int(i) ^ " is waiting\n");
    flush stdout;
    Condition.wait c m
  done

let take i =
  let ip, p = Queue.take f in
  print_string("Consumer " ^ string_of_int(i) ^
               " has taken product (" ^ string_of_int(ip) ^
               "," ^ string_of_int(p) ^ ")\n");
  flush stdout;
  Mutex.unlock m

let consumer2 (n,i) =
  for j = 1 to n do
    wait i;
    take i;
    Thread.delay (Random.float 2.5)
  done;
  print_string("Consumer " ^ string_of_int(i) ^ " is exiting.\n");
  flush stdout
Note that wait locks the mutex, so no work should be done after wait returns unless it is intended to be in the critical section. Generally, a function that has the effect of locking (or unlocking) a mutex should be used with caution, and should be clearly documented as doing so.
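Only the consumer side is reworked in the text; to exercise this version, the producer must call store2 in place of store so that waiting consumers are signaled. One plausible variant (here called producer2, a name not used in the original) is:

(* Same structure as producer, but store2 signals the condition
 * variable after adding to the queue. *)
let producer2 (n,i) =
  let p = ref 0 and d = Random.float 2. in
  for j = 1 to n do
    produce i p d;
    store2 i p;
    Thread.delay (Random.float 2.5)
  done;
  print_string("Producer " ^ string_of_int(i) ^ " is exiting.\n");
  flush stdout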
A thread pool consists of a collection of threads, called workers, that are used to process work. Each worker looks for new work to be done; when it finds work to do, it does it, and when finished it goes back to get more work. The workers play the role of consumers in the producer-consumer model that we just considered above. In fact, thread pool implementations often use a shared queue to store the work, thus building quite directly on the previous example. Before considering an implementation of a thread pool, let's get a better idea of what it does and where it is useful.
The basic operations for a thread pool are to create a new thread pool with some specified number of workers, to add work to an existing thread pool (which will subsequently be performed by one of the workers), and to destroy an existing thread pool (shutting it down once all previously added work is complete). Here is the signature for a basic thread pool:
module type SIMPLE_THREAD_POOL =
sig
  type pool

  (* A No_workers exception is thrown if addwork is called when the
   * thread pool is being shut down. The work is not added. *)
  exception No_workers

  (* create a thread pool with the specified number of worker threads *)
  val create: int -> pool

  (* add work to the pool, where work is any unit->unit function *)
  val addwork: (unit->unit) -> pool -> unit

  (* destroy a thread pool, stopping all the threads once all work
   * in the pool has been completed. *)
  val destroy: pool -> unit
end
Thread pools are particularly useful in settings where work arrives asynchronously, such as with a server where many network requests may need to be handled promptly. In such settings, a thread receives an event such as a network request, adds the corresponding work to a thread pool (where it will be run at some point in the future), and then quickly returns, indicating to the caller that the request will be handled. Sometimes it is also useful to have a handle associated with each unit of work to which a result value is sent. The simple abstraction that we presented here does not have any means of returning a result, as the functions representing work are of type unit->unit.
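As a concrete illustration of how a client might use this interface (a hypothetical sketch, not part of the original, with an arbitrary pool size and job bodies):

(* Create a pool of 4 workers, submit ten unit->unit jobs, then shut
 * the pool down; destroy returns once all submitted work is done. *)
let () =
  let tp = Tpool.create 4 in
  for i = 1 to 10 do
    Tpool.addwork
      (fun () ->
         print_string ("job " ^ string_of_int i ^ " done\n");
         flush stdout)
      tp
  done;
  Tpool.destroy tp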
Here we consider an implementation of the SIMPLE_THREAD_POOL interface in terms of a 4-tuple: a mutable counter, a mutable queue of functions that are the work remaining to be done, a mutex, and a condition variable. The mutex is used to protect the counter and the queue, and the condition variable is used to signal when a worker should wake up to get new work.
module Tpool : SIMPLE_THREAD_POOL =
struct
  type pool = (int ref * (unit -> unit) Queue.t * Mutex.t * Condition.t)
  exception No_workers
  let dowork tp = ...
  let create size = ...
  let addwork f tp = ...
  let rec done_wait tp n = ...
  let destroy tp = ...
end
When the counter in the 4-tuple is a positive integer, it indicates the number of worker threads that the thread pool was created with. When the counter is non-positive, it indicates that the thread pool is being destroyed, and the absolute value of the counter is then the number of threads that have properly exited. This allows the destroy function to wait for the threads to finish their work and exit before returning.
Each worker thread runs the function dowork. This function is not exposed in the interface and so can only be called from inside the implementation of Tpool. dowork loops as long as the thread pool is not finished, and exits once it is. A thread pool is finished when it is being destroyed and there is also no work remaining to do. We use the counter in the 4-tuple, here called nworkers, to indicate that the thread pool is being destroyed by setting its value to something less than 1. In that case, if the queue of work is also empty then the thread exits, as the pool is finished. Otherwise, on each loop the worker waits for work to do, and then takes that work from the queue, executing it inside a try to ensure that unhandled exceptions in the work do not cause the worker to exit.
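The code below also calls a small debugging helper, dbgprint, which is not shown in the skeleton above. A minimal version (an assumption, since the original does not give it) could be:

(* Assumed helper: print a debug message and flush standard output. *)
let dbgprint s =
  print_string (s ^ "\n");
  flush stdout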
let dowork (tp:pool) =
  match tp with (nworkers, q, m, c) ->
    Mutex.lock m;
    (* When nworkers <= 0 it means the thread pool is being destroyed.
     * If that is true and there is also no work left to do then stop
     * looping and drop through to exit processing. *)
    while (!nworkers > 0) || (Queue.length q > 0) do
      (* In normal operation, where nworkers > 0, wait for stuff in the queue. *)
      while (!nworkers > 0) && (Queue.length q = 0) do
        dbgprint "waiting";
        Condition.wait c m
      done;
      (* Verify that something is in the queue (rather than that we are
       * being shut down), then get the work from the queue, unlock the
       * mutex, do the work, and relock the mutex before looping back. *)
      if (Queue.length q > 0) then begin
        let f = Queue.take q in
        dbgprint "starting work";
        Mutex.unlock m;
        (* Don't let an exception in the work, f, kill the thread;
         * just catch it and go on. *)
        (try ignore (f()) with _ -> ());
        Mutex.lock m
      end
    done;
    (* A worker thread exits when the pool is being shut down. It
     * decrements the worker count, which when all threads are finished
     * should be -n, where n was the number of threads in the pool
     * (counts down from 0). *)
    nworkers := !nworkers - 1;
    dbgprint "exiting";
    Mutex.unlock m
Note the use of the mutex in the 4-tuple, called m here, to protect accesses to the queue and the counter. Before entering the while loop, and at the end of the while loop (thus before the next iteration of the loop), the mutex must be locked because both the counter and the queue are accessed. The nested while loop checking for work to do uses Condition.wait to release the mutex and sleep until it receives the condition of work being ready. Recall that this reacquires the mutex before returning. It is important that the mutex is then unlocked before calling f, the work to be done. This allows other workers to safely run concurrently, as f cannot access the queue or counter. Then the mutex is reacquired before the end of the while loop. When exiting the while loop, the mutex is already locked, so the counter is simply decremented and then the mutex is released before the worker exits.
The create function makes a 4-tuple and starts up the specified number of threads, each of which runs the dowork function. create simply acquires the mutex at the beginning and releases it at the end. Other ways of writing this code are possible, where the mutex is not held during the entire process of creating the thread pool.
let create size =
  if size < 1 then
    raise (Failure "Tpool create needs at least one thread")
  else
    let tp = (ref 0, Queue.create(), Mutex.create(), Condition.create()) in
    match tp with (nworkers, _, m, _) ->
      Mutex.lock m;
      while !nworkers < size do
        ignore (Thread.create dowork tp);
        nworkers := !nworkers + 1
      done;
      Mutex.unlock m;
      tp
The addwork function adds the given function to the queue. In doing so it first locks the mutex and checks whether the pool is being destroyed. If the pool is being destroyed, instead of adding the work it raises the No_workers exception, after first releasing the mutex. If the pool is not being destroyed it adds the work to the queue, signals that there is work to be done, and unlocks the mutex.
let addwork (f:unit->unit) (tp:pool) =
  match tp with (nworkers, q, m, c) ->
    Mutex.lock m;
    if !nworkers < 1 then
      (Mutex.unlock m; raise No_workers)
    else
      (Queue.add f q;
       Condition.signal c;
       Mutex.unlock m)
The destroy function acquires the mutex and then sets the number of workers to zero to indicate that the thread pool is being shut down. It then broadcasts to all the workers that there is something to be done, to ensure that all the workers exit, including any that are currently sleeping awaiting work. Finally it releases the mutex and then waits for the workers to all exit, using the helper function done_wait.
let destroy (tp:pool) =
  match tp with (nworkers, _, m, c) ->
    Mutex.lock m;
    let n = !nworkers in
    nworkers := 0;
    Condition.broadcast c;
    Mutex.unlock m;
    done_wait tp n
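The helper done_wait is declared in the module skeleton but its body is not shown. Since dowork only decrements the counter on exit and does not signal a condition variable, one plausible sketch (an assumption, not the original code) simply polls until the counter has counted down to -n:

(* Assumed sketch: wait until all n workers have decremented the
 * counter from 0 down to -n, polling with a short delay between checks. *)
let rec done_wait (tp:pool) n =
  match tp with (nworkers, _, m, _) ->
    Mutex.lock m;
    let finished = (!nworkers = -n) in
    Mutex.unlock m;
    if not finished then begin
      Thread.delay 0.05;
      done_wait tp n
    end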