A classic concurrent programming design pattern is producer-consumer, where processes are designated as either producers or consumers. The producers are responsible for adding to some shared data structure and the consumers are responsible for removing from that structure. Only one party, either a single producer or a single consumer, can access the structure at any given time.
Here we consider an example with a shared queue, using a mutex (introduced previously) to protect the queue:
let f = Queue.create() and m = Mutex.create()
We divide the work of a producer into two
parts: produce
, which simulates the work of creating a
product, and store
, which adds the product to the shared
queue. The produce
operation increments the counter p
that
it is passed, sleeps for d
seconds to simulate the time
taken to produce, and outputs a status message:
let produce i p d =
incr p;
Thread.delay d;
Printf.printf "Producer %d has produced %d" i !p;
print_newline()
The operation store
acquires the mutex m
,
adds to the
shared queue, outputs a status message and releases the mutex:
let store i p =
Mutex.lock m;
Queue.add (i, !p) f;
Printf.printf "Producer %d has added its %d-th product" i !p;
print_newline();
Mutex.unlock m
The producer
loops n
times. Each time
through the loop, it calls produce
and then store
,
then
sleeps for a random interval of time up to 2.5 seconds. When it is
done looping, it outputs a status message.
let producer (n, i) =
let p = ref 0
and d = Random.float 2. in
for j = 1 to n do
produce i p d;
store i p;
Thread.delay (Random.float 2.5)
done;
Printf.printf "Producer %d is exiting." i;
print_newline()
The consumer
loops n
times, acquiring
the
mutex m
, then attempting to take an item from the shared
queue. If it succeeds, it prints out the item. If not, it prints out
that it failed to get an item. In either event, it unlocks the mutex
and then waits a random interval of time up to 2.5 seconds. When it
is done looping, it outputs a status message.
let consumer (n, i) =
for j = 1 to n do
Mutex.lock m;
(try
let ip, p = Queue.take f
in
Printf.printf "Consumer %d has taken product (%d,%d)" i ip p;
print_newline()
with Queue.Empty ->
Printf.printf "Consumer %d has returned empty-handed" i;
print_newline());
Mutex.unlock m;
Thread.delay (Random.float 2.5)
done;
Printf.printf "Consumer %d is exiting." i;
print_newline()
This use of mutual exclusion is very coarse grained.
It would be better to be able to have a consumer wait until something
is in the queue rather than returning empty handed. For this we can
make use of condition variables, which were introduced previously. Now
the store
function store2
signals the condition c
to
indicate that something is in the queue.
let c = Condition.create()
let store2 i p =
Mutex.lock m;
Queue.add (i, !p) f;
Printf.printf "Producer %d has added its %d-th product" i !p;
print_newline();
Condition.signal c;
Mutex.unlock m
Now we split the consumer function consumer2
into
two parts, wait
and take
.
let wait i =
Mutex.lock m;
while Queue.length f = 0 do
Printf.printf "Consumer %d is waiting" i;
print_newline();
Condition.wait c m
done
let take i =
let ip, p = Queue.take f in
Printf.printf "Consumer %d has taken product (%d,%d)" i ip p;
print_newline();
Mutex.unlock m
let consumer2 (n, i) =
for j = 1 to n do
wait i;
take i;
Thread.delay (Random.float 2.5)
done;
Printf.printf "Consumer %d is exiting." i;
print_newline()
Note that wait
locks the mutex, so no
work should be
done after wait returns unless it is intended to be in the critical
section. Generally, a function that has the effect of locking or
unlocking a mutex should be used with caution, and should be clearly
documented as doing so.
A thread pool consists of a collection of threads, called workers, that are used to process work. Each worker looks for new work to be done. When it finds work to do, it does it, and when finished, it goes back to get more work. The workers play the role of consumers in the producer-consumer model that we just considered above. In fact, thread pool implementations often use a shared queue to store the work, thus building quite directly on the previous example. Before considering implementation of thread pool, let's get a better idea of what it does and where it is useful.
The basic operations on a thread pool are:
Here is the signature for a basic thread pool:
module type SIMPLE_THREAD_POOL = sig
type pool
(* A No_workers exception is thrown if addwork is called when the
threadpool is being shut down. The work is not added. *)
exception No_workers
(* create a thread pool with the specified number of worker threads *)
val create: int -> pool
(* add work to the pool, where work is any unit -> unit function *)
val addwork: (unit -> unit) -> pool -> unit
(* destroy a thread pool, stopping all the threads once all work
* in the pool has been completed. *)
val destroy: pool -> unit
end
Thread pools are particularly useful in settings where work arrives
asynchronously, such as occurs with a server where many network
requests may need to be handled promptly. In such settings, a thread
receives an event such as a network request, adds the corresponding
work to a thread pool (which will be run at some point in the future),
and then quickly returns indicating to the caller that the request will
be handled. Sometimes it is also useful to have a handle associated
with each unit of work to which some value is sent. The simple
abstraction that we presented here does not have any means of
returning a result, as the functions representing work are of
type unit -> unit
.
Here we consider an implementation of
the SIMPLE_THREAD_POOL
interface in terms of a 4-tuple: a
mutable counter, a mutable queue of functions that are the work
remaining to be done, a mutex, and a condition variable. The mutex is
used to protect the counter and the queue. The condition variable
is used to signal when a worker should wake up to get new work.
module Tpool : SIMPLE_THREAD_POOL = struct
type pool = int ref * (unit -> unit) Queue.t * Mutex.t * Condition.t
exception No_workers
let dowork tp = ...
let create size = ...
let addwork f tp = ...
let rec done_wait tp n = ...
let destroy tp = ...
end
If the counter is positive, it indicates the number of worker threads that the thread pool was created with. If the counter is non-positive, it indicates that the thread pool is being destroyed, and the absolute value of the counter is the number of threads that have properly exited. This allows the destroy function to wait for the threads to finish their work and exit before returning.
Each worker thread runs the function dowork
. This
function is not exposed in the interface, so it can only be called
from inside the implementation
of Tpool
. The function dowork
loops as long
as the
thread pool has not finished its work. When the work is finished, it
exits.
A thread pool is finished when it is being destroyed and there is no
work remaining
to do. We use the counter in the 4-tuple, here
called nworkers
, to indicate that the thread pool is
being destroyed by setting its value to something less than 1. In
that case, if the queue of work is also empty, then the thread exits as
the pool is finished. Otherwise, on each loop the worker waits for
work to do, and then takes that work from the queue, executing it
inside a try to ensure that unhandled exceptions in the
work do not cause the worker to exit.
let dowork (tp : pool) =
match tp with (nworkers, q, m, c) ->
Mutex.lock m;
(* When nworkers <= 0, the thread pool is being
* destroyed. If that is true and there is also no work left to do
* then stop looping and drop through to exit processing. *)
while (!nworkers > 0) || (Queue.length q > 0) do
(* If nworkers > 0, wait for stuff in the queue. *)
while (!nworkers > 0) && (Queue.length q = 0) do
dbgprint "waiting";
Condition.wait c m
done;
(* Verify something in the queue rather than we are now being
* shut down, then get the work from the queue, unlock the
* mutex, do the work and relock the mutex before looping
* back. *)
if Queue.length q > 0
then
let f = Queue.take q in
dbgprint "starting work";
Mutex.unlock m;
(* Don't let an exception in the work, f, kill the thread,
* just catch it and go on. *)
(try ignore (f()) with _ -> ());
Mutex.lock m
done;
(* A worker thread exits when the pool is being shut down. It
* decrements the worker count, which should be -n when all
* threads are finished, where n was the number of threads in
* the pool (counts down from 0). *)
decr nworkers;
dbgprint "exiting";
Mutex.unlock m
Note the use of the mutex in the 4-tuple, called m
here, to protect accesses to the queue and the counter. Before
entering the while loop, and at the end of the while loop (thus before
the next iteration of the loop), the mutex must be locked because both
the counter and the queue are accessed. The nested while loop
checking for work to do uses Condition.wait
to release
the mutex and sleep until it receives the condition of work being
ready. Recall that this reacquires the mutex before returning. It is
important that the mutex is then unlocked before
calling f
, the work to be done. This allows other
workers to safely run concurrently, as f
cannot access
the queue or counter. Then the mutex is reacquired before the end of
the while loop. When exiting the while loop, the mutex is already
locked, so the counter is simply decremented and then the mutex is
released before the worker exits.
The create
function makes a 4-tuple and starts up the
specified number of threads, each of which runs
the dowork
function. The create
function
simply acquires
the mutex at the beginning and releases it at the end. There are
other possible ways of
writing this code in which the mutex is not held during the
entire process of creating the thread pool.
let create size =
if size < 1
then failwith "Tpool create needs at least one thread"
else let tp =
(ref 0, Queue.create(), Mutex.create(), Condition.create()) in
match tp with (nworkers, _, m, _) ->
Mutex.lock m;
while !nworkers < size do
ignore (Thread.create dowork tp);
incr nworkers
done;
Mutex.unlock m;
tp
The add_work
function adds the given function to the
queue. To do so, it first locks the mutex and checks whether the
pool is being destroyed. If the pools is being destroyed, instead of
adding the work it raises the No_workers
exception after
first releasing the mutex. If the pool is not being destroyed, it adds
the work to the queue, signals that there is work to be done, and
unlocks
the mutex.
let addwork (f : unit -> unit) (tp : pool) =
match tp with (nworkers, q, m, c) ->
Mutex.lock m;
if !nworkers < 1
then (Mutex.unlock m; raise No_workers)
else
(Queue.add f q;
Condition.signal c;
Mutex.unlock m)
The destroy
function acquires the mutex, then sets
the number of workers to zero to indicate that the thread pool is
being shut down. It then broadcasts to all the workers that there is
something to be done to ensure that all the workers exit, including
ones that were currently sleeping while awaiting work. Finally, it
releases
the mutex, then waits for all the workers to exit using the helper
function done_wait
.
let destroy (tp : pool) =
match tp with (nworkers, _, m, c) ->
Mutex.lock m;
let n = !nworkers in
nworkers := 0;
Condition.broadcast c;
Mutex.unlock m;
done_wait tp n