Spin locks

Milk problem solution
Spin locks
- key terms: test and set, compare and swap, spin lock

Solution to the milk problem

The following solution is safe, live and fair. Note that it can be generalized to multiple threads, but it is not obvious how to do so.

a correct solution

Shared state:

has_milk  = False
working_1 = False
working_2 = False
turn      = 0

Thread one code:

 1: working_1 = True
 2: turn = 2
 3: while working_2 and turn == 2:
 4:   do nothing
 5: if not has_milk:
 6:   buy milk
 7:   has_milk = True
 8: working_1 = False

Thread two code: (symmetric)

11: working_2 = True
12: turn = 1
13: while working_1 and turn == 1:
14:   do nothing
15: if not has_milk:
16:   buy milk
17:   has_milk = True
18: working_2 = False

The idea behind this code is that neither thread can take control from the other; each can only yield control to the other (by setting turn to the other thread's number).

This code is safe, live, and fair, although the argument is rather complicated:

safety: clearly, by the time either thread finishes, milk will have been bought at least once. However, we must show that it is bought at most once.
Suppose otherwise, that is, that both lines 6 and 16 are executed. This implies that thread one must have been on lines 5-7 at the same time that thread two was on lines 15-17. One of the two threads must have exited the while loop first. Without loss of generality, assume it was thread one. When it exited the loop on line 3, one of two things was true:
- working_2 was false. In this case thread two has not executed line 11 yet. It will be impossible for thread two to proceed past the loop on lines 13-14 until thread one executes line 8, because by the time thread two reaches line 13, turn will be 1 and working_1 will be true.
- working_2 was true but turn == 1. In this case thread two must have executed line 12 after thread one executed line 2. This means that turn can never become 2. Thus the only way that thread two can escape the loop on line 13 is if working_1 becomes false, which only happens after thread one completes line 8.
liveness: the only place that the threads can get stuck is in the spin loops on lines 3 and 13. However, both threads cannot be stuck simultaneously, because turn cannot be both 1 and 2. Once one of the threads proceeds past its spin loop, it will eventually set its working variable to false, which will allow the other thread to exit from its spin loop.
fairness: the code is completely symmetric, and thus fair.
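
The protocol above can be exercised directly. The following is a minimal Python sketch, not part of the original notes: the names Fridge, roommate and run_once are illustrative, the purchases counter is added only so the result can be checked, and the sketch assumes an interpreter such as CPython in which these plain reads and writes take effect in program order (the safety argument depends on that).

```python
import threading


class Fridge:
    """Shared state from the solution above (class and field names are illustrative)."""

    def __init__(self):
        self.has_milk = False
        self.working = {1: False, 2: False}  # working_1 and working_2
        self.turn = 0
        self.purchases = 0  # added for checking only; not part of the protocol


def roommate(fridge, me, other):
    """Run the protocol as thread `me`, where `other` is the other thread's number."""
    fridge.working[me] = True
    fridge.turn = other                  # yield priority to the other thread
    while fridge.working[other] and fridge.turn == other:
        pass                             # spin: do nothing
    if not fridge.has_milk:
        fridge.purchases += 1            # "buy milk"
        fridge.has_milk = True
    fridge.working[me] = False


def run_once():
    """Run both threads to completion and return how many times milk was bought."""
    fridge = Fridge()
    threads = [
        threading.Thread(target=roommate, args=(fridge, 1, 2)),
        threading.Thread(target=roommate, args=(fridge, 2, 1)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return fridge.purchases
```

Calling run_once() repeatedly should always return 1 under the stated assumption. A passing run does not prove the protocol correct; it only fails to find a counterexample.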

Spin locks

Although this solution is correct, it is difficult to write, and even harder to reason about. This is inherently harder than writing sequential code, because instead of considering a single path of execution, there are an exponential number of paths to consider (exponential in the length of the code: roughly speaking, for each instruction, either of the two threads could execute next, so there are 2^length possible sequences of operations).
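
To make the growth concrete, here is a small Python sketch that counts interleavings. It assumes two straight-line threads (no loops or branches, which the milk code does not satisfy because of its spin loops) of n instructions each. Under that assumption the exact count is the binomial coefficient C(2n, n), which stays below the 2^(2n) estimate but still grows exponentially. The function names are illustrative.

```python
from math import comb


def count_interleavings(n):
    """Number of interleavings of two straight-line threads with n instructions each."""
    return comb(2 * n, n)


def enumerate_interleavings(a, b):
    """Yield every merge of tuples a and b that preserves the order within each."""
    if not a or not b:
        yield tuple(a) + tuple(b)
        return
    for rest in enumerate_interleavings(a[1:], b):
        yield (a[0],) + rest             # thread A executes next
    for rest in enumerate_interleavings(a, b[1:]):
        yield (b[0],) + rest             # thread B executes next
```

For two threads of 8 instructions each, count_interleavings(8) is 12870, already far too many schedules to check by hand.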

A small amount of hardware support can help considerably. By atomically reading and writing an address in memory (without any other processor changing the state in between), we can write fairly simple locking code.

We discussed two common hardware primitives for this task:

Test and Set

The test_and_set instruction (TAS) sets the contents of a given address to one, and returns the previous value. It can be used to implement a critical section by ensuring that the contents of the address are 1 if and only if a thread is executing within the critical section:

critical section with test and set

Shared state:

  lock = False

Thread one code:

1: while test_and_set(lock):
2:   do nothing
3: # critical section
4: lock = False

Thread two code: (same)

5: while test_and_set(lock):
6:   do nothing
7: # critical section
8: lock = False

If a thread starts executing this code while no thread is in the critical section, the lock will be False. The test_and_set instruction will atomically set the lock to True and return the previous value, False. Since it returns False, the body of the while loop does not execute, and the thread enters the critical section.

If another thread then tries to enter the critical section, the lock will already be True, so test_and_set will leave the lock True and return True (since it was True before the TAS). The second thread will continue to execute the while loop until the first thread executes line 4. After that, the second thread's subsequent call will return False, allowing it to enter the critical section.

The process of continually monitoring a variable to wait for it to change is referred to as spinning; locks that wait in this way, typically built on an atomic operation such as test_and_set, are called spin locks.
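
The test_and_set lock can be sketched in Python as follows. Python has no test_and_set instruction, so the AtomicFlag class below is a hypothetical stand-in: it uses an internal threading.Lock purely to model the atomicity that the hardware would provide. SpinLock and count_with_spin_lock are likewise illustrative names, not part of the original notes.

```python
import threading


class AtomicFlag:
    """Models one memory word with a hardware test_and_set (assumption: simulated)."""

    def __init__(self):
        self._value = False
        self._hw = threading.Lock()      # stands in for hardware atomicity only

    def test_and_set(self):
        """Atomically set the flag to True and return its previous value."""
        with self._hw:
            old = self._value
            self._value = True
            return old

    def clear(self):
        with self._hw:
            self._value = False


class SpinLock:
    def __init__(self):
        self._flag = AtomicFlag()

    def acquire(self):
        while self._flag.test_and_set():
            pass                         # spin until the previous value was False

    def release(self):
        self._flag.clear()               # corresponds to "lock = False"


def count_with_spin_lock(num_threads=4, increments=500):
    """Increment a shared counter from several threads, guarded by the spin lock."""
    lock = SpinLock()
    total = 0

    def worker():
        nonlocal total
        for _ in range(increments):
            lock.acquire()
            total += 1                   # critical section
            lock.release()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

Because only one thread at a time can see test_and_set return False, no increment is lost and the returned total equals num_threads * increments.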

compare_and_swap

The compare and swap instruction (CAS) is similar to, but more complicated than, the test_and_set instruction. The CAS instruction takes three parameters: a location, an "expected value" for that location, and a new value for the location.

It checks that the contents of the location match the expected value. If so, it replaces them with the new value, but if not it has no effect. In any case, the previous value of the variable is returned.

This can be used to implement a more sophisticated spin lock that stores the thread identifier in the lock (instead of just true or false). The following code ensures that at most one thread can be in the critical section, and if there is a thread in the critical section, then the value of the lock variable is the thread's identifier (or 0 if there is no thread in the CS):

critical section with compare and swap

Shared state:

  owner = 0

Thread one code:

while compare_and_swap(owner, 0, 1) != 0:
  do nothing   # owner was not 0, so another thread holds the lock
# critical section
# invariant: owner == 1
owner = 0

Thread two code: (symmetric)

while compare_and_swap(owner, 0, 2) != 0:
  do nothing   # owner was not 0, so another thread holds the lock
# critical section
# invariant: owner == 2
owner = 0
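
The owner-recording lock can be sketched in Python in the same spirit. As before, the atomic instruction is simulated: AtomicWord is a hypothetical class whose internal threading.Lock only models hardware atomicity, and run_owner_lock is an illustrative driver that checks the invariant from inside the critical section.

```python
import threading


class AtomicWord:
    """Models one memory word with a hardware compare_and_swap (assumption: simulated)."""

    def __init__(self, value=0):
        self._value = value
        self._hw = threading.Lock()      # stands in for hardware atomicity only

    def load(self):
        with self._hw:
            return self._value

    def store(self, value):
        with self._hw:
            self._value = value

    def compare_and_swap(self, expected, new):
        """If the word equals expected, replace it with new. Always return the old value."""
        with self._hw:
            old = self._value
            if old == expected:
                self._value = new
            return old


def run_owner_lock(num_threads=3, rounds=300):
    """Return (entries, invariant violations, final owner) after all threads finish."""
    owner = AtomicWord(0)                # 0 means no thread is in the critical section
    entries = 0
    violations = 0

    def worker(my_id):
        nonlocal entries, violations
        for _ in range(rounds):
            while owner.compare_and_swap(0, my_id) != 0:
                pass                     # spin: some other thread owns the lock
            if owner.load() != my_id:    # invariant: owner == my_id
                violations += 1
            entries += 1                 # critical section
            owner.store(0)               # release

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(1, num_threads + 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return entries, violations, owner.load()
```

Thread identifiers start at 1 here so that 0 can keep its meaning of "unowned".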

Using CAS for optimistic data structures

CAS is also useful because it can be used to implement optimistic transactional data structures. The idea behind an optimistic data structure is that all updates are performed on a copy of the data structure; when the operations are finished, a compare and swap is used to replace the data structure in one fell swoop. For example, we may want to write code for a concurrent balanced binary search tree. Operations that modify the tree (such as insertion and balancing) will create a new tree and then use CAS to update the root pointer.

optimistic concurrency with compare and swap

Shared state:

  root = pointer to the root of the tree

Insert code:

do
  old_root = root
  new_root = new Tree
  # copy old_root into new_root
  # do insertion into new_root
until compare_and_swap (root, old_root, new_root) == old_root

Balance code:

do
  old_root = root
  new_root = balanced_copy_of (old_root)
until compare_and_swap (root, old_root, new_root) == old_root

If an insertion is performed while a balance is in progress, then it will update the root to point to its new root. When the balancing thread completes, the compare_and_swap will fail, because the root will point to the new root that the insertion produced and not the original root pointer. The loop will then be repeated, and the new tree will be balanced instead.

Similarly, if the balance finishes before the insertion, then the CAS in the insertion code will fail (again, because root points to a different node than old_root), and the insertion will be retried on the new (balanced) root.
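
The retry loop can be sketched in Python under some simplifying assumptions. An immutable sorted tuple stands in for the tree (so "copying" is just building a new tuple), the CAS is simulated by a hypothetical AtomicRef class whose internal threading.Lock only models hardware atomicity, and the comparison uses object identity (is) to mimic comparing root pointers. None of these names come from the original notes.

```python
import threading


class AtomicRef:
    """Models a root pointer with a hardware compare_and_swap (assumption: simulated)."""

    def __init__(self, value):
        self._value = value
        self._hw = threading.Lock()      # stands in for hardware atomicity only

    def load(self):
        with self._hw:
            return self._value

    def compare_and_swap(self, expected, new):
        """Swap in new if the pointer still refers to expected. Return the old pointer."""
        with self._hw:
            old = self._value
            if old is expected:          # pointer comparison, not structural equality
                self._value = new
            return old


def insert(root, key):
    """Optimistically insert key: build a private copy, then try to publish it."""
    while True:
        old_root = root.load()
        new_root = tuple(sorted(old_root + (key,)))   # copy plus insertion
        if root.compare_and_swap(old_root, new_root) is old_root:
            return                       # nobody changed root in the meantime
        # another update won the race: retry against the new root


def concurrent_inserts(num_threads=4, keys_per_thread=50):
    """Insert disjoint key ranges from several threads and return the final structure."""
    root = AtomicRef(())

    def worker(start):
        for key in range(start, start + keys_per_thread):
            insert(root, key)

    threads = [threading.Thread(target=worker, args=(i * keys_per_thread,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return root.load()
```

No insertion is lost, because a failed CAS discards the stale copy and redoes the work on the current root. The cost is wasted copying when contention is high.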

Semaphores

A semaphore is a data structure that encapsulates an integer. From the user's perspective, the integer is never allowed to become negative; attempting to decrement it when it is zero will block the running thread until another thread increments the count.

Semaphores support the following interface:
- initialize: set the semaphore to an initial value.
- V: increment the semaphore. Also called release or signal.
- P: block until the semaphore has a positive value, then decrement it. Also called acquire or wait.

Some semaphore implementations allow you to perform other operations. You should avoid using anything other than P and V. For example, Python provides the ability to acquire without blocking; other libraries provide the ability to read the internal value of a semaphore. Using these operations can easily lead you to write buggy code. Stick to P and V.
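
As a brief illustration of the interface, the following Python sketch uses threading.Semaphore, whose acquire() plays the role of P and whose release() plays the role of V. The scenario (at most `permits` workers active at once) and the names run_limited, active and peak are illustrative assumptions. The separate stats_lock exists only to record how many workers were active, so that the limit can be checked.

```python
import threading
import time


def run_limited(num_workers=6, permits=2):
    """Run workers that each hold one permit while working. Return peak concurrency."""
    slots = threading.Semaphore(permits)   # initialize the count to `permits`
    stats_lock = threading.Lock()          # protects the bookkeeping below
    active = 0
    peak = 0

    def worker():
        nonlocal active, peak
        slots.acquire()                    # P: wait for a positive count, then decrement
        with stats_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)                   # simulated work while holding a permit
        with stats_lock:
            active -= 1
        slots.release()                    # V: increment, possibly waking a waiter

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

The returned peak never exceeds permits, because a worker can only pass P when the count is positive. The sketch uses only the blocking acquire() and release(), in line with the advice above.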