The advantage of virtual memory is that processes can use more memory than physically exists in the machine; when a process accesses memory that is not present (a page fault), the page must be paged in (sometimes referred to as being "swapped in", although some people reserve "swapped in" to refer to bringing in an entire address space).
Swapping in pages is very expensive (it requires using the disk), so we'd like to avoid page faults as much as possible. The algorithm that we use to choose which pages to evict to make space for the new page can have a large impact on the number of page faults that occur. We discuss a number of these algorithms in this lecture.
Random: when we need to evict a page, choose one at random.
FIFO: when we need to evict a page, choose the first one that was paged in. This can be easily implemented by treating the frames as a circular buffer and storing a single head pointer. On eviction, replace the page at the head, and then advance the head. It will always point to the first-in page.
FIFO is susceptible to "Belady's anomaly": it is possible that adding more frames can actually make performance worse! For example, consider the trace:
Step | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
Access | 1 | 2 | 3 | 4 | 1 | 2 | 5 | 1 | 2 | 3 | 4 | 5 |
Using FIFO with three frames, we incur 9 page faults (work this out!). With four frames, we incur 10 faults! It would be nice if buying more RAM gave us better performance.
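The circular-buffer implementation of FIFO is short enough to simulate directly. The following is a minimal sketch, not a real kernel implementation; the function name `fifo_faults` and the list-based frame table are assumptions made for illustration. Running it on the trace above should reproduce the fault counts quoted for three and four frames.

```python
def fifo_faults(trace, num_frames):
    """Count page faults for FIFO replacement over a list of page numbers."""
    frames = [None] * num_frames   # circular buffer of resident pages
    head = 0                       # index of the first-in page
    faults = 0
    for page in trace:
        if page in frames:
            continue               # hit: FIFO ignores accesses to resident pages
        faults += 1
        frames[head] = page        # replace the first-in page
        head = (head + 1) % num_frames
    return faults


trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for n in (3, 4):
    print(n, "frames:", fifo_faults(trace, n), "faults")
```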
OPT: when we need to evict a page, evict the page that will be unused for the longest time in the future.
Example: on the same trace as above with three frames:
Step | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
Access | 1 | 2 | 3 | 4 | 1 | 2 | 5 | 1 | 2 | 3 | 4 | 5 |
Initially, memory is empty:
frame: | 0 | 1 | 2 |
page: |   |   |   |
In the first three steps, we incur three page faults and load pages 1, 2, and 3:
frame: | 0 | 1 | 2 |
page: | 1 | 2 | 3 |
In step 4, we access page 4, incurring a page fault. Page 1 is used in step 5, page 2 is used in step 6, but page 3 is not used until step 10, so we evict page 3.
frame: | 0 | 1 | 2 |
page: | 1 | 2 | 4 |
Steps 5 and 6 do not incur page faults. In step 7, we need to evict a page. Page 1 is used in step 8, page 2 is used in step 9, but page 4 isn't needed until step 11, so we evict page 4.
frame: | 0 | 1 | 2 |
page: | 1 | 2 | 5 |
Steps 8 and 9 do not incur page faults, but step 10 does. Again, we consider the future uses of the data in memory; neither page 1 nor page 2 will be used in the future, so we can evict either. Here we evict page 1.
frame: | 0 | 1 | 2 |
page: | 3 | 2 | 5 |
Similarly, in step 11 we can evict either page 3 or page 2; here we evict page 2:
frame: | 0 | 1 | 2 |
page: | 3 | 4 | 5 |
Finally, we execute step 12, which incurs no page fault. This gives a total of 7 page faults. This is guaranteed to be the optimal number.
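The walkthrough above can be checked with a small simulation. This is only a sketch that assumes the whole trace is known in advance (which a real system cannot know); the name `opt_faults` and the helper `next_use` are hypothetical. When several pages are never used again, it evicts the first such page it finds, so its choices may differ from the ones shown above without changing the fault count.

```python
def opt_faults(trace, num_frames):
    """Count page faults when evicting the page whose next use is furthest away."""
    frames = []
    faults = 0
    for i, page in enumerate(trace):
        if page in frames:
            continue                      # hit
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)           # free frame available, no eviction
            continue
        future = trace[i + 1:]

        def next_use(p):
            # Distance to the next access of p; infinite if p is never used again.
            return future.index(p) if p in future else float("inf")

        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults


trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(opt_faults(trace, 3))
```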
Although we cannot predict the future, we can estimate it based on past behavior. Most programs exhibit both spatial and temporal locality:
Spatial locality: after accessing an address, it is likely that nearby addresses will also be accessed.
Temporal locality: pages that have been accessed recently (or frequently) are likely to be accessed again in the near future.
Exploiting these assumptions leads to the following algorithms:
- Least frequently used (LFU) assumes that pages that have been accessed rarely are unlikely to be accessed again. Keep a count of how many times each page is accessed, and evict the page with the lowest count.
- Least recently used (LRU) assumes that pages that were accessed recently are likely to be needed again. Keep a timestamp of the latest access, and evict the page with the lowest timestamp.
- Most recently used (MRU) assumes that programs do not read the same addresses multiple times. For example, a media player will read a byte and then move on, never to read it again. As with LRU, keep a timestamp of the latest access, but evict the page with the highest timestamp.
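The two timestamp-based policies differ only in which end of the timestamp order they evict from. The sketch below simulates both in software; the function name `timestamp_faults` and the `policy` argument are assumptions for illustration, and the step index stands in for a timestamp. LFU is left out because it also needs a rule for breaking ties between equal counts, which the description above does not specify.

```python
def timestamp_faults(trace, num_frames, policy="lru"):
    """Count page faults for LRU ("lru") or MRU ("mru") replacement."""
    last_used = {}   # resident page -> step of its latest access
    faults = 0
    for now, page in enumerate(trace):
        if page not in last_used:
            faults += 1
            if len(last_used) >= num_frames:
                # LRU evicts the lowest timestamp, MRU the highest.
                pick = min if policy == "lru" else max
                victim = pick(last_used, key=last_used.get)
                del last_used[victim]
        last_used[page] = now   # record the access (hit or newly loaded)
    return faults


trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print("LRU:", timestamp_faults(trace, 3, "lru"))
print("MRU:", timestamp_faults(trace, 3, "mru"))
```

Note that the simulation scans every resident page to find the victim, which is exactly the cost that makes these policies impractical to implement exactly.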
These algorithms exploit locality to approximate OPT, and thus can often do a good job of reducing page faults. However, implementing them is very difficult:
- A count or timestamp needs to be updated on every access; this requires hardware support and an extra register per TLB entry (expensive!).
- There is one count/timestamp per page; to find the page to evict, we have to traverse the entire frame list.
Instead of finding the least recently used page, we can simply find a page that was not "recently used" for some definition of "recently used". This requires only a bit per page, and makes finding a candidate to evict easy (since there are many we could choose).
To support these approximations, many TLBs support an additional "use bit" that is set automatically whenever a page is accessed.
The second chance algorithm (which some people call the clock algorithm) works just like FIFO, but it skips over any pages with the use bit set (and clears the use bit).
Example: Let's consider the same trace as above with the clock algorithm, for the first few steps:
Step | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
Access | 1 | 2 | 3 | 4 | 1 | 2 | 5 | 1 | 2 | 3 | 4 | 5 |
Initially, memory is empty:
frame: | 0 | 1 | 2 |
page: |   |   |   |
use: |   |   |   |
next: | ^ |   |   |
In the first three steps, we incur three page faults and load pages 1, 2, and 3, advancing the next pointer each time (it wraps around to frame 0). The use bit is set (since we're using the pages).
frame: | 0 | 1 | 2 |
page: | 1 | 2 | 3 |
use: | 1 | 1 | 1 |
next: | ^ |   |   |
In step 4, we incur a page fault. We look for an unused page, clearing bits as we go:
frame: | 0 | 1 | 2 |
page: | 1 | 2 | 3 |
use: | 0 | 1 | 1 |
next: |   | ^ |   |

frame: | 0 | 1 | 2 |
page: | 1 | 2 | 3 |
use: | 0 | 0 | 1 |
next: |   |   | ^ |

frame: | 0 | 1 | 2 |
page: | 1 | 2 | 3 |
use: | 0 | 0 | 0 |
next: | ^ |   |   |
Once we find one (frame 0, whose use bit is now clear), we evict it, load page 4, and advance the next pointer:
frame: | 0 | 1 | 2 |
page: | 4 | 2 | 3 |
use: | 1 | 0 | 0 |
next: |   | ^ |   |
Step 5 is also a page fault; again we look for an unused page starting from the next pointer. In this case frame 1's use bit is clear, so we evict page 2 and load page 1.
frame: | 0 | 1 | 2 |
page: | 4 | 1 | 3 |
use: | 1 | 1 | 0 |
next: |   |   | ^ |
On step 6, we again have a page fault; we evict page 3 from frame 2 and load page 2.
frame: | 0 | 1 | 2 |
page: | 4 | 1 | 2 |
use: | 1 | 1 | 1 |
next: | ^ |   |   |
The rest is left as an exercise: in the end, we will have incurred 9 total page faults and end in the following state:
Step | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
Access | 1 | 2 | 3 | 4 | 1 | 2 | 5 | 1 | 2 | 3 | 4 | 5 |
frame: | 0 | 1 | 2 |
page: | 5 | 3 | 4 |
use: | 1 | 1 | 1 |
next: | ^ |   |   |
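To check the remaining steps of the exercise, the second chance rule can be simulated in a few lines. This is a sketch under the same conventions as the example: a newly loaded page has its use bit set, and a hit sets the use bit (which the hardware would do automatically). The name `second_chance` and the returned tuple are assumptions for illustration.

```python
def second_chance(trace, num_frames):
    """Simulate second chance; return (faults, pages, use bits, next pointer)."""
    pages = [None] * num_frames
    use = [False] * num_frames
    nxt = 0
    faults = 0
    for page in trace:
        if page in pages:
            use[pages.index(page)] = True     # hit: the use bit gets set
            continue
        faults += 1
        while use[nxt]:                       # skip used pages, clearing as we go
            use[nxt] = False
            nxt = (nxt + 1) % num_frames
        pages[nxt] = page                     # evict the unused page, load the new one
        use[nxt] = True
        nxt = (nxt + 1) % num_frames
    return faults, pages, use, nxt


trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(second_chance(trace[:6], 3))   # state after step 6
print(second_chance(trace, 3))       # state after the full trace
```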
Second chance resets the use bit when a page is considered for eviction. Depending on the locality and size of memory, we can end up in a state where almost every use bit is set (so that most accesses will cause us to loop over a large number of candidates) or almost every bit is clear (so that we degenerate to FIFO).
To solve this, we can "decouple" eviction from the clearing of the use bit. We can have two "hands" that traverse the frames:
- The "eviction hand" is just like the next pointer in the second chance algorithm: to evict a page, we advance the eviction hand until we find a page with the use bit cleared. We evict that page. Unlike second chance, we do not clear use bits.
- The "clearing hand" is periodically moved forward, and all of the use bits it passes are cleared.
The clearing hand should lead the eviction hand; the distance between them defines "recent". A page's use bit will be set if it has been accessed more recently than the clearing hand passed it.
The distance between the hands should be kept constant. If the distance is too long, the clock algorithm approximates second chance, so we may have to examine many entries before finding an unused one. If the distance is too short, then pages rarely get a chance to get used after the bit is cleared but before the eviction hand passes, so the clock algorithm devolves into FIFO.
The "correct" distance depends on the workload, and is a measure of how much locality the processes have. A longer distance means more history is used; the usefulness of the history depends on how much locality the programs have.
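One way to keep the distance constant is to move both hands together. The sketch below makes that assumption: the clearing hand advances in lockstep with the eviction hand, and only when a fault occurs, whereas a real system might move it on a timer instead. The class name `TwoHandedClock` and the parameter `gap` (the distance between the hands, in frames) are hypothetical.

```python
class TwoHandedClock:
    def __init__(self, num_frames, gap):
        assert 0 < gap < num_frames
        self.pages = [None] * num_frames
        self.use = [False] * num_frames
        self.evict_hand = 0
        self.clear_hand = gap          # leads the eviction hand by `gap` frames
        self.faults = 0

    def _advance(self):
        """Move both hands one frame; the clearing hand clears the bit it passes."""
        n = len(self.pages)
        self.use[self.clear_hand] = False
        self.clear_hand = (self.clear_hand + 1) % n
        self.evict_hand = (self.evict_hand + 1) % n

    def access(self, page):
        if page in self.pages:
            self.use[self.pages.index(page)] = True   # hit: use bit gets set
            return
        self.faults += 1
        # The eviction hand only looks at use bits; it never clears them itself.
        # The loop ends within `gap` steps, when the eviction hand reaches a
        # frame that the clearing hand has just cleared.
        while self.use[self.evict_hand]:
            self._advance()
        self.pages[self.evict_hand] = page
        self.use[self.evict_hand] = True
        self._advance()


clock = TwoHandedClock(num_frames=3, gap=1)
for page in [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]:
    clock.access(page)
print(clock.faults, clock.pages)
```

With only three frames the choice of `gap` has little room to matter; the effect of the distance described above shows up with larger frame counts and traces that have more locality.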
Thrashing occurs when processes are actively using more memory than is physically present. This causes a state of continuous paging; processes run for a short time, immediately try to page in some data, causing another process to run, which itself pages in data, and so forth.
Thrashing can be easily addressed: when the system starts thrashing, choose a process or set of processes, and either kill or suspend them (suspension may be sufficient if processes only need lots of memory for a short time; they can be finished one by one). A good choice for termination would be the process that is using the most memory.
The only difficulty is in estimating the amount of memory that is currently being used. Because processes share physical memory using a heuristic replacement algorithm instead of acquiring and releasing it, the total amount "in use" is a bit of a fuzzy concept.
A good approximation is to track the working set of each process: the set of pages that have been accessed within the past n time units. The size of the working set gives a rough idea of how much memory the process is actively using.
By tracking working sets, we can detect thrashing by comparing the total working set size to the number of frames of available memory. We can also make use of the size of the working sets when determining which processes to suspend.
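As a rough illustration of this check, the sketch below computes working sets from a log of accesses and compares their total size to the number of frames. Everything here is hypothetical: the function names, the `(time, page)` log format, and the sample data. A real kernel would approximate the working set from use bits rather than keep a full access log.

```python
def working_set(accesses, now, n):
    """Pages accessed within the past n time units; accesses is a list of (time, page)."""
    return {page for t, page in accesses if now - n < t <= now}


def is_thrashing(per_process, now, n, num_frames):
    """True if the combined working sets do not fit in physical memory."""
    total = sum(len(working_set(a, now, n)) for a in per_process.values())
    return total > num_frames


def pick_victim(per_process, now, n):
    """Choose the process with the largest working set to suspend or kill."""
    return max(per_process, key=lambda pid: len(working_set(per_process[pid], now, n)))


# Hypothetical access logs for two processes.
logs = {
    "A": [(1, 10), (2, 11), (3, 10), (4, 12)],
    "B": [(1, 20), (3, 21), (4, 22), (5, 23), (5, 24)],
}
if is_thrashing(logs, now=5, n=3, num_frames=4):
    print("thrashing; suspend process", pick_victim(logs, now=5, n=3))
```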