Inverted page tables and hashed paging
We started discussing page replacement; I've moved these notes to the next lecture so they can be in one place.
The size of the page table (hierarchical or otherwise) grows with the size of the virtual address space. If we have a large virtual address space (such as in a 64-bit architecture), the page table becomes huge. Hierarchical paging lets us keep most of it out of main memory, but would require a 6-level hierarchy (why?). That means that to look up an address you need to read at least 6 frame numbers, one per level and each a separate memory access, which is expensive.
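One way to approach the "why?" is to count bits. The sketch below assumes 4 KB pages and 8-byte page table entries; neither number is given in these notes, and the function name levels_needed is made up for illustration. Under those assumptions, one page-sized table holds 512 entries and so resolves 9 bits of the page number, and a 64-bit address has 64 - 12 = 52 page number bits to resolve.

```python
def levels_needed(va_bits, page_bytes, entry_bytes):
    """Number of page table levels when each table fills exactly one page.

    Assumes page_bytes and entry_bytes are powers of two.
    """
    # Bits used for the offset within a page (12 for 4 KB pages).
    offset_bits = page_bytes.bit_length() - 1
    # Bits resolved per level: log2 of the entries that fit in one page.
    index_bits = (page_bytes // entry_bytes).bit_length() - 1
    # Remaining bits form the virtual page number.
    page_number_bits = va_bits - offset_bits
    # Ceiling division: a partially used top level still counts as a level.
    return -(-page_number_bits // index_bits)


if __name__ == "__main__":
    # Assumed parameters: 64-bit addresses, 4 KB pages, 8-byte entries.
    print(levels_needed(64, 4096, 8))
    # Classic 32-bit case: 4 KB pages, 4-byte entries.
    print(levels_needed(32, 4096, 4))
```

With other page or entry sizes the count changes, so the 6 in the text depends on parameters like these.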
Instead of making a very large sparse array, we can use an inverted page table (IPT) with one entry per frame. Each entry describes the page that is stored in the corresponding frame: it records the process ID and the page number. We can also store permission bits in the IPT.
At first blush, this data structure doesn't seem very useful: the point of the page table is to find the frames that store pages, but to find a page in the IPT you need to already know the frame number!
To solve this problem, we restrict the set of frames that can hold a given page. Each page can be stored in one of a small number (say, 3) of frames, and those frames are determined by hashing the page number and process ID. For example, process 7 page 17 can be stored in frames numbered hash(7, 17, 1), hash(7, 17, 2), and hash(7, 17, 3).
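To locate the page, we examine these three entries of the IPT. If one of them lists our process ID and page number, we have found the page, and the index of that entry is its frame number. If not, the page must not be in memory, so we select one of the pages currently occupying those three frames to evict (or use a free frame among the three, if there is one) and load the page in its place.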
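The lookup and eviction steps can be sketched as follows. This is only an illustration under stated assumptions: the names IPTEntry, frame_hash, candidate_frames and InvertedPageTable are invented here, the hash is built from SHA-256 purely for convenience (real hardware would use something much cheaper), and the "evict the first candidate" policy is a placeholder for a real replacement policy.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Tuple

NUM_CHOICES = 3  # candidate frames per page, as in the text


@dataclass
class IPTEntry:
    pid: int                # process that owns the page
    page: int               # virtual page number within that process
    writable: bool = False  # example permission bit


def frame_hash(pid: int, page: int, i: int, num_frames: int) -> int:
    """Hypothetical stand-in for hash(pid, page, i) from the text."""
    digest = hashlib.sha256(f"{pid}:{page}:{i}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_frames


def candidate_frames(pid: int, page: int, num_frames: int) -> List[int]:
    # The candidates may occasionally coincide; the code below tolerates that.
    return [frame_hash(pid, page, i, num_frames)
            for i in range(1, NUM_CHOICES + 1)]


class InvertedPageTable:
    def __init__(self, num_frames: int = 64):
        # One entry per physical frame; None means the frame is free.
        self.entries: List[Optional[IPTEntry]] = [None] * num_frames

    def lookup(self, pid: int, page: int) -> Optional[int]:
        """Return the frame holding (pid, page), or None if not in memory."""
        for f in candidate_frames(pid, page, len(self.entries)):
            e = self.entries[f]
            if e is not None and e.pid == pid and e.page == page:
                return f
        return None

    def load(self, pid: int, page: int) -> Tuple[int, Optional[IPTEntry]]:
        """Handle a miss: place the page, returning (frame, evicted entry)."""
        frames = candidate_frames(pid, page, len(self.entries))
        for f in frames:
            if self.entries[f] is None:      # prefer a free candidate frame
                self.entries[f] = IPTEntry(pid, page)
                return f, None
        victim = frames[0]                   # placeholder eviction policy
        evicted = self.entries[victim]
        self.entries[victim] = IPTEntry(pid, page)
        return victim, evicted


if __name__ == "__main__":
    ipt = InvertedPageTable()
    print(ipt.lookup(7, 17))      # miss: process 7, page 17 is not loaded yet
    frame, evicted = ipt.load(7, 17)
    print(ipt.lookup(7, 17) == frame)
```

Note that a lookup inspects at most three entries no matter how large the virtual address space is, which is the point of the scheme.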
It is possible (in fact necessary) that multiple pages will hash to the same locations. However, if we have a relatively small number of popular pages, it is quite unlikely that we get into a state where all of the popular pages are competing over a small set of frames: each popular page will have a different set of frames, and the overlap will be small.