CS361: Operating Systems

Jian Huang — Spring 2012

EECS | University of Tennessee - Knoxville

Demand Paging

For a process to execute, its pages need to be in memory; thankfully, not all pages need to reside in core at once. Through demand paging, pages are loaded only as needed. Pages that are never used are never loaded into physical memory; those pages remain on disk.

We need to clarify a couple of terms. In many ways paging is similar to swapping, but swapping deals with a process as a whole, while paging is page specific. So the term swapper does not apply; we instead use the term pager. When a process is swapped in, the pager guesses which pages will be needed and loads only those pages. The process's page table maintains information about all of its pages and uses a valid bit to indicate whether a page has been loaded by the pager. "Invalid" can mean two things: the page is not in memory, or the reference itself is simply invalid. Pages in memory are "memory resident". A page fault occurs when a process accesses a page that is not memory resident; specifically, this means accessing a page marked as invalid in the page table. A page fault may occur at any memory reference: an instruction fetch can cause a page fault, and so can fetching an operand. A page fault leads to a trap into the OS kernel, where it is handled in the following steps (a C sketch of this flow follows the list).

  1. check the PCB to determine whether the page is really invalid or simply not memory resident
  2. if invalid, terminate the process; if simply not memory resident, bring in the page
  3. find a free frame
  4. schedule a disk operation to read the desired page into that frame
  5. after the read is complete, modify the internal table kept in the PCB and the page table
  6. restart the instruction interrupted by the trap
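
To make these steps concrete, here is a minimal user-space sketch in C of a simulated fault handler. The page-table-entry layout, the sizes, and the helper names (find_free_frame, handle_page_fault, access_page) are illustrative assumptions for this sketch, not the interface of any real kernel.

  /* Toy simulation of demand paging: 8 virtual pages, 4 physical frames.
     All names and sizes here are illustrative assumptions. */
  #include <stdio.h>
  #include <string.h>

  #define NUM_PAGES  8
  #define NUM_FRAMES 4

  struct pte {                 /* one page-table entry      */
      int valid;               /* 1 = memory resident       */
      int frame;               /* frame number, if resident */
  };

  static struct pte page_table[NUM_PAGES];
  static int  frame_used[NUM_FRAMES];      /* which frames are taken       */
  static char swap_space[NUM_PAGES][64];   /* stand-in for the swap device */
  static char memory[NUM_FRAMES][64];      /* stand-in for physical memory */

  static int find_free_frame(void)         /* step 3 */
  {
      for (int f = 0; f < NUM_FRAMES; f++)
          if (!frame_used[f])
              return f;
      return -1;               /* no free frame: page replacement needed */
  }

  static int handle_page_fault(int page)
  {
      if (page < 0 || page >= NUM_PAGES)   /* steps 1-2: truly invalid    */
          return -1;                       /* would terminate the process */
      int frame = find_free_frame();
      if (frame < 0)
          return -1;                       /* replacement policy goes here */
      memcpy(memory[frame], swap_space[page], 64);  /* step 4: "disk" read */
      frame_used[frame] = 1;
      page_table[page].valid = 1;          /* step 5: update the page table */
      page_table[page].frame = frame;
      return frame;
  }

  static char *access_page(int page)       /* step 6: retry the access */
  {
      if (!page_table[page].valid && handle_page_fault(page) < 0)
          return NULL;
      return memory[page_table[page].frame];
  }

  int main(void)
  {
      strcpy(swap_space[3], "hello from page 3");
      printf("%s\n", access_page(3));      /* faults once, then succeeds */
      printf("%s\n", access_page(3));      /* now memory resident        */
      return 0;
  }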

The pager needs, and has, hardware support; it is not simply some software instructions executing in kernel space. Demand paging adds a lot of complexity to the computer architecture. Pages that are not presently loaded in memory are kept on disk, not in your home directories though; they are kept in the swap space on the swap device.

Now, let's talk a bit about the cost incurred during a page fault. When a page is memory resident, memory access time typically ranges from 10 to 200 nanoseconds on modern computers. With a page fault, the cost is much higher and can be broken into three parts: servicing the page fault (100s of instructions), reading in (I/O) the page (typically 3 ms of rotational latency, 5 ms of seek time, and 10s of microseconds of data transfer), and context-switching back to the process (a wait in the queue and then 100s of instructions). The total cost is then roughly 8 ms, almost entirely spent in disk I/O.

Now that we know the rough breakdown, and 200 nanoseconds vs. 8 milliseconds, we can make a rough estimate: to keep the average overhead due to page faults to 10% of the total memory access cost, you should write code that can go about 400,000 memory accesses without incurring a page fault. Do you know how to do that?
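
As a back-of-the-envelope check of that 400,000 figure, here is a small C calculation using the 200 ns and 8 ms numbers above; the variable names are just for illustration.

  /* Effective access time under demand paging, using the figures above. */
  #include <stdio.h>

  int main(void)
  {
      double mem_ns   = 200.0;    /* access time when memory resident */
      double fault_ns = 8e6;      /* ~8 ms to service a page fault    */

      /* effective access time = (1 - p) * mem_ns + p * fault_ns;
         keeping the slowdown to 10% means eat <= 1.1 * mem_ns */
      double p_max = (0.1 * mem_ns) / (fault_ns - mem_ns);

      printf("max page-fault rate p = %g\n", p_max);             /* ~2.5e-6  */
      printf("i.e. one fault per %.0f accesses\n", 1.0 / p_max); /* ~400,000 */
      return 0;
  }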

Page Replacement

Page faults are routine events due to a simple fact: our memory is over-allocated because of the increase in the degree of multiprogramming. Even if each process gets only 20 frames, 100 processes will require 2,000 frames. Swapping partially solves this problem, but again, swapping works at the per-process level. Page replacement addresses the problem at the per-page level.

Page replacement is needed when a new page needs to be loaded but no frames are available. Page replacement starts with finding a victim frame to be written to disk and made available. In this regard, it is very much like a caching algorithm.

Some savings can be obtained by avoiding unnecessary page writes, which are very expensive. Hardware implements a dirty bit for each frame to indicate whether the corresponding page has incurred any changes. Note - the subject here is "frame", not page.
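
A sketch of how the dirty bit pays off during replacement; the frame-table layout and the helper names below are assumptions made for illustration.

  #include <stdio.h>

  /* One entry of a per-frame table; hardware sets the dirty bit on any
     write to the frame (names here are illustrative). */
  struct frame {
      int page;       /* virtual page currently held in this frame */
      int dirty;      /* modified since it was loaded?             */
  };

  static void write_to_swap(int page)    /* stand-in for the disk write */
  {
      printf("writing page %d back to swap\n", page);
  }

  /* Free a victim frame: pay for the disk write only if it is dirty. */
  static void free_victim(struct frame *victim)
  {
      if (victim->dirty)
          write_to_swap(victim->page);
      victim->dirty = 0;
      victim->page  = -1;               /* frame is now free for reuse */
  }

  int main(void)
  {
      struct frame clean = { 7, 0 }, modified = { 9, 1 };
      free_victim(&clean);              /* no write-back needed        */
      free_victim(&modified);           /* incurs the expensive write  */
      return 0;
  }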

There are many potential algorithms for page replacement. How do we choose? In the same manner as we chose CPU scheduling algorithms. The simplest one is FIFO, which unfortunately suffers from Belady's anomaly. The optimal page replacement algorithm is also quite simple in concept - replace the page that will not be used for the longest period of time. You got it, this requires that we know the future; kinda hard to implement.
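
A small simulation makes Belady's anomaly easy to see. The reference string below is a classic textbook example (not from these notes) for which FIFO faults more often with 4 frames than with 3; the code is a sketch, not part of any assignment.

  /* Count FIFO page faults for a reference string and a frame count. */
  #include <stdio.h>

  static int fifo_faults(const int *refs, int n, int nframes)
  {
      int frames[16];                 /* resident pages, -1 = empty */
      int next = 0, faults = 0;       /* next = oldest (FIFO) slot  */

      for (int f = 0; f < nframes; f++)
          frames[f] = -1;

      for (int i = 0; i < n; i++) {
          int hit = 0;
          for (int f = 0; f < nframes; f++)
              if (frames[f] == refs[i])
                  hit = 1;
          if (!hit) {                 /* fault: replace the oldest page */
              frames[next] = refs[i];
              next = (next + 1) % nframes;
              faults++;
          }
      }
      return faults;
  }

  int main(void)
  {
      int refs[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
      int n = sizeof(refs) / sizeof(refs[0]);

      printf("3 frames: %d faults\n", fifo_faults(refs, n, 3));   /* 9  */
      printf("4 frames: %d faults\n", fifo_faults(refs, n, 4));   /* 10 */
      return 0;
  }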

LRU is a practical algorithm that's free from Belady's anomaly: with LRU, the page fault rate when you have N+1 frames is never higher than when you have only N frames. Let's also go over how to implement an LRU in class - this is not in the textbook.
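
One straightforward way to implement LRU is to stamp each frame with a logical clock on every reference and evict the frame with the oldest stamp. The sketch below uses that timestamp approach with made-up names; it is one possible implementation, not necessarily the one we will build in class.

  /* LRU via per-frame timestamps: evict the least recently used page. */
  #include <stdio.h>

  #define NFRAMES 3

  static int  page_in[NFRAMES];    /* page held by each frame, -1 = empty */
  static long last_use[NFRAMES];   /* logical time of last reference      */
  static long clock_now = 0;

  static int lru_reference(int page)      /* returns 1 on a page fault */
  {
      clock_now++;
      int victim = 0;
      for (int f = 0; f < NFRAMES; f++) {
          if (page_in[f] == page) {       /* hit: refresh the timestamp */
              last_use[f] = clock_now;
              return 0;
          }
          if (last_use[f] < last_use[victim])
              victim = f;                 /* remember the oldest frame  */
      }
      page_in[victim]  = page;            /* fault: evict the LRU frame */
      last_use[victim] = clock_now;
      return 1;
  }

  int main(void)
  {
      int refs[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
      int faults = 0;

      for (int f = 0; f < NFRAMES; f++)
          page_in[f] = -1;
      for (int i = 0; i < (int)(sizeof(refs) / sizeof(refs[0])); i++)
          faults += lru_reference(refs[i]);
      printf("LRU, %d frames: %d faults\n", NFRAMES, faults);    /* 10 */
      return 0;
  }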

Misc

Allocation of frames is also important. We should understand how many frames are needed at a minimum. For example, on an architecture that allows a "load" from one memory location with one level of indirection, you need at least 3 frames in memory at the same time: one for the instruction, one for the word holding the indirect address, and one for the target data. You can't do this with 2 frames without incurring a page fault. On a machine where one instruction can have two indirectly referenced operands (the PDP-11), this means needing at least 6 frames for one instruction.

Global replacement algorithms look for victim frames globally in the whole physical memory space. Local replacement methods only consider the process's own frames.

Thrashing - a process is spending more time paging than executing.


Jian Huang / EECS / UTK / revised 01/2012