PortOS Project 6

File System

Overview

For the final project, you will implement a virtual file system to work with your minithreads package. We have provided a block-based disk interface in disk.h, which simulates a disk by translating block reads and writes to accesses to a single Windows NT file.

You should implement a hierarchical, Unix-like file system on top of the disk emulator that supports the following operations:

minifile_t minifile_creat(char *filename)
minifile_t minifile_open(char *filename, char *mode)
int minifile_read(minifile_t file, char *data, int maxlen)
int minifile_write(minifile_t file, char *data, int len)
int minifile_close(minifile_t)
int minifile_unlink(char *filename) (delete file)
int minifile_mkdir(char *dirname)
int minifile_rmdir(char *dirname)
int minifile_stat(char *path) 
int minifile_cd(char *path) 
char **minifile_ls(char *path) 
char *minifile_pwd()

To get an idea of what these functions should do, look them up (omitting the "minifile_" prefix) via man on a Unix system, or in the Visual Studio help. The only exception is that our minifile_open takes arguments like the fopen call, instead of open. Don't worry about reporting detailed error codes when something goes wrong; returning -1 from the function (or some other appropriate error value) is good enough. Your file system should support variable-sized files via a Unix-like inode mechanism, and should reuse blocks from unlinked (deleted) files. It is vital that your file system has concurrency control, so that it can cope with simultaneous accesses by multiple threads.

If you stick to the interface above, you should be able to compile and run the shell program included with this version of the code. You should then be able to create files and directories from the command line.

The Details

The disk simulator is relatively straightforward. To create a disk, call the disk_create() function and provide a name for your disk. You can also specify disk flags to control disk behavior, and give a maximum size for the disk. Use disk_startup() to spin up the disk you have created.

To begin using the disk, just issue disk requests through the disk_send_request() function. The format of the requests is shown in disk.h. When (and if) requests complete, the disk controller signals the completion by raising an interrupt. As with previous assignments, you need to write an interrupt handler that will handle these interrupts appropriately.

Recall from the course discussion that the disk controller may reorder your requests arbitrarily. In fact, an efficient controller will reorder requests quite aggressively. Consequently, if you have a series of blocks that need to be written in a well-defined order (e.g., block A before B before C), then your file-system code must not issue the request for B until the request for A has completed.
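The ordering rule above can be illustrated with a small mock. This is only a sketch: mock_disk_t and its functions are hypothetical stand-ins, not the interface in disk.h. The mock completes queued requests in reverse order to model an aggressively reordering controller, and mock_disk_drain() stands in for blocking until the completion interrupt arrives (in a real implementation, probably a semaphore that your interrupt handler signals).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock of a reordering controller; the real interface is in disk.h. */
#define MOCK_MAX_REQUESTS 8

typedef struct {
    int pending[MOCK_MAX_REQUESTS];    /* block numbers queued but not yet written */
    size_t npending;
    int completed[MOCK_MAX_REQUESTS];  /* order in which blocks actually hit the disk */
    size_t ncompleted;
} mock_disk_t;

/* Queue a write request. Nothing is written until mock_disk_drain() runs. */
static void mock_disk_send_write(mock_disk_t *d, int blocknum) {
    assert(d->npending < MOCK_MAX_REQUESTS);
    d->pending[d->npending++] = blocknum;
}

/* Complete every queued request in REVERSE order, to model reordering. */
static void mock_disk_drain(mock_disk_t *d) {
    while (d->npending > 0) {
        assert(d->ncompleted < MOCK_MAX_REQUESTS);
        d->completed[d->ncompleted++] = d->pending[--d->npending];
    }
}

/* Ordered write: never queue the next block until the previous one is done. */
static void write_in_order(mock_disk_t *d, const int *blocks, size_t n) {
    for (size_t i = 0; i < n; i++) {
        mock_disk_send_write(d, blocks[i]);
        mock_disk_drain(d);  /* stand-in for waiting on the completion interrupt */
    }
}
```

Queuing blocks 10, 11, 12 back to back and then draining lets the mock write them as 12, 11, 10, whereas write_in_order() forces 10, 11, 12 at the cost of one round trip per block.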

Make sure to test your code extensively. Simple sequential tasks, such as creating files, creating directories, removing directories, etc. should be easy. But you should also test your code with concurrent accesses as well as failure cases, e.g. five threads are concurrently writing to the disk when a system crash occurs (someone presses control-c). The file-system should not be left in an inconsistent state. You can set the failure rates to non-zero values to have the disk controller experience such errors occasionally, just like a real disk.

Note that not all of the functions are file operations. For instance, the concept of the "current working directory" is not a global abstraction that applies to the file-system, but a piece of local state kept with each process (minithread). Similarly, minifile_pwd() returns the path to the current working directory associated with the calling thread. Your file-system should NOT have any such state as global variables shared across independent processes.
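One possible way to keep the working directory per thread is sketched below. The names thread_fs_state_t, ROOT_INODE, and lookup_start_inode are assumptions made for illustration; where the state actually lives (for example, inside the minithread control block) is up to your design.

```c
#include <assert.h>
#include <stddef.h>

#define ROOT_INODE 1  /* assumption: inode number chosen for "/" */

/* Hypothetical per-thread file-system state, one instance per minithread. */
typedef struct {
    int cwd_inode;  /* inode of this thread's current working directory */
} thread_fs_state_t;

/* A new thread starts in the root directory. */
static void thread_fs_state_init(thread_fs_state_t *s) {
    assert(s != NULL);
    s->cwd_inode = ROOT_INODE;
}

/* Choose the inode where the lookup of `path` begins for this thread:
 * absolute paths start at the root, relative paths at the thread's own cwd. */
static int lookup_start_inode(const thread_fs_state_t *s, const char *path) {
    assert(s != NULL && path != NULL);
    return (path[0] == '/') ? ROOT_INODE : s->cwd_inode;
}
```

Because each thread carries its own thread_fs_state_t, a minifile_cd in one thread cannot change how another thread resolves relative paths.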

Since you are not asked to support mount points, there should be only one file-system in your implementation. This unique file-system should reside on a virtual disk that uses the file MINIFILESYSTEM in the current directory. Make sure you write a C program, called mkfs.exe, that creates an empty file-system in this virtual disk. This initial file-system created by mkfs.exe should contain only one directory (the root directory) with no entries.

Applications may access the filesystem concurrently. There are a few reasonable approaches to how your FS can deal with open/read/write/close concurrency:

Approach 1 (Unix semantics): You allow multiple applications to open the same file for writing concurrently, and to issue concurrent write requests to the same file. You invoke the end-to-end argument and say that multiple applications performing concurrent accesses to the disk need to coordinate among themselves if they want to retain application-level guarantees of atomicity, and that the FS is not the appropriate subsystem in which to enforce synchronization constraints. Therefore, your FS can leave aside any guarantee of integrity of file contents when the file is accessed concurrently. Specifically, if thread A writes block 0 with data A0 and block 1 with data A1, and thread B writes the same blocks with data B0 and B1, the file may end up with "A0, A1", or "A0, B1" or "B0, A1" or "B0, B1", all depending on who performed the last write to each block.
This approach is quite simple to implement. But if you choose this route, you must still guarantee the integrity of the filesystem itself. For instance, concurrent accesses should not cause your FS to generate orphaned data blocks (as would happen when a naive write() implementation that overwrites an existing inode is used concurrently by multiple threads).
Approach 2: You implement multiple-readers/single-writer concurrency semantics at the data block level. Multiple readers and writers may open a common file. The "open for write" function should not block the calling application, i.e. there may be multiple applications with writable file handles to the same file. The readers/writers lock is enforced at the granularity of individual read and write calls. Multiple writers are not allowed to issue concurrent disk requests, while multiple readers are allowed. If you use this approach, and we have the previous example with threads A and B, each of which writes two blocks to the disk with a single call to write(), the file can only contain "A0, A1" or "B0, B1". Readers/writers thus provide multi-block atomicity guarantees that the previous approach does not.
Approach 3 (Windows semantics): Your FS does not permit a file to be opened for writing while another user has it open for reading or writing. This is quite restrictive. Applications may hold files open for long periods of time, which keeps other applications from making progress.
Unfortunately, a misunderstanding led some groups to repeat Windows' mistakes and implement this approach. As a result, we will not penalize this approach this semester (Spring 2003), but you should realize that the utility of such a filesystem is more limited than that of the previous two approaches. Windows got it wrong, and the inability to open a file for writing just because some other application has it open is a needless, infuriating limitation.

Similarly, your FS must define a reasonable set of semantics when multiple applications concurrently access the file and delete or rename it at the same time. For example, suppose a thread is writing to a file when another thread unlinks that file. Once again, you have a few implementation options, with different degrees of desirability:

Windows semantics: The delete fails. This is an annoyance for millions of users worldwide.
Unix semantics: The file is made unreachable (i.e. it is taken out of the directory hierarchy) at the time the unlink is issued. However, the inode and the file's data blocks are not placed back on the free list until all applications that have an open file descriptor to that file close their descriptors. This permits applications to continue working on a file which has been deleted, and permits concurrent applications to make forward progress. It can be implemented with a simple reference count of open handles to each file. Make sure that the last person who closes the deleted file places the (now-deleted) blocks back on the free list. Any changes made to the file are lost if the file got deleted while it was being modified.
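The reference-counting scheme described under Unix semantics might look roughly like the sketch below. The open_inode_t record and its functions are hypothetical, release_blocks() is a placeholder for returning the inode and data blocks to the free list, and a real implementation would have to perform each of these steps under a lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical in-memory record kept for a file while it is in use. */
typedef struct {
    int  open_count;    /* number of open handles to this file */
    bool unlinked;      /* name already removed from the directory hierarchy */
    bool blocks_freed;  /* set once inode and data blocks are back on the free list */
} open_inode_t;

/* Placeholder: a real version would walk the inode and free its blocks. */
static void release_blocks(open_inode_t *ino) {
    ino->blocks_freed = true;
}

/* An unlinked file is unreachable by name, so it cannot be opened again. */
static void inode_open(open_inode_t *ino) {
    assert(!ino->unlinked);
    ino->open_count++;
}

/* Unlink removes the name now; blocks are freed now only if nobody has it open. */
static void inode_unlink(open_inode_t *ino) {
    ino->unlinked = true;
    if (ino->open_count == 0)
        release_blocks(ino);
}

/* The last close of an unlinked file is what finally frees its blocks. */
static void inode_close(open_inode_t *ino) {
    assert(ino->open_count > 0);
    ino->open_count--;
    if (ino->open_count == 0 && ino->unlinked)
        release_blocks(ino);
}
```

With two handles open, an unlink followed by one close leaves the blocks allocated; only the second close releases them.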

We will favor Unix semantics in our grading.

It goes without saying that a system crash in response to concurrent operations must be avoided at all costs.

General guidelines for file system design:

Disk organization: Use the first block from the disk as the superblock followed by the disk blocks that contain the inodes (about 10 percent of the disk) and the rest for data blocks.
Superblock: The superblock contains global information about the file system. You can store in the superblock things like the first free inode (if the free inodes are organized in a linked list), the first free data block (more about this later), the pointer to the root inode (the entry point into the file-system), the total number of inodes and blocks, the overall size of the file-system, and anything else you consider useful. Also, put a predefined number (called a magic number) in the first 4 bytes; it helps you determine whether or not a disk holds a legitimate file system.
Inodes: Inodes should contain information about the file or directory they represent. Any relevant information about the file except the name should be kept in its inode. This might include: type (file or directory), size, NEXT_FREE_INODE (to maintain the list of free inodes), etc. The inode also records which data blocks are used by the file. You have to be able to address 11 direct blocks and one indirect block (which contains pointers to other disk blocks). You can find more information about direct and indirect blocks in your textbook.
Directories: Directories are slightly special files that contain the mapping between file or directory names and inode numbers (or pointers). The only differences between a directory and a file are: directories cannot be deleted with unlink and can be deleted only if they are empty; they have a fixed format (for example, you can use an ASCII representation with the inode number and the file name on the same line, separated by a tab, or a binary representation; just make sure the file name can be at least 256 characters); and they have the type set to DIRECTORY. You can perform a linear search to find the inode number that corresponds to a particular file.
Paths: Paths have the general form /dir1/dir2/.../dirn/filename. The first / means the root of the file system.
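The layout guidelines above could translate into structures along the following lines. This is a sketch under assumptions, not a required format: BLOCK_SIZE (the real block size comes from disk.h), FS_MAGIC, the field names, and the helper functions are all made up for illustration, and block pointers are assumed to be 32-bit block numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE     4096u        /* assumption: take the real value from disk.h */
#define FS_MAGIC       0x4d494e49u  /* arbitrary example magic number */
#define DIRECT_BLOCKS  11u
#define PTRS_PER_BLOCK (BLOCK_SIZE / sizeof(uint32_t))

/* Block 0 of the disk. The magic number occupies the first 4 bytes. */
typedef struct {
    uint32_t magic;
    uint32_t total_blocks;
    uint32_t inode_blocks;      /* about 10 percent of the disk */
    uint32_t first_data_block;  /* data region starts after the inode region */
    uint32_t first_free_inode;
    uint32_t first_free_block;
    uint32_t root_inode;
} superblock_t;

/* One inode: everything about a file except its name. */
typedef struct {
    uint32_t type;                   /* file or directory */
    uint32_t size;                   /* in bytes */
    uint32_t next_free_inode;        /* links the free-inode list */
    uint32_t direct[DIRECT_BLOCKS];  /* first 11 data blocks */
    uint32_t indirect;               /* block holding further block pointers */
} inode_t;

/* Fill in the region boundaries for a disk of `total_blocks` blocks. */
static void layout_init(superblock_t *sb, uint32_t total_blocks) {
    assert(sb != NULL && total_blocks >= 3);
    sb->magic = FS_MAGIC;
    sb->total_blocks = total_blocks;
    sb->inode_blocks = (total_blocks / 10 > 0) ? total_blocks / 10 : 1;
    sb->first_data_block = 1 + sb->inode_blocks;  /* block 0 is the superblock */
    sb->first_free_inode = 0;  /* free lists are left for mkfs to build */
    sb->first_free_block = 0;
    sb->root_inode = 0;
}

/* Map logical block `n` of a file to a slot.
 * Returns 0 if the slot is in inode.direct, 1 if it is inside the indirect
 * block, and -1 if `n` is beyond the largest file this layout supports. */
static int locate_block(uint32_t n, uint32_t *slot) {
    assert(slot != NULL);
    if (n < DIRECT_BLOCKS) { *slot = n; return 0; }
    n -= DIRECT_BLOCKS;
    if (n < PTRS_PER_BLOCK) { *slot = n; return 1; }
    return -1;
}

/* Largest file size in bytes: (11 direct + pointers in one block) blocks. */
static uint32_t max_file_size(void) {
    return (uint32_t)((DIRECT_BLOCKS + PTRS_PER_BLOCK) * BLOCK_SIZE);
}
```

Under these assumed sizes, one indirect block holds 1024 pointers, so a file spans at most 1035 blocks.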

How to Get Started

Make a backup copy of your code from the previous project, download the code for this project (portos6.zip), and merge the two.

We've included a simple command shell in shell.c, which you can link against your minifile implementation. It should enable you to test your code from the command line.

Submissions

Consult the submission guidelines to find out how to submit your work.

For the Adventurous

Note: These suggestions for an extra challenge will be examined but not graded. They will have no impact on the class grades. They are here to provide some direction to those who finish their assignments early and are looking for a way to impress friends and family.

Implement hard links to files. The Unix delete operation is called "unlink" because the use of inodes for storing file information, separate from the directory hierarchy, allows a file to have multiple names, even within the same directory. Every "name" simply points to the one inode for the file. To add an additional directory entry for a file (i.e. give it another name), the "link" system call is used. Unlink is the opposite of link: it removes a name. The implementation is complicated by the fact that you need to keep track of how many links exist to a file, so that you know when to remove it completely.

Final Word

If you need help with any part of the assignment, we are here to help. You may also find the FAQ useful.

Emin Gün Sirer, May 2002