Due March 31.
This is the first half of the semester project, in which you will develop a basic but complete implementation of your Sparse Petabyte File System (SPBFS). This phase has several steps, for which I provide nominal due dates. You should treat these due dates as genuine, because they keep you on track. March 31, however, is the date by which you are expected to submit a working implementation that meets the requirements described below.
When submission is closed, I will release a working reference implementation. You may then proceed with the rest of the semester project based on your own code or the reference implementation, whichever you find easier. Obviously, you will understand your own code best, but you might find it easier not to have to worry about your own bugs. You must turn in a complete original submission, no matter how buggy, to be allowed to switch to the reference implementation.
Regarding threading: FUSE can operate in two modes. In multi-threaded mode, the default, FUSE will spawn a pool of worker threads to handle incoming requests. It will do this totally transparently, on your behalf; the only thing you need to know is that your functions must be re-entrant. In order to avoid this concern for the first half of the project, be sure to run your server in the other, single-threaded mode. You do this by passing your server the -s option. For instance, instead of mounting your server with ./server mnt, run ./server -s mnt.
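If you would rather not rely on remembering the flag, one option is to have the server add it for you. The following is a minimal sketch, not a required part of the assignment: ensure_single_threaded is a hypothetical helper (it is not a FUSE function) that returns a copy of the argument vector with -s inserted after the program name unless it is already there. It assumes argc is at least 1, as it is for any normally started program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of FUSE. Returns a newly allocated,
 * NULL-terminated argv that contains "-s" right after the program name
 * unless "-s" already appears. The strings themselves are not copied.
 * On success *out_argc receives the new count; returns NULL if malloc fails. */
char **ensure_single_threaded(int argc, char **argv, int *out_argc)
{
    assert(argc >= 1 && argv != NULL && out_argc != NULL);

    /* Look for an existing -s so we never add it twice. */
    int has_s = 0;
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-s") == 0) {
            has_s = 1;
            break;
        }
    }

    int new_argc = argc + (has_s ? 0 : 1);
    char **new_argv = malloc((size_t)(new_argc + 1) * sizeof *new_argv);
    if (new_argv == NULL)
        return NULL;

    int j = 0;
    new_argv[j++] = argv[0];          /* program name stays first */
    if (!has_s)
        new_argv[j++] = "-s";         /* force single-threaded mode */
    for (int i = 1; i < argc; i++)
        new_argv[j++] = argv[i];      /* remaining arguments, in order */
    new_argv[j] = NULL;

    *out_argc = new_argc;
    return new_argv;
}
```

In your main, you would call this on the original argc and argv and hand the result to FUSE's entry point (for example fuse_main) in place of the originals. With this in place, ./server mnt and ./server -s mnt behave the same way.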
Your server must:
Although the complete system is due on March 31, I suggest breaking it down into the following checkpoints. I will check informally in class how many people are on track with this plan.
You might want to take these in a different order. Specifically, you should start with the goals that are likeliest to give you trouble or to fail, so that you can get help early. Thus, if you are most worried about the indexing structure (which is, indeed, the logically interesting portion of the assignment), you might want to get to it before worrying about adding directories; if dealing with FUSE and the complexities of filesystem structure worries you more, you should start there. Order them so that the greatest risk is resolved earliest, while leaving a system to which the rest can be added easily.
Do not concern yourself with efficiency. The point of this phase of the semester-long project is to get a complete, correct, stable implementation. The entire second half of the project will be devoted to tweaking this project with multithreading, synchronization, more advanced indexing, profiling, and optimization. Do not bother with any clever tree rebalancing, resource reclamation, or performance tricks. If you do, not only will this portion of the project be harder for you, but you might also start down a wrong path that will hinder you in the second half. Correctness comes first, then profiling to understand bottlenecks, and only then optimization. Only the first of these concerns you for now.
Remember to run your server single-threaded, by passing it the -s option. Otherwise, you might encounter race conditions when manipulating your indexing structure that will confuse and derail you indefinitely.
Although this project frequently refers to petabytes, you do not actually need to worry about an upper limit on file size. The offsets you are passed are of type off_t, which covers the largest file size the system itself can handle; make sure that you can support its entire range. On a 64-bit machine, you will indeed support petabytes, but you do not need to get caught up in the details of the actual upper limit on any particular system.
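To make the point about off_t concrete, here is a deliberately naive sketch of addressing a sparse file by offset. Everything in it is an assumption made for illustration and is not the indexing structure the assignment asks for: the 4096-byte BLOCK_SIZE, the linked list of blocks, the helper names, and the convention that bytes never written read back as zero. The only thing it is meant to show is that an offset near one petabyte is handled exactly like a small one, while only the blocks actually written take up memory.

```c
#define _FILE_OFFSET_BITS 64   /* ask glibc for a 64-bit off_t, even on 32-bit builds */

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

#define BLOCK_SIZE 4096        /* assumed block size, not taken from the assignment */

_Static_assert(sizeof(off_t) >= 8, "this sketch assumes a 64-bit off_t");

/* One allocated block in an unsorted linked list (no efficiency intended). */
struct block {
    uint64_t index;                  /* block number = offset / BLOCK_SIZE */
    unsigned char data[BLOCK_SIZE];
    struct block *next;
};

/* Split a non-negative file offset into a block number and a position in it. */
void split_offset(off_t offset, uint64_t *index, size_t *within)
{
    assert(offset >= 0);
    *index = (uint64_t)offset / BLOCK_SIZE;
    *within = (size_t)((uint64_t)offset % BLOCK_SIZE);
}

/* Linear search; returns NULL for a hole, i.e. a block never written. */
struct block *find_block(struct block *head, uint64_t index)
{
    for (struct block *b = head; b != NULL; b = b->next)
        if (b->index == index)
            return b;
    return NULL;
}

/* Store one byte at offset, allocating its block on demand.
 * Returns 0 on success, -1 if memory runs out. */
int write_byte(struct block **head, off_t offset, unsigned char value)
{
    uint64_t index;
    size_t within;
    split_offset(offset, &index, &within);

    struct block *b = find_block(*head, index);
    if (b == NULL) {
        b = calloc(1, sizeof *b);    /* zero-filled, so untouched bytes read as 0 */
        if (b == NULL)
            return -1;
        b->index = index;
        b->next = *head;
        *head = b;
    }
    b->data[within] = value;
    return 0;
}

/* Fetch one byte; offsets that fall in a hole read as zero. */
unsigned char read_byte(struct block *head, off_t offset)
{
    uint64_t index;
    size_t within;
    split_offset(offset, &index, &within);

    struct block *b = find_block(head, index);
    return b != NULL ? b->data[within] : 0;
}
```

Writing a single byte at offset 2^50 + 5 with write_byte allocates exactly one block, numbered 2^38, and nothing for the petabyte of holes before it. The linear search is intentionally slow; at this stage only correctness across the whole off_t range matters.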