Programming Project #2

CMSC 421, Fall 2001

Assigned: 8 Nov 2001
Due: Dec 5 2001 at 11:59 PM

Goals

With project 1 under your belt, you should now be comfortable modifying Linux code in general, and adding system calls to Linux in particular. This document describes new functionality that we want you to add to the Linux filesystem.

Most present-day filesystems store the raw data directly on disk. This means that system administrators can see any data you store. In addition, the security of your data is tied to the security of the system as a whole. If miscreants can hack into the system as superuser, or can defeat the protection mechanisms of the OS, or physically steal the disk, then your data is compromised. One way to avoid this is to store the data on the disk in an encrypted format, with decryption possible only with a key that you possess. This project asks you to create such an encrypted filesystem by layering the encryption/decryption process on top of the existing Linux filesystems.

Mechanics, and what to hand in

The project can be done in groups of up to two people, although you are welcome to work alone. If a group works on a project, then in general we will assign a common score to both participants. Please make sure that the group is identified in the README you turn in, and that only one member of the group submits the project!

The submission instructions are the same as for the first assignment. You will turn in a patch file to the kernel, your test programs, any other changes you made, and a README describing your system. Please remember that the patch file should be generated after you remove any object files by using a make clean or make mrproper. Also, make sure you use the right options to diff. Remember that in addition to -rc options, you need to use -P if you have added any new files to the kernel distribution (which you almost certainly will in this case).

In addition, you will turn in a plain text file reporting your tests, especially measurements of performance. We suggest you measure the time taken to read/write files of different sizes with and without encryption to figure out how much overhead the encryption process causes.
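One way to collect such measurements is a small user-space timing harness. The sketch below is only an illustration, not a required tool: the function name time_file_io and its parameters are hypothetical, and it uses the C11 call timespec_get for wall-clock timestamps. It writes a buffer of a given size to a file, reads it back, and reports the elapsed time of each step.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Seconds elapsed between two timestamps. */
static double seconds_between(struct timespec a, struct timespec b)
{
    return (double)(b.tv_sec - a.tv_sec) + (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

/* Hypothetical helper: write `size` bytes to `path`, read them back, and
 * store the two elapsed times (in seconds) in *write_s and *read_s.
 * Returns 0 on success, -1 on allocation or I/O failure. */
int time_file_io(const char *path, size_t size, double *write_s, double *read_s)
{
    struct timespec t0, t1;
    FILE *f = NULL;
    int rc = -1;
    char *buf = malloc(size ? size : 1);

    if (buf == NULL)
        return -1;
    memset(buf, 'x', size);

    /* Time the write, including fclose() so stdio buffers are flushed. */
    timespec_get(&t0, TIME_UTC);
    f = fopen(path, "wb");
    if (f == NULL)
        goto out;
    if (fwrite(buf, 1, size, f) != size) {
        fclose(f);
        goto out;
    }
    if (fclose(f) != 0)
        goto out;
    timespec_get(&t1, TIME_UTC);
    *write_s = seconds_between(t0, t1);

    /* Time reading the same bytes back. */
    timespec_get(&t0, TIME_UTC);
    f = fopen(path, "rb");
    if (f == NULL)
        goto out;
    if (fread(buf, 1, size, f) != size) {
        fclose(f);
        goto out;
    }
    fclose(f);
    timespec_get(&t1, TIME_UTC);
    *read_s = seconds_between(t0, t1);
    rc = 0;
out:
    free(buf);
    return rc;
}
```

Calling this helper for several sizes, once on an ordinary file and once on an encrypted one, gives pairs of numbers to compare. Keep in mind that a read issued right after a write is probably served from the kernel's buffer cache rather than the disk, so repeating each measurement several times and reporting an average is likely to give more trustworthy figures.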

Your design documentation is due by 11:59 PM on 16 Nov 2001. We are enforcing this deadline to ensure that people don't leave the project until the last minute. You are, of course, welcome to visit either the faculty or TA office hours for help; however, one of the first things we'll ask for is your design documentation (unless you're asking for help with that...). You may make changes to your documentation before the full project hand-in; however, the design portion of your grade will depend heavily on the design document you hand in on November 16th. We will review, grade, and return this to you within a week.

Your design documentation, typically 3-5 pages for a project of this size, should include the basic design of your software (what modules will you write, what is their functionality, where will you make changes to the kernel, etc.), a timeline, as well as details on the testing that you plan to do to ensure that your code works.

The assignment name to use with submit for the documentation is p2doc, and for the project code is p2code.

Specifics

We ask you to implement several new system calls

Helpful Hints

Grading the Project

We suggest that you do the project in two phases. First, just add in the new functions without doing any complex encryption. Use something simple -- we suggest a substitution cipher, with the key indicating a shift. So a key of 4 will mean that A becomes E, B becomes F, ... Z becomes D and so on. Successfully completing this phase will entitle you to 75% of the implementation credit. The remaining 25% will come from using a "real" crypto algorithm such as Blowfish or 3DES.
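As an illustration of the first phase, here is a minimal user-space sketch of the shift cipher described above. The function names shift_encrypt and shift_decrypt are hypothetical, and the sketch assumes that only the ASCII letters A-Z and a-z are shifted while every other byte is left unchanged; your own design may reasonably treat non-letter bytes differently.

```c
#include <stddef.h>

/* Shift one byte by `key` positions within its alphabet, wrapping around.
 * Assumption: bytes that are not ASCII letters pass through unchanged. */
static unsigned char shift_byte(unsigned char c, int key)
{
    key = ((key % 26) + 26) % 26;   /* normalize so negative keys work too */
    if (c >= 'A' && c <= 'Z')
        return (unsigned char)('A' + (c - 'A' + key) % 26);
    if (c >= 'a' && c <= 'z')
        return (unsigned char)('a' + (c - 'a' + key) % 26);
    return c;
}

/* Encrypt `len` bytes of `buf` in place: with key 4, A becomes E and
 * Z wraps around to D. */
void shift_encrypt(unsigned char *buf, size_t len, int key)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = shift_byte(buf[i], key);
}

/* Decrypt in place by shifting in the opposite direction. */
void shift_decrypt(unsigned char *buf, size_t len, int key)
{
    shift_encrypt(buf, len, -key);
}
```

Because the transformation works byte by byte and does not change the length of the data, it can be applied to any portion of a file independently, which keeps the first phase simple before you move on to a block cipher.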

There are several extra credit opportunities available, with the extra credit varying from 5 to 25 percent of the total. For a small amount of extra credit, encrypt not just the contents but also the names of the files. For greater extra credit, integrate your encryption with NFS or AFS, or allow the user to specify not just the encryption key but also the encryption algorithm on a per-file basis. These are just examples -- you can discuss any other ideas you have with Joshi to see if they would be suitable for extra credit.

The intent of the grading for the project is not to differentiate among those students who do a careful design and implementation of the assignments. Rather, the grading helps us identify those students who (i) don't do the assignments or (ii) don't think carefully about the design, and therefore end up with a messy and over-complicated solution. Remember that you can't pass this course without at least making a serious attempt at each of the assignments. Further, the grading is skewed so that you will get substantial credit, even if your implementation doesn't completely work, provided your design is logical and easy to understand. This means that you should first strive to come up with a clean design of your project on paper. Second, don't try to add fancy features because some other group is!

The grading for the project will be as follows: 40% design, 50% implementation, 10% testing. We have structured the grading in this way to encourage you to think through your solution before you start coding, and to realize that testing your implementation is an important part of any software development process. If all you do is work out a detailed design for what you would do to address the assignment (and if the design would work!), but you write no code, you will still get almost half of the credit for the assignment. Conversely, if you implement correctly, but do not prove that by testing your code, you will still not be given complete credit. Tests should convince us of two things: first, that your implementation works, and second, how much overhead it adds to the file operations.

The implementation portion of the grade considers whether you implemented your design and provided documentation that the TA could understand. Part of being a good computer scientist is coming up with simple designs and easy to understand code; a solution which works isn't necessarily the best that you can do. Thus, part of the design and implementation grade will be based on whether your solution is elegant, simple, and easy to understand.

Rules for Collaboration

It is OK for you to discuss general approaches with other groups. It is NOT OK to exchange solutions -- ideas or code. Please recall that academic dishonesty will be sternly dealt with.