Programming Projects: Linux Installation and Modifications

CMSC 421, Fall 2003

Project 2 Now Assigned: Nov 3, 2003

Design Document Due: Nov. 23, 2003; Design Document Due Date Extended to Nov. 26th, 11.59PM

See Project Design Document Guidelines;
Note: If working in a group, BOTH design and final code should be submitted from the same GL account

Changes in Project 2 Requirements (For ALL Sections):


Final Code/Report Due: Dec. 8, 2003; Submission deadline extended to Dec. 9th, Tuesday, 11.59PM

Follow SUBMIT instructions completely and carefully (updated on Dec 8th, 2003).

Goals

The aim of this project is rather simple -- to make you work with a real operating system (in the present instance, Linux), and to understand how to modify and add functionality to it. The project will have three phases, with the final phase due at the end of the semester (just after Thanksgiving break).

Groups: This project can be done in groups of at most TWO members (NO Exceptions). If working in a group, form your group no later than Sep. 18, 2003. Under rare circumstances, your group member may be enrolled in a different section than yours; in that case you must obtain approval from the instructors of both sections.

ITE 240 LAB: The ITE 240 Lab may be used by students enrolled in CMSC 421 for their project work. The lab contains 24 high-end machines, each with dual Pentium 4 processors, 1 GB of RAM, and a CD-RW drive. The 3 TAs will be holding regular weekly Office Hours in the ITE 240 Lab - plan to take full advantage of these. The office hours are listed on the course website and are also posted outside the lab door.

ITE 240 LAB USAGE Policies:

What will you need

Since you will be installing and modifying your own version of the linux kernel, you will need to:

IMPORTANT: START EARLY. These are non-trivial tasks, and the later you start, the smaller your chances of finishing on time.

Basic Phase: Due Sep 28, 2003, 11.59 PM

This project is intended to help you gain experience with obtaining a Linux kernel and installing it. Since most current releases of Linux use the 2.4 kernel series, we will be using RedHat Linux 8.0 (kernel 2.4.18). It is available via FTP from UMBC's Linux Mirror Site at:

ftp://mirrors.umbc.edu/pub/linux/8.0/en/iso/i386/

Download the FIRST TWO ISO files (disc1 and disc2) and burn them onto CD-Rs as ISO images (it might be easier to do this from Windows).

The Linux kernel will be installed on the External USB Hard Drive or your own computer/laptop internal hard-drive (suitably partitioned). For the external USB hard drive option, you will need to create a boot floppy (so, do not throw away those floppy disks yet). If you plan to use the same external hard drive in both the lab and your personal computer, then you might need two separate boot floppy disks and also have two partitions/Linux installations on the external hard-drive: one each for the laboratory use and your PC use.

What to hand in

Submit via the UMBC submit program (Run this from a GL machine or one of the lab machines).


You should submit a SINGLE FILE that contains the following information:

The command to run is specific to your section:

Note: If you are working in a group, then ONLY one student (from either of the two sections) should submit.

No credit is assigned for this phase -- it is simply catch-up time for those in the class not yet familiar with installing, partitioning, dual booting, etc. You may want to use the local Linux Users Group as a resource and perhaps join their installfest.

Helpful Links


Project 1, Due Oct. 26, 2003, 11.59 PM

Assigned: 30 Sep. 2003

We assume that you have installed the required kernel. This document will describe a new function that we want you to add to the kernel as a system call.

GOALS

To get you comfortable with adding system calls to the kernel. At the end of this project you will be comfortable with perusing the Linux sources and modifying them. While the source code you produce will not itself be voluminous, you will find that you have to spend long hours looking at various source and .h files.

DESCRIPTION

We assume that you have installed the 2.4.18 kernel. This document will describe a new function that we want you to add to the kernel as a system call. The exercise is fairly straightforward, and you'll add no more than 50 lines of code, headers, etc. -- probably fewer. The idea is to make sure you understand the mechanics of modifying the kernel. We assume that you are already familiar with makefiles and debugging from classes such as CMSC 341. If not, this will be a considerably more difficult project because you will have to learn to use these tools as well.

Helpful Hints

WHAT TO HAND IN

There are two steps to what you will hand in - chronologically separated by one week:

1.      Design documentation - due on or before 11:59 PM on 19 Oct 2003.

2.      Source code and documentation - due on or before 11:59 PM on 26 Oct 2003.

Design documentation: We are enforcing this deadline to ensure that people don't leave the project until the last minute. You are, of course, welcome to visit either the faculty or TA office hours for help; however, one of the first things we'll ask for is your design documentation (unless you're asking for help with that...). You may make changes to your documentation before the due date for the source code (Oct 26th); however, the design portion of your grade will depend heavily on the design document you hand in on Oct 19th.

Your design documentation, typically 1-2 pages for a project of this size, should include the basic design of your software (what modules will you write, where will you make changes to the kernel etc.), a timeline, as well as details on the testing that you plan to do to ensure that your code works. The (section dependent) submit commands are:

Source code and documentation: You will need to hand in all of your code and documentation using the submit programs available on the GL cluster. (Note: the list below was written up after the due date of Project 1 and does not apply to Fall 2003; it is for future offerings.) In particular, hand in a SINGLE tar file (run man tar for information on the 'tar' command) that contains the following items:

  1. The tar/gzipped version of a patch to your modified kernel
  2. The source code file(s) that implements the new system calls (this could be a separate file or part of an existing kernel file)
  3. The updated entry.S, unistd.h and kernel Makefile
  4. The driver program (any .c, .cc and .h files) and any Makefile used
  5. Sample Output showing the execution of your driver program (this can be obtained by running the 'script' command and submitting the file 'typescript' generated as output by script).
  6. A README file (explaining how to compile/run your program) - Include any "gotchas" with your program that you are aware of. If the programs won't compile correctly, be honest and say so here.
  7. A COMMENTS file (describing your experience with the project and any suggestions/feedback to the instructors).
For submit, the commands will be different by section as shown below:

 


Final Phase

Assigned: 3 Nov 2003
Design Document Due: Nov 23, 2003 at 11:59PM; Design Document Due Date Extended to Nov. 26th, 11.59PM
Design Document Guidelines
Final Code/Results Due: Dec 8, 2003 at 11:59 PM; Submission deadline extended to Dec. 9th, Tuesday, 11.59PM

Goals

With project 1 completed, you should now be comfortable modifying linux code in general, and adding system calls to linux in particular. This document describes new functionality that you will add to the linux filesystem.

Most present-day filesystems store the raw data directly on disk. This means that system administrators can see any data you store. In addition, the security of your data is tied to the security of the system as a whole. If miscreants can hack into the system as superuser, or can defeat the protection mechanisms of the OS, or physically steal the disk, then your data is compromised. One way to avoid this is to store the data on the disk in an encrypted format, with decryption possible only with a key that you possess. This project asks you to create such an encrypted filesystem by layering the encryption/decryption process on top of the existing Linux filesystems.

Specifics

The above can be compared to Netscape's (or other browsers') Password Manager function that lets you store login/passwords for various different sites and uses a master password to encrypt the passwords file.

For the project, you have to implement several new system calls:

For each user, a Logfile is maintained in the user's home directory. This file keeps track of all activities - file creation, open, read, write, close, deletion. The log information for each access should include the username, file name, the action (create, open, read, write, close, delete), any key failures, and the timestamp.

The logging is accomplished as follows: Within the implementation of each of the above system calls, invoke fork() or pthread_create() and let the child process take care of storing the log information. The parent process may or may not wait until the child process terminates (i.e. successfully logs) - you decide.

Once you implement the above system calls, you will write a set of sample driver programs (at least 2) that invoke the above calls for a different set of files. Assume that all the files you test with reside in the user's home directory. Here is a set of suggested driver programs:

You can be more creative and write other/better driver programs.
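The logging scheme described above (hand the log record to a child and let the parent decide whether to wait) can be prototyped in user space before moving it into the kernel. The following is only a sketch under stated assumptions: the function name log_access, its parameters, and the record layout are hypothetical and are not specified by this handout. It also uses the libc fork() wrapper, which kernel code cannot call directly, so the in-kernel mechanism you choose will look different.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: append one log record to log_path from a child
 * process. Returns 0 on success, -1 on failure. If wait_for_child is 0,
 * the parent returns immediately and does not learn whether the write
 * succeeded (the caller is then responsible for reaping the child). */
int log_access(const char *log_path, const char *user, const char *file,
               const char *action, int key_failed, int wait_for_child)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                      /* child: write the record, then exit */
        FILE *fp = fopen(log_path, "a");
        if (fp == NULL)
            _exit(1);
        /* Assumed record layout: timestamp user file action key_failure=N */
        fprintf(fp, "%ld %s %s %s key_failure=%d\n",
                (long)time(NULL), user, file, action, key_failed);
        fclose(fp);
        _exit(0);                        /* _exit avoids re-flushing parent buffers */
    }

    if (!wait_for_child)
        return 0;                        /* parent continues without waiting */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A driver could call, for example, log_access("Logfile", "alice", "notes.txt", "open", 0, 1) after each operation. Waiting makes log failures visible to the caller at the cost of extra latency on every file operation; that trade-off is the design decision the handout leaves to you.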

Mechanics, and what to hand in

If a group works on a project, then in general we will assign a common score to both participants. If your group did not work out well for the Project 1 phase, then you are free to work independently. Please make sure that the group is identified in the README you turn in, and that only one member of the group submits the project!

There are two steps to what you will hand in:
  1. The Design Documentation is due by 11.59PM on Nov. 23, 2003. We are enforcing this deadline to ensure that people don't leave the project until the last minute. You are, of course, welcome to visit either the faculty or TA office hours for help; however, one of the first things we'll ask for is your design documentation (unless you're asking for help with that...). You may make changes to your documentation before the full Project hand-in; however, the design portion of your grade will depend heavily on the design document you hand in on November 23rd.

    Your design documentation, typically 3-5 pages for a project of this size, should include the basic design of your software (the modules that you will write, their functionality and rough pseudo-code, where you will make changes to the kernel, etc.), a timeline, as well as details on the testing that you plan to do to ensure that your code works.

    Submit using the online submit command (class section-specific as before) and name your file Project2-Design.txt or Project2-Design.pdf - NO OTHER FORMATS Please.

  2. The Final Code and Results are due by 11.59PM on Dec. 8 (MONDAY), 2003 (extended to Dec 9th, Tuesday, 11.59PM).
    Submit a single TAR and gzipped file that contains:
    1. The patch file. Also, try the patch commands way ahead of submission deadlines and let us know of any problems. Make sure that you have a backup copy of ALL your modified kernel source files before you use the diff command.
      Create a patch file using diff command:

      diff -crP /usr/src/linux /usr/src/mylinux > /tmp/mypatch.diff

      where mylinux is *YOUR* project directory for the second project. Please remember that the patch file should be generated after you remove any object files by using a make clean or make mrproper. Also, make sure you use the right options to diff.

    2. Create a new folder and copy ALL the kernel-level .c and .h files that you created/modified for this project and tar, gzip to create Project2-Source.tar.gz
      This file is a backup to your patch file.
    3. The driver programs (any .c, .cc and .h files)
    4. The encrypted File_of_Keys and File_SuperKey files used in your testing and the super password (listed in the README file).
    5. The input files used for the driver programs.
    6. Sample Output showing the execution of your driver program (this can be obtained by running the 'script' command and submitting the file 'typescript' generated as output by script.)
    7. Performance Study Report: A 1-2 page report reporting your tests, including performance measurements. We suggest you measure the time taken to read/write files of different sizes (at least 4 sizes ranging from 10 Kbytes to 1 Mbyte) with and without encryption to figure out how much overhead the encryption process causes. Present the numerical results in neat graph or tabular forms.
    8. A README file (explaining how to compile/run your program) - Include any "gotchas" with your program that you are aware of. If the programs won't compile correctly, be honest and say so here.
    9. A COMMENTS file (describing your experience with the project and any suggestions/feedback to the instructors).
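For the Performance Study Report (item 7 above), one way to collect numbers is to time the same write at several file sizes, once through the plain path and once through your encrypted path. The sketch below is only an illustration under stated assumptions: time_write and print_write_timings are hypothetical names, the four sizes are one possible choice within the 10 Kbyte to 1 Mbyte range, and ordinary stdio calls stand in for whichever calls you are actually measuring.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical helper: write `size` bytes to `path` and return the elapsed
 * wall-clock time in seconds, or -1.0 on error. */
double time_write(const char *path, size_t size)
{
    char *buf = malloc(size);
    if (buf == NULL)
        return -1.0;
    memset(buf, 'x', size);

    struct timeval t0, t1;
    gettimeofday(&t0, NULL);
    FILE *fp = fopen(path, "wb");
    if (fp == NULL) {
        free(buf);
        return -1.0;
    }
    size_t written = fwrite(buf, 1, size, fp);
    int close_rc = fclose(fp);           /* flushes stdio buffers to the OS */
    gettimeofday(&t1, NULL);
    free(buf);

    if (written != size || close_rc != 0)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_usec - t0.tv_usec) / 1e6;
}

/* Print one row per file size; the sizes are an assumed example set. */
void print_write_timings(const char *path)
{
    const size_t sizes[] = { 10 * 1024, 100 * 1024, 500 * 1024, 1024 * 1024 };
    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++)
        printf("%8lu bytes  %.6f s\n",
               (unsigned long)sizes[i], time_write(path, sizes[i]));
}
```

Single measurements at these sizes are likely to be noisy and dominated by caching, so repeating each measurement several times and reporting an average is probably worthwhile. Run the identical loop with and without encryption so that the difference between the two tables is the overhead you report.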

Helpful Hints

Grading the Project

The grading for the project will be as follows: 40% design, 50% implementation, 10% testing. We have structured the grading in this way to encourage you to think through your solution before you start coding, and realize that testing your implementation is an important part of any software development process. If all you do is to work out a detailed design for what you would do to address the assignment (and if the design would work!), but you write no code, you will still get almost half of the credit for the assignment. Conversely, if you implement correctly, but do not prove that by testing your code, you will still not be given complete credit. Tests should convince us of two things -- firstly that your implementation works and secondly how much overhead it adds to the file operations.

The implementation portion of the grade considers whether you implemented your design and provided documentation that the TA could understand. Part of being a good computer scientist is coming up with simple designs and easy to understand code; a solution which works isn't necessarily the best that you can do. Thus, part of the design and implementation grade will be based on whether your solution is elegant, simple, and easy to understand.

We suggest that you do the project in two phases. First, just add in the new functions without doing any complex encryption. Use something simple -- we suggest a substitution cipher, with the key indicating a shift. So a key of 4 will mean that A becomes E, B becomes F, ... Z becomes D and so on. Successfully completing this phase will entitle you to 75% of the implementation credit. The remaining 25% will come from using a "real" crypto algorithm such as Blowfish or 3DES.
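The shift cipher suggested for the first phase can be sketched in a few lines. This is a minimal illustration, not a required interface: the name shift_buffer and its signature are hypothetical. It shifts only ASCII letters, matching the A-to-E, Z-to-D example above, and leaves every other byte unchanged; a variant that shifts all bytes modulo 256 would be another reasonable reading of the suggestion.

```c
#include <stddef.h>

/* Hypothetical helper: shift each ASCII letter in buf by `key` positions,
 * wrapping within its own case. Non-letter bytes are left unchanged.
 * Calling it again with -key undoes the shift. */
void shift_buffer(char *buf, size_t len, int key)
{
    int k = ((key % 26) + 26) % 26;      /* normalize so negative keys work */
    for (size_t i = 0; i < len; i++) {
        char c = buf[i];
        if (c >= 'A' && c <= 'Z')
            buf[i] = (char)('A' + (c - 'A' + k) % 26);
        else if (c >= 'a' && c <= 'z')
            buf[i] = (char)('a' + (c - 'a' + k) % 26);
    }
}
```

With a key of 4, the buffer "ABZ" becomes "EFD", and applying the same function with a key of -4 restores the original. Because the transformation works in place and preserves length, it can be applied to a buffer just before it is written and just after it is read without changing file sizes or offsets.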

There are several extra credit opportunities available, with the extra credit varying from 5 to 25 percent of the total. For a small amount of extra credit, encrypt not just the contents but even the names of the files. For more extra credit, allow the user to specify not just the encryption key but also the encryption algorithm on a per file basis.

The intent of the grading for the project is not to differentiate among those students who do a careful design and implementation of the assignments. Rather, the grading helps us identify those students who (i) don't do the assignments or (ii) don't think carefully about the design, and therefore end up with a messy and over-complicated solution. Remember that you can't pass this course without at least making a serious attempt at each of the assignments. Further, the grading is skewed so that you will get substantial credit, even if your implementation doesn't completely work, provided your design is logical and easy to understand. This means that you should first strive to come up with a clean design of your project on paper. Second, don't try to add fancy features because some other group is doing so!

Rules for Collaboration

It is OK for you to discuss general approaches with other groups. It is NOT OK to exchange solutions -- ideas or code. Please recall that academic dishonesty will be sternly dealt with.

Good Luck and Have Fun!