CMSC 421 - Project 2

April 16, 2019: Added a few resources and clarified that paths passed to the block file system call will be absolute paths

Introduction/Objectives

In this project, you will create a new version of the Linux kernel that adds a few security features to the system to help protect it from external network-based attacks and to help protect against unwanted access to files by users of the system. These features will be implemented by way of modifying existing kernel code to hook into your code, as well as system calls that will be used to set up and maintain your new security system. You will also be responsible for planning out your system's design and producing a system design document to accompany your final project submission.

To implement the functionality described in this project, you will need to find the code in the kernel responsible for handling both network-related and filesystem-related system calls and modify it. You may not use iptables, netfilter, SELinux policies, the kernel audit framework, or other similar technologies to implement the functionality of the project as described. Please do not try. Using any of these will result in heavy point reductions for not actually implementing the functionality asked for.

This project will require a significant time commitment to complete successfully. We have included two milestones along the way to help ensure that you are devoting time to this project early and often.

Before you begin, be sure to create your new GitHub repository for Project 2 by using the link posted on the course Piazza page. Then, follow the same steps you did in Project 0 to clone this new repository (obviously, substitute project2 for project0 from the earlier instructions). You may remove the /usr/src/vanilla-project1 directory to create additional space; however, we do not recommend removing your /usr/src/project1 directory until you receive your grade for that project.

Required Design Documentation

Throughout the timespan of this project, you will be producing two design documents to explain the design of your project. The first of these will be an informal document, just to ensure that you are on the right track with your system and to demonstrate that you have thought through the system early and often.

All documentation for this project will be submitted through your GitHub repository, just like your code. You will place all documents in a new directory within the /usr/src/project2 directory called proj2docs. Documentation must be formatted as either plain ASCII or UTF-8 text documents (with a .txt extension) or as PDF documents (with a .pdf extension). Other document formats, such as .doc, .docx, .odt, .rtf, and other such formats are not acceptable. Your preliminary design document must be named prelim.txt or prelim.pdf. Your final design document must be named design.pdf (because of the format of the final document, you will not be able to submit an ASCII/UTF-8 plain text document).

During the first few weeks of the project, you will be required to submit a preliminary design document for the TAs to review. This document should be approximately 2-4 pages in length, and should cover topics such as how you intend to store data in the kernel, how you intend to handle any locking necessary for the system, and where you intend to make modifications to the existing kernel code (specific files and functions) to implement any checks needed by the system you will be implementing. There are many different ways to implement the functionality that you will be writing, so please think carefully about how you will be implementing it.

Once you have planned out your design, you should begin implementing it as soon as possible. The coding portion of this project shouldn't require an overly large number of lines of code; however, you will need to think carefully about where to put any checks inside the kernel code. You will probably try many different things before you actually get one that works!

Once you have turned in your preliminary design document, the TAs will review your design for feasibility and comment on it to help you ensure that the design will ultimately be successful. When you receive the TAs' review of your document, you should read the comments and revise your preliminary design document as appropriate (feel free to update the document in your GitHub repository, if you like). Your revisions will help you in the preparation of your final design document at the end of the project.

Along with the final submission of the project, you will submit a system design document in a format like that of the Software Engineering (CMSC 447) course at UMBC. This will be a formal version of the system design as you implemented it, and should include descriptions of how testing was performed as well as both a high-level and a low-level discussion of how your system was implemented. We will be providing a template for your use in preparing this document.

The template for your final design can be found here (in Microsoft Word format). Please remember that your final design must be submitted as a PDF, not as a Word document — we're only giving you the template in Word format so that it is easily editable.

Incremental Development

One of the nice things about using GitHub for submitting assignments is that it lends itself nicely to an incremental development process. As they say, Rome wasn't built overnight, and neither is most software. Part of our goal in using GitHub for assignment submission is to give all of the students in the class experience with using a source control system for incremental development.

You are encouraged in this project to plan out an incremental development process for yourself — one that works for you. There is no one-size-fits-all approach here. One suggested option is to break the assignment down into steps and implement things as you go. For instance, the locking/thread safety portions of the assignment can be easily added after the main functionality is implemented, in most cases. You are also encouraged to seek out the review of your TAs to determine whether an approach might be feasible.

You should not attempt to complete this entire project in one sitting. Also, we don't want you all waiting until the last minute to even start on the assignment. Doing either of these will usually lead to students getting poor grades on the assignment. To this end, we are requiring you to make at least 4 non-trivial commits to your GitHub repository for the assignment. These four commits must be made on different dates. You may make more than four commits during the timeline of the project — four is only the minimum that will be required for full credit.

A non-trivial commit is defined for this assignment as one that meets all of these requirements:

Failure to adhere to these requirements will result in a significant deduction in your score for the assignment. This deduction will be applied after the rest of your score is calculated, much like a deduction for turning in the assignment with a late penalty.

System Description

Your task in this project is to design and implement a basic port-based firewall in the kernel, as well as a new access control mechanism for files in the Linux kernel. Network communication and filesystem access share many common traits, and are very closely related in general. For instance, the read() system call can be used both to read from a file, as well as to read input packets from a network connection. Many of the system calls in *NIX-like systems are similar in this regard (after all, "everything in UNIX is a file"). Thus, you will probably find that the changes you make for one of the two systems you'll be implementing are going to be very similar to those you'll be implementing for the other.

Basic Network Firewall

The network firewall that you will be implementing in this project will be required to block applications from binding to specific ports, or making outgoing connections to specific ports. These restrictions will apply equally to any network connections started after a port is blocked. Maintenance of the list of blocked ports will be done through system calls that may be called only by the root user (or rather, by programs started by the root user).

The firewall system you are to design will block specific UDP or TCP ports, either in an incoming or outgoing direction. The system you design must work both with IPv4 and IPv6 connections (you don't have to worry about IPX or other obscure/outdated network protocols, however). The protocols (UDP/TCP), port numbers, and directions (incoming/outgoing) will be specified by the system administrator (root user) by way of calling a new system call that you will implement as part of the project. You are only required to block connections created after the system administrator has blocked a port with the system call. You should not block any connections that were created prior to the call being made to block a port.

Your firewall must be able to block an "unlimited" number of ports (within reason, since there are only 65,536 ports available on each of TCP and UDP). You may choose how to store data for the firewall in the kernel, but you should consider the efficiency of whatever method you choose (not only in speed, but also in memory usage). Particularly inefficient implementations may result in lost points.
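As one illustration of the speed and memory trade-off, a fixed bitmap with one bit per port for each (protocol, direction) pair answers "is this port blocked?" in constant time and costs 32 KiB in total. The sketch below is user-space C under stated assumptions: every name in it (fw_bits, fw_mark_blocked, and so on) is hypothetical, it does no locking, and it stores no attempt counters, which would need separate storage. It is one possible design, not a required one.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical indices; a real implementation would map IPPROTO_TCP and
 * IPPROTO_UDP onto these and validate proto/dir before indexing. */
enum { FW_PROTO_TCP = 0, FW_PROTO_UDP = 1, FW_NUM_PROTOS = 2 };
enum { FW_DIR_OUT = 0, FW_DIR_IN = 1, FW_NUM_DIRS = 2 };

#define FW_NUM_PORTS 65536u
#define FW_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define FW_WORDS     (FW_NUM_PORTS / FW_WORD_BITS)

/* One bit per port for each (protocol, direction) pair: 4 * 8 KiB. */
static unsigned long fw_bits[FW_NUM_PROTOS][FW_NUM_DIRS][FW_WORDS];

/* Set the bit for <proto, dir, port>. */
void fw_mark_blocked(int proto, int dir, uint16_t port)
{
    fw_bits[proto][dir][port / FW_WORD_BITS] |= 1UL << (port % FW_WORD_BITS);
}

/* Clear the bit for <proto, dir, port>. */
void fw_mark_unblocked(int proto, int dir, uint16_t port)
{
    fw_bits[proto][dir][port / FW_WORD_BITS] &= ~(1UL << (port % FW_WORD_BITS));
}

/* Test the bit for <proto, dir, port>. */
bool fw_is_blocked(int proto, int dir, uint16_t port)
{
    return (fw_bits[proto][dir][port / FW_WORD_BITS] >> (port % FW_WORD_BITS)) & 1UL;
}
```

Inside the kernel, the bitmap helpers (DECLARE_BITMAP, set_bit, clear_bit, test_bit) cover the same ground. Whether a bitmap, a list, or a tree is the better fit depends on how you plan to store the per-port counters.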

Your firewall will also be responsible for keeping track of how many times programs attempt to access all the blocked ports after they are blocked. This count must be kept on the same basis as all the rest of the firewall (so on a <port, direction, protocol> basis). You will provide a system call (available only to the root user) to query this information from the firewall.

The firewall's implementation will involve finding the portions of the code in the Linux kernel responsible for network communications and modifying them. It is suggested that this be done at the system call level (i.e., find the system call(s) responsible for binding to a port for incoming connections and block things there, find the system call(s) responsible for connecting to a remote port and block things there). This will provide the most general solution to the project, and avoid unnecessarily changing things in multiple places for supporting multiple protocols.

File Access Control Mechanism

The second portion of this project will involve adding code to the kernel to prevent access to a list of files (as specified by the system administrator/root user) by any user of the system (including root). As with the firewall portion of the assignment, you need only block accesses to files that are opened after the root user has called the system call to block a file. You should not attempt to interrupt any access to file descriptors already open when the block call is made.

This access control mechanism is to be applied in addition to any already existing control mechanisms supported by the Linux kernel, such as UNIX file permissions and access control lists. It is not meant to replace these existing mechanisms, but rather to supplement them.

Your file control mechanism will apply to any attempt to open or otherwise query the metadata about any files that are blocked by the system. This includes preventing the creation of files/directories if they are specified to be blocked, as well as blocking all access to existing files and directories. However, your file control mechanism does not have to prevent files in the blocked list from appearing in directory listings (for instance in the output of the ls command) — you may block the files from appearing in directory listings, but you are not required to.

Your file control mechanism will also be responsible for keeping track of how many times programs attempt to access all the blocked files after they are blocked. Each file should have its own counter in your system. The root user will be able to query your system for each blocked file to find out how many times it has been accessed.

This project makes no requirements about the data structures used to implement the list of files or store the information required — you may choose any method that makes sense to you. However, you should ensure that your method is relatively efficient. Remember that files are frequently accessed on a running system and thus your code will be frequently accessed as well.
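To make the efficiency concern concrete, here is a minimal user-space sketch of one option: a chained hash table keyed by absolute path, with a per-entry attempt counter. All names (fc_block, fc_denied, fc_query, FC_BUCKETS) are hypothetical, the table does no locking, and comparing path strings is a simplification, since a real kernel implementation has to decide how to treat symbolic links, hard links, and relative paths.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FC_BUCKETS 256u            /* hypothetical table size */

struct fc_entry {
    struct fc_entry *next;         /* next entry in the same bucket */
    unsigned long attempts;        /* denied accesses since the block was added */
    char path[];                   /* absolute path, NUL-terminated */
};

static struct fc_entry *fc_table[FC_BUCKETS];

/* FNV-1a string hash, reduced to a bucket index. */
static uint32_t fc_hash(const char *s)
{
    uint32_t h = 2166136261u;

    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h % FC_BUCKETS;
}

static struct fc_entry *fc_find(const char *path)
{
    struct fc_entry *e;

    for (e = fc_table[fc_hash(path)]; e != NULL; e = e->next)
        if (strcmp(e->path, path) == 0)
            return e;
    return NULL;
}

/* Returns 0 on success, -1 if the path is not absolute, is already
 * blocked, or memory cannot be allocated. */
int fc_block(const char *path)
{
    struct fc_entry *e;
    size_t len;
    uint32_t b;

    if (path == NULL || path[0] != '/' || fc_find(path) != NULL)
        return -1;
    len = strlen(path) + 1;
    e = malloc(sizeof(*e) + len);
    if (e == NULL)
        return -1;
    memcpy(e->path, path, len);
    e->attempts = 0;
    b = fc_hash(path);
    e->next = fc_table[b];
    fc_table[b] = e;
    return 0;
}

/* Called on each access: returns true and counts the attempt if blocked. */
bool fc_denied(const char *path)
{
    struct fc_entry *e = fc_find(path);

    if (e == NULL)
        return false;
    e->attempts++;
    return true;
}

/* Returns the attempt count for a blocked path, or -1 if it is not blocked. */
long fc_query(const char *path)
{
    struct fc_entry *e = fc_find(path);

    return e != NULL ? (long)e->attempts : -1;
}
```

The point of the sketch is that the common case, a path that is not blocked, costs one hash computation and a short bucket walk rather than a scan of the whole list.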

New System Calls

You will be adding several system calls to maintain and query your two new security subsystems in the kernel. All of the system calls that are listed here must only be callable by the root user (the user with a UID of 0). If any other user attempts to call these system calls, you should deny their access and return a permission denied (-EPERM) error.

The system calls and your security subsystem must be made thread safe. As these calls will be necessarily modifying shared state between multiple processes and threads, you must provide appropriate locking to ensure the safe operation of the calls. You should carefully consider how you will provide locking for the calls, and include this in the design documentation that you will be producing. Ineffective or inefficient locking mechanisms will result in a reduction of points on the project.
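One small piece of the thread-safety question is the attempt counter, which is updated on every blocked access. The user-space sketch below uses C11 atomics so that concurrent increments are not lost; the struct and function names are hypothetical. It covers the counter only: adding and removing entries from whatever list or table you choose still needs its own locking, and the choice of lock is a design decision this sketch does not make.

```c
#include <stdatomic.h>

/* Hypothetical per-entry counter for a blocked port or file. */
struct blocked_counter {
    atomic_ulong attempts;
};

/* Reset the counter when an entry is first blocked. */
void counter_init(struct blocked_counter *c)
{
    atomic_init(&c->attempts, 0);
}

/* Record one blocked access; safe to call from several threads at once. */
void counter_record_attempt(struct blocked_counter *c)
{
    atomic_fetch_add_explicit(&c->attempts, 1, memory_order_relaxed);
}

/* Read the current count, for example from a query call. */
unsigned long counter_read(struct blocked_counter *c)
{
    return atomic_load_explicit(&c->attempts, memory_order_relaxed);
}
```

In the kernel, atomic_t or atomic64_t with atomic64_inc() and atomic64_read() play the same role.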

You must number the system calls in the order they appear here in this document. Providing them out of order in the system call table will result in a significant deduction of points on the project. If you wish to implement additional system calls to aid in debugging, you may do so (and leave them in your final submission of the project), but they must be after all of the required system calls in the system call table. In addition you are not allowed to require us to call any of your additional system calls for any reason.

As this code will be part of the kernel itself, correctness and efficiency should be of primary concern to you in the implementation. Particularly inefficient (memory-wise, algorithmic, or poor locking choices) solutions to the problem at hand may be penalized in grading. In regard to correctness, you will probably find that a large portion of your code for this assignment will be spent in ensuring that arguments and other such information passed in from user-space is valid. If in doubt, assume that the data passed in is invalid. Users tend to do a lot of really stupid things, after all. Crashing the kernel because a NULL or otherwise bad pointer is passed in will result in a significant deduction of points.

Finally, you are to implement this system on your own. The access control mechanisms extant within the kernel will not be helpful to you in implementing this assignment.

For proto in all of the firewall-related calls, the value will be one of the symbolic constants IPPROTO_TCP or IPPROTO_UDP as defined in the standard C library. Outgoing implies a dir of 0, while incoming implies a dir of 1.

Please note that the error conditions defined above may not represent all possible error conditions for the functions. Use appropriate error codes if you detect an issue with any input (for instance, returning -EFAULT if a bad pointer is passed to any functions that use a pointer).
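A minimal sketch of the kind of argument checking described above, written as user-space C so it can be compiled and tried outside the kernel. The function name fw_validate_args is hypothetical, and both the choice of -EINVAL and the 1 to 65535 port range are assumptions rather than requirements stated for the system calls.

```c
#include <errno.h>
#include <netinet/in.h>

/* Returns 0 if the arguments are acceptable, or a negative errno value.
 * Assumptions: -EINVAL for bad values, valid ports are 1..65535. */
int fw_validate_args(int proto, int dir, long port)
{
    if (proto != IPPROTO_TCP && proto != IPPROTO_UDP)
        return -EINVAL;
    if (dir != 0 && dir != 1)      /* 0 = outgoing, 1 = incoming */
        return -EINVAL;
    if (port < 1 || port > 65535)
        return -EINVAL;
    return 0;
}
```

Calls that take a path from user space need one more step: the string has to be copied into kernel memory with a checked helper such as strncpy_from_user(), which reports -EFAULT for a bad pointer, before any other validation touches it.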

User-space driver programs and testing

You must provide a user-space driver program to access each of the system calls that we have asked for in this project. The user-space driver programs must be named the same as the system calls specified above, and should accept the same number of arguments as the calls specified above. These driver programs are to be used by the root user to manipulate your security systems. As an example of how each of these might be called, see the listing below.

These programs should print an appropriate error message (such as that returned by the strerror() or perror() library functions) if an error occurs in their operation (so, we would expect both of the unblock programs in that listing to print an error message if this list of commands is run on boot). Other than the query programs, none of the other programs need have any output, but they may print out a success message if you wish. On success, the programs should return 0, and on failure they should return EXIT_FAILURE, as defined in <stdlib.h> in the C library.
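The error-handling convention above can be sketched with two small helpers that a driver's main() might call. Both names (driver_parse_port, driver_exit_status) are hypothetical, and the sketch assumes the port is given on the command line as a decimal number; it deliberately leaves out the syscall() invocation itself, since the system call numbers depend on your own system call table.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a decimal port argument. Returns 0 on success, -1 on bad input. */
int driver_parse_port(const char *arg, unsigned short *port)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0' || v < 0 || v > 65535)
        return -1;
    *port = (unsigned short)v;
    return 0;
}

/* Turn the return value of syscall(2) into a process exit status.
 * On failure syscall(2) returns -1 with errno set, so perror() prints
 * the matching message prefixed by the program name. */
int driver_exit_status(const char *prog, long rc)
{
    if (rc < 0) {
        perror(prog);
        return EXIT_FAILURE;
    }
    return 0;
}
```

A driver would then end with something like return driver_exit_status(argv[0], rc); where rc holds the result of the system call.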

These programs should be placed in a proj2driver directory in your kernel source tree. There should be a Makefile provided to build all of the programs (that is to say that running make in the proj2driver directory should result in the eight programs specified being built and available in that directory).

In addition, you must adequately test your kernel changes to ensure that they work under all sorts of situations (including in error cases). You should build one or more testing drivers and include them in your sources submitted. Create a new directory in the Linux kernel tree called proj2tests to include your test case program(s). Be sure to include a Makefile to build them. In addition, provide a README in this directory describing your approach to testing your code.

Extra Credit

There will be a possibility to receive extra credit on this project. The extra credit consists of three separate parts. The first two parts may be performed individually and are each worth 5 points of extra credit on this assignment. The third part will require that you have completed the other parts as part of its functionality, and will be worth 20 points of extra credit on its own. In total, you can earn 30 points of extra credit on this assignment by completing all three parts of the extra credit.

The extra credit for this assignment will require that you have completed the main goals of the project to a satisfactory degree. That is to say that if you do not submit part of the project that is required (such as one of the design documents), you will not be able to get any points for work on the extra credit portion of the assignment.

If you attempt any portion(s) of the extra credit, please inform the TA that you have done so by noting it in your final design document as well as in any README files you create for the assignment. If you do not tell us that you have completed the extra credit, you will not receive any credit for it.

Part 1 — Reading blocked files/ports from a file on boot

Part 1 of the extra credit is to read the list of blocked files and blocked network ports from a file on the disk in your VM on startup of the system. To do this, you will need to modify your kernel code to look for the files /421-blocked-files and /421-blocked-ports after the root filesystem has been mounted (but before the login prompt is presented). The formats of these files will be described below. The code for reading these files must be implemented in the kernel — you may not implement this functionality in user-space (for instance by setting up a script to call your system calls on boot).

If the /421-blocked-files file exists during kernel bootup (after the root filesystem is mounted), then it shall be read. Each line of the file will either consist of a comment (starting with a # character), a blank line, or an absolute path on the system. Lines starting with a # character shall be ignored in their entirety, as shall blank lines. Each line containing an absolute path shall be added to your file control system as a blocked file. These files shall remain blocked until the system is rebooted (so, they may not be unblocked by the fc421_unblock_file() system call). An example configuration file is shown below:

# Example file control configuration file.
# This file will be read on boot to initialize your project 2.

# Block access to bob's file.txt
/home/bob/file.txt

# Block access to some secret files and directories
/opt/secret/keys.txt
/opt/secret/certs.txt
/opt/secret/top_secret_directory
/opt/secret/confidential_directory
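
The line rules for this file are simple enough to sketch as a single classifier. The version below is user-space C with hypothetical names (fc_classify_line and the FC_LINE_* constants); it assumes each line has already been read into a writable buffer, treats whitespace-only lines as blank, and says nothing about how the kernel reads the file in the first place.

```c
#include <string.h>

enum fc_line_kind {
    FC_LINE_SKIP,      /* blank line or comment */
    FC_LINE_PATH,      /* absolute path to block */
    FC_LINE_INVALID    /* anything else */
};

/* Strips the trailing newline in place, then classifies the line. */
enum fc_line_kind fc_classify_line(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';

    if (line[0] == '#')
        return FC_LINE_SKIP;
    if (line[strspn(line, " \t")] == '\0')   /* empty or only whitespace */
        return FC_LINE_SKIP;
    if (line[0] == '/')
        return FC_LINE_PATH;
    return FC_LINE_INVALID;
}
```

After a FC_LINE_PATH result, the buffer holds the path without its newline and can be handed to whatever routine adds a blocked file.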

If the /421-blocked-ports file exists on boot (after the root filesystem has been mounted), it shall be read to initialize the firewall portion of your project. Like the file control portion of this extra credit, lines will consist of either comments (starting with a # character), blank lines (which shall be ignored as well), or firewall entries. Firewall entries shall consist of a protocol (udp or tcp, case-insensitive), a direction (in, out, or both; case-insensitive), and a port number between 1 and 65535. As with the file control portion of the assignment, these ports shall remain blocked until the system is rebooted (they shall not be able to be unblocked by way of the fw421_unblock_port() system call). An example firewall configuration file is shown below.

# Firewall configuration file
# No HTTP daemons for you!
tcp in 80
tcp in 443
tcp in 8080

# No browsing the internet for you!
tcp out 80
tcp out 443
tcp out 8080

# No running a dns server
tcp in 53
udp in 53
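
A firewall entry in this format can be parsed with a few string comparisons and one range check. The following is a user-space sketch only: fw_parse_cfg_line, struct fw_cfg_entry, and the FW_CFG_* flags are hypothetical names, and representing "both" as two direction bits is an assumption about how the result would be consumed.

```c
#include <ctype.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

#define FW_CFG_OUT 0x1   /* hypothetical direction flags */
#define FW_CFG_IN  0x2

struct fw_cfg_entry {
    int proto;           /* IPPROTO_TCP or IPPROTO_UDP */
    int dirs;            /* FW_CFG_IN, FW_CFG_OUT, or both OR-ed together */
    int port;            /* 1..65535 */
};

/* Returns 1 if an entry was parsed into *out, 0 for a blank line or
 * comment, and -1 for a malformed line. */
int fw_parse_cfg_line(const char *line, struct fw_cfg_entry *out)
{
    char proto[8], dir[8], num[8], extra;
    char *end;
    long port;
    int proto_id, dirs;

    line += strspn(line, " \t\r\n");
    if (*line == '\0' || *line == '#')
        return 0;

    /* Exactly three tokens are expected; a fourth means trailing junk. */
    if (sscanf(line, "%7s %7s %7s %c", proto, dir, num, &extra) != 3)
        return -1;

    if (strcasecmp(proto, "tcp") == 0)
        proto_id = IPPROTO_TCP;
    else if (strcasecmp(proto, "udp") == 0)
        proto_id = IPPROTO_UDP;
    else
        return -1;

    if (strcasecmp(dir, "in") == 0)
        dirs = FW_CFG_IN;
    else if (strcasecmp(dir, "out") == 0)
        dirs = FW_CFG_OUT;
    else if (strcasecmp(dir, "both") == 0)
        dirs = FW_CFG_IN | FW_CFG_OUT;
    else
        return -1;

    if (!isdigit((unsigned char)num[0]))
        return -1;
    port = strtol(num, &end, 10);
    if (*end != '\0' || port < 1 || port > 65535)
        return -1;

    out->proto = proto_id;
    out->dirs = dirs;
    out->port = (int)port;
    return 1;
}
```

The output structure is written only after every field has been checked, so a malformed line never leaves a half-filled entry behind.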

Part 2 — procfs support

Part 2 of the extra credit will require you to create and mount an entry in the /proc filesystem for the file control and firewall portions of the assignment. Writing a file in the format shown in part 1 of the extra credit to each of these files should add all of the entries in the file to the relevant system (ignoring blank lines, comments, and any entries already in the system). Reading the proc file shall print out the status of the relevant system in the format described below. Only root shall be able to read or write to the proc files, just as only root can call the relevant system calls. The proc file for the file control portion of the assignment shall be called /proc/fc421 and the one for the firewall shall be called /proc/fw421. The status printed out shall consist of one line for each blocked entity, along with the number of times that entity has had an access attempted to it while the system has been up.

/home/bob/file.txt 0
/opt/secret/keys.txt 100
/opt/secret/certs.txt 42
/opt/secret/top_secret_directory 1337
/opt/secret/confidential_directory 17

tcp in 80 2
tcp in 443 2
tcp in 8080 70
tcp out 80 123456
tcp out 443 31337
tcp out 8080 42
tcp in 53 1
udp in 53 2
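
Each firewall status line is the configuration entry followed by its counter, so the formatting step is a single format string. The sketch below uses snprintf() in user space with a hypothetical function name (fw_format_status); a procfs read handler built on the kernel's seq_file interface would typically emit the same text with seq_printf().

```c
#include <netinet/in.h>
#include <stdio.h>

/* Writes one status line such as "tcp in 80 2\n" into buf.
 * dir follows the project's convention: 0 = outgoing, 1 = incoming.
 * Returns the number of characters snprintf() would have written. */
int fw_format_status(char *buf, size_t size, int proto, int dir,
                     unsigned int port, unsigned long count)
{
    return snprintf(buf, size, "%s %s %u %lu\n",
                    proto == IPPROTO_TCP ? "tcp" : "udp",
                    dir == 1 ? "in" : "out",
                    port, count);
}
```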

Part 3 — Implementation as a loadable kernel module

The third part of the extra credit is to implement a modified version of your firewall and file control system as a loadable kernel module. The modified version of the firewall and file control system will not be controlled by way of system calls, but rather by the configuration files and the procfs files as described in Parts 1 and 2 of the extra credit (as you cannot add system calls to the kernel from a module). The configuration files from Part 1 shall be loaded when the kernel module is loaded into the system to initialize the system, and then its state shall be readable and changeable by way of reading from and writing to the procfs entries.

Your kernel module-based system must run on an unmodified copy of the 4.18.6 Linux kernel. You may create this unmodified version either by building your vanilla-project2 or, if you did not create a vanilla-project2, by cloning the https://github.com/UMBC-CMSC421-FA2018/linux repository to get a clean version (yes, that link is correct even though it refers to last semester).

The kernel module-based system must work on the same principles as the normal project: intercepting the system calls as they are called in some way. This is quite difficult from a kernel module (and you won't be able to directly copy your approach from the original assignment to do this either, since you cannot modify the system calls' code directly).

If you wish to attempt this portion of the extra credit, you must have already completed the base project as well as Parts 1 and 2 of the extra credit. Once you have done so, use the link posted on Piazza to create your blank repository to put your kernel module code in. Please do not click the link on Piazza if you are not going to make a serious attempt to implement this part of the extra credit.

Submission Instructions

You should follow the same basic set of instructions for submitting Project 2 that you did for Project 0. That is to say, you should do a git status to ensure that any files you modified are detected as such, then do a git add and a git commit to add each modified/newly created file or directory to the local git repository. Then do a git push origin master to push the changes up to your GitHub account.

Be sure to include not only your modified kernel files, but also your driver and test program files.

You should also verify that your changes are reflected in the GitHub repository by viewing your repository in your web browser. This will ensure that you have pushed all requisite documentation to the GitHub repository and that the TAs will be able to see it by the time each document is due. Please make sure that you not only commit but push each changeset!

References

Below is a list of references that you may find useful in your quest to complete this project:

If in doubt, the Kernel API, Kernel Module Programming Guide, and Linux Cross Reference should be your ultimate guides.

Principles of Operating Systems

CMSC 421 - Spring 2019