Principles of Operating Systems

CMSC 421 - Fall 2019

Project 1

Due by 11:59PM EDT on ~~October 27, 2019~~November 1, 2019

Changelog

October 26, 2019: Due date extended to November 1st.

October 13, 2019: Fix prototypes of system calls for incorrect data types. Thanks to Mariela Guardado-Martinelli on Piazza for noticing the mistake.

Introduction/Objectives

In this project, you will create a new version of the Linux kernel that adds various system calls to implement a new data structure inside the kernel for use in Interprocess Communication. This assignment is designed to allow you to create new functionality in the kernel through the use of system calls, and to learn how to interact with the kernel's handling of memory management.

Before you begin, be sure to create your new GitHub repository for Project 1 by using the link posted on the course Piazza page. Then, follow the same steps you did in Project 0 to clone this new repository (obviously, substitute project1 for project0 from the earlier instructions). Also, you may remove the /usr/src/vanilla-project0 directory and create a new /usr/src/vanilla-project1 directory with the newly checked out code.

As a first step, change the version string of the new kernel to reflect that it is for Project 1. That is, make the version string read 5.2.11-cmsc421project1-USERNAME (substituting your UMBC username where appropriate), much like you did in Project 0.

Incremental Development

One of the nice things about using GitHub for submitting assignments is that it lends itself nicely to an incremental development process. As they say, Rome wasn't built in a day — nor is most software. Part of our goal in using GitHub for assignment submission is to give all of the students in the class experience with using an source control system for incremental development.

You are required in this project to plan out an incremental development process for yourself — one that works for you. There is no one-size-fits-all approach here. One suggested option is to break the assignment down into steps and implement things as you go. For instance, the locking/thread safety portions of the assignment can be easily added after the main functionality is implemented, in most cases. You are also encouraged to seek out the review of your TAs to determine whether an approach might be feasible.

You should not attempt to complete this entire project in one sitting. Also, we don't want you all waiting until the last minute to even start on the assignment. Students doing either of these tend to lead to getting poor grades on the assignment. To this end, we are requiring you to make at least 4 non-trivial commits to your GitHub repository for the assignment. These four commits must be made on different dates and at least one must be done before Wednesday, October 16^th. You may make more than four commits during the timeline of the project — four is simply a minimum number required for full credit. In addition, you must have made a reasonable attempt at implementing the basic operations on the data structure described later in this document by October 16^th. That is to say that we expect to see some version of code that is capable of initializing the skip list data structure, inserting items into it, deleting items from it, and searching for existing items by October 16^th. This may be in the form of kernel-space code or in the form of a user-space prototype (as detailed below).

In addition, as it is significantly easier to build and test code in user-space than it is in the kernel, you are encouraged to build a user-space prototype of at least part of your system before attempting to implement it in the kernel. Please note that we are not suggesting you implement the entire project in user-space, but rather just some of the basic functionality. One approach that many previous students have found that works for this is to implement a prototype version of the data structure that you will be using in user-space. This will allow you to ensure that you have the basic algorithms for implementing insertion, deletion, search, etc. working without having to spend a long time waiting on kernel builds. Some portions of the assignment are not feasible to be done in a user-space prototype, as the interfaces for doing so are significantly different in user-space and in kernel-space (like locking, access control, and actual IPC), so we do not necessarily recommend building a full prototype of the whole system in user-space. If you do choose to implement a user-space prototype of your code, please place it in a directory called proj1proto in the root of the Linux kernel source code and be sure to tell us about it in your README.proj1 file (let us know what you did in the prototype, why you chose to do so, how things changed when you ported it over to the kernel, etc).

A non-trivial commit is defined for this assignment as one that meets all of these requirements:

Does not contain only documentation (i.e, just committing a README file does not count).
Does not contain only Makefile modifications or creation.
Modifies/creates at least 10 lines of code in a combination of existing or newly created .c or .h files. That is to say, creating a new file with 10 lines of code counts, but creating a new file with a 10 line comment does not.
Code modifications/creation must be relevant to the project. Creating a bunch of useless files/functions that are unrelated or otherwise superfluous to the assignment does not count. It is ok to reorganize your code after you have started and remove pieces of code, of course, but if you are obviously only adding code to the repository early on that you completely delete later (or that has nothing to do with the assignment), then that commit will not count toward the requirements herein.
Code implementing a prototype version of the assignment (for instance, a user-space version) does count, but any example code or anything else of that nature that you use in that prototype (for instance, a user-space version of the <linux/list.h> header file) does not count.

Failure to adhere to these requirements will result in a significant deduction in your score for the assignment. This deduction will be applied after the rest of your score is calculated, much like a deduction for turning in the assignment with a late penalty.

Skip Lists

In this assignment, you will be working with a data structure known as a Skip List. This data structure combines the amortized performance of binary search trees (that is to say O(log n) search, insertion, deletion, etc) with the relative simplicity of singly-linked lists. As this data structure is probably not one that you have studied in the past (for instance in CMSC 341), the following resources will be helpful to you in figuring out how to implement it in code:

Skip Lists: A Probabilistic Alternative to Balanced Trees - the original research paper outlining how a skip list works
Concurrent Maintenance of Skip Lists
Skip list Wikipedia article
A Skip List Cookbook - another paper on skip lists as written by the original author of the first paper on this list.

In this assignment, you will be implementing a probabilistic skip list. You need not worry about rebalancing the skip list or resizing the number of pointers in the list after it is created, regardless of how many nodes are added to the list.

As probabilistic skip lists require some sort of random number generation in order to determine when to add an additional level of pointers to a given node, you will need to have some code responsible for generating random numbers. For simplicity, you may use the segment of code defined below for this purpose (based on code from the C11 standard):

static unsigned int next_random = 9001;

static unsigned int generate_random_int(void) {
    next_random = next_random * 1103515245 + 12345;
    return (next_random / 65536) % 32768;
}

static void seed_random(unsigned int seed) {
    next_random = seed;
}

Please note that this segment of code will generate a random integer between 0 and 32767. You can then use this to help you determine how to proceed in creating your list (for instance, if on a given list, the probability of making a node have an additional level of pointers is 1/2, then you could say that values from 0-16383 promote the node, while anything beyond that will not). It is extremely important that you do not do any floating-point arithmetic inside the kernel, so be very careful while writing the code to determine if a node gets an additional level of pointers to not do any floating-point arithmetic! Also, be sure to properly seed your random number generator at some point with an appropriate seed value.

You are not required to use the random number generation code included above if you do not wish to do so. However, you should ensure that whatever you use for random number generation for the creation/maintenance of your skip lists are available in the default kernel configuration that is created by the instructions in Project 0 if you wish to use some other part of the kernel to accomplish this.

Each node in your skip list will be given an ID, which should be used as the key with which you will sort the list. You must ensure that all IDs are unique. Nodes within your skip list are to be used as a mailbox type data structure to hold a FIFO queue of messages. Each mailbox may have an "unlimited" number of messages within in it, and you may have an "unlimited" number of total mailboxes as well.

Message Queues

As mentioned above, each node in your skip list is to act as a mailbox, each containing a FIFO queue of messages. The following image may help you to visualize what we are asking you to do in this assignment:

In this image, boxes containing an 'M' are messages added to a queue. The data structure above the chains of messages is the skip list, sorted by the ID of each queue. Each box above the ones with numbers are a level of pointers within the list. The ground symbol (three lines shaped like a triangle) represents the end of a message queue.

We make no requirements on how the queues themselves are implemented, other than requiring that they be implemented in a First-in, First-out (FIFO) manner. You may choose to reuse part of your skip list code for implementing the queues themselves, or you may use some other functionality, such as the kernel's built-in linked list code.

Access Control

Only the root user may call the creation and deletion system calls below, as well as those that modify the access control lists for mailboxes. If a regular user attempts to call any of the prohibited system calls, they must be given an error indicating that they have been denied that permission, like -EPERM.

At creation, a mailbox will have no access control list. Process IDs may be added to a mailbox's ACL only by the root user. Attempts to add or remove process IDs to a mailbox are not gated by the ACL — that is to say that as long as the process attempting to add or remove an entry from an ACL is running with an effective or real user ID of root, that the system call should be allowed to perform its work. Any attempt to modify an ACL by a process that is not running with an effective or real user ID of root should be denied access to do so with an error such as -EPERM.

If there is no ACL set up on a mailbox, then any process may access that mailbox. That means that any process can send or receive messages from the mailbox, count the number of messages in that mailbox, or get the length of the message in the front of the mailbox's queue. If an ACL has been set up on a mailbox, only processes with process IDs identified in the ACL or those owned by someone with an effective or real user ID of root may perform any actions on the mailbox — root always has access to the mailboxes, regardless of what is on the ACL. This also means that once an ACL has been created on a mailbox and then subsequently has all of the process IDs removed from it that only root may access that mailbox (until another process ID is added to the ACL).

New System Calls

You will be adding a few new system calls to the Linux kernel in order to manage the skip list of message queues and the messages within each queue. The mailboxes and their contents all exist only in the kernel's address space (all data must be stored in buffers allocated in the kernel — do not attempt to store any user-space pointers in the data structure you create). You will develop the system calls specified below in order to access the mailboxes and their contents.

Each mailbox should be identified by an unsigned long ID value, which is passed in at creation time (as well as into all functions that access the mailbox later). Each mailbox must be able to store an "unlimited"^* number of messages, each of which is of "unlimited"^* length. Messages must be treated as binary data and not as ASCII/text strings — this means that messages may have embedded NUL (0-value) bytes. To this end, ensure that you do not attempt to use any string functions such as strcpy() or strlen() to perform the functionality we've described here.

^*: Message length and the number of messages stored in the system are still, of course, limited by the size of the memory on the system. "Unlimited" here means that you may not artificially limit the size of messages or the number of messages in your code beyond the obvious limits.

Each mailbox will store its messages in FIFO order. That is to say that each mailbox may be seen as a queue.

Please keep in mind that these functions may well be called from multiple different processes simultaneously. If you provide any state information in kernel-space for these functions you must provide appropriate locking in order to ensure their correct operation!

As this code will be part of the kernel itself, correctness and efficiency should be of primary concern to you in the implementation. Particularly inefficient (memory-wise, algorithmic, or other poor coding choices) solutions to the problem at hand may be penalized in grading. In regard to correctness, you will probably find that the majority of your code for this assignment will be spent in ensuring that arguments and other such information passed in from user-space is valid. If in doubt, assume that the data passed in is invalid. Users tend to do a lot of really stupid things, after all. Crashing the kernel because a NULL pointer is passed in will result in a significant deduction of points.

Finally, you are to implement this system on your own — no group work is allowed on this assignment. You are welcome to use the kernel's basic linked list functionality in assisting you to create your skip lists and queues, if you so desire (but you are not required to do so in either or both cases).

The signature and semantics for your system calls must be as follows (and they must be added to the system call table in the following order):

long mbx421_init(unsigned int ptrs, unsigned int prob): Initializes the mailbox system, setting up the initial state of the skip list. The ptrs parameter specifies the maximum number of pointers any node in the list will be allowed to have. The prob parameter specifies the inverse of the probability that a node will be promoted to having an additional pointer in it (that is to say that if the function is called with prob = 2, then the probability that the node will have 2 pointers is 1 / 2 and the probability that it will have 3 pointers is 1 / 4, and so on). The only valid values for the prob parameter are 2, 4, 8, and 16 — any other value shall result in an error being returned. Additionally, the ptrs parameter must be non-zero — a zero value should result in an error being returned. Returns 0 on success. Only the root user (the user with a uid of 0) shall be allowed to call this function.
long mbx421_shutdown(void): Shuts down the mailbox system, deleting all existing mailboxes and any messages contained therein. Returns 0 on success. Only the root user shall be allowed to call this function.
long mbx421_create(unsigned long id): Creates a new mailbox with the given id if it does not already exist (no duplicates are allowed). Returns 0 on success or an appropriate error on failure. If an id of 0 or (2⁶⁴ - 1) is passed, this is considered an invalid ID and an appropriate error shall be returned. Only the root user shall be allowed to call this function.
long mbx421_destroy(unsigned long id): Deletes the mailbox identified by id if it exists and the user has permission to do so. If the mailbox has any messages stored in it, these messages should be deleted. Returns 0 on success or an appropriate error code on failure. Only the root user shall be allowed to call this function.
long mbx421_count(unsigned long id): Returns the number of messages in the mailbox identified by id if it exists and the user has permission to access it. Returns an appropriate error code on failure.
long mbx421_send(unsigned long id, const unsigned char __user *msg, long len): Sends a new message to the mailbox identified by id if it exists and the user has access to it. The message shall be read from the user-space pointer msg and shall be len bytes long. Returns 0 on success or an appropriate error code on failure.
long mbx421_recv(unsigned long id, unsigned char __user *msg, long len): Reads the first message that is in the mailbox identified by id if it exists and the user has access to it, storing either the entire length of the message or len bytes to the user-space pointer msg, whichever is less. The entire message is then removed from the mailbox (even if len was less than the total length of the message). Returns the number of bytes copied to the user space pointer on success or an appropriate error code on failure.
long mbx421_length(unsigned long id): Retrieves the length (in bytes) of the first message pending in the mailbox identified by id, if it exists and the user has access to it. Returns the number of bytes in the first pending message in the mailbox on success, or an appropriate error code on failure.
long mbx421_acl_add(unsigned long id, pid_t process_id): Adds PID specified by process_id to the ACL for the mailbox with the ID of id if it exists and the user has access to do so. Returns 0 on success or an appropriate error code on failure. Only the root user shall be allowed to call this function.
long mbx421_acl_remove(unsigned long id, pid_t process_id): Removes the PID specified by process_id from the ACL for the mailbox with the ID of id if it exists and the user has access to do so. Returns 0 on success or an appropriate error code on failure. Only the root user shall be allowed to call this function.

You may add additional system calls to aid in your development (like, say maybe one to print out the state of the mailbox system to the kernel log), however you may not require these system calls to be called in order to make your system work properly (for instance, you can't add a "reset" system call that has to be called after init before a mailbox is added to the system). Additionally, any extra system calls you include MUST appear after the ones we have outlined above in the system call table or you will fail nearly every testcase we will throw at your code! If you add any additional system calls, please tell us about them in your README.proj1 file.

Remember from Project 0 that system calls are defined in the kernel by way of using a SYSCALL_DEFINEn macro, where n is the number of arguments that the system call takes in. So, for instance, the mbx421_send syscall from the list above would be defined like shown below:

SYSCALL_DEFINE3(mbx421_send, unsigned long, id, const unsigned char __user *, msg, long, len) {
    /* Code goes here. */
}

Each system call returns an appropriate non-negative integer on success, and a negative integer on failure which indicative of the error that occurred. See the <errno.h> header file for a list of error codes. The following error codes would be sensibly used by your code (this may not be an exhaustive list):

-EINVAL: invalid value passed (that doesn't fit into one of the below cases)
-ENODEV: mailbox system not initialized (or has been shut down and has not subsequently been re-initialized)
-EEXIST: mailbox already exists
-ENOENT: mailbox doesn't exist
-EPERM: permission denied
-EFAULT: bad pointer passed or couldn't read/write pointer
-ESRCH: no messages in mailbox
-ENOMEM: memory allocation failure

As this system is designed as a IPC system, messages must be copied into properly allocated kernel memory when sent and copied back into user-space memory when received. Also, please note that there is no requirement that messages be textual strings (so do not assume that messages will be NUL terminated), or that they have any content (zero-length messages are to be considered valid, however messages with negative lengths are not valid and shall generate an error if one is attempted to be sent).

User-space driver program(s)

You must adequately test your kernel changes to ensure that they work under all sorts of situations (including in error cases). You should build one or more testing drivers and include them in your sources submitted. Create a new directory in the Linux source tree called proj1tests to include your test case program(s). Be sure to include a Makefile to build them and instructions on how to run them in a README file within this directory. Your README for the test programs should also describe your general strategy for testing the system calls. Remember that testing is one of the primary jobs of a developer in the real world!

It is strongly suggested that you additionally build a separate program for each system call to be implemented to simply call that system call with user-provided arguments. For the data to be sent as a message, you might consider allowing the user to specify a file of data to send or a string on the command line. These programs will likely prove to be invaluable in debugging.

Submission Instructions

You should follow the same basic set of instructions for submitting Project 1 that you did for Project 0. That is to say, you should do a git status to ensure that any files you modified are detected as such, then do a git add and a git commit to add each modified/newly created file or directory to the local git repository. Then do a git push origin master to push the changes up to your GitHub account.

Be sure to include not only your modified kernel files, but also your driver program files. The driver should go in a proj1tests directory, in the root of the Linux source tree. You must include a Makefile that can build your test program(s) in this directory as well. You should not attempt to add your test directory to the main Linux Makefile. Also, include a README file in the proj1tests directory describing your approach to testing this project. Tell us what your testcases actually test, and why you chose to test those things. If your testcases are supposed to fail at any point, make sure to tell us that in the README (after all, you should not only test your code with good inputs, but with bad ones too — we'll do just that in our testcases).

You must also include a README.proj1 file in the root directory of the Linux source code that describes anything you might want us to know when we're grading your assignment. This can include an outline of how you implemented the requirements of the project, for instance. This is also where you should cite any references you have used for the assignment other than those given in this assignment description.

You should also verify that your changes are reflected in the GitHub repository by viewing your repository in your web browser.

References

Below is a list of references that you may find useful in your quest to complete this project:

The Linux Kernel API — documentation of the internal API for programming in the Linux kernel (please note that in the User Space Memory Access chapter, only certain functions are covered (which do "less checking"), the versions that do "more checking" are named the same, but without leading underscore characters)
The Linux Cross-Reference (for version 5.2.11 of the kernel) — a cross-referenced copy of the Linux kernel source code for relatively easy searching
The Open Group Base Specifications Issue 7/IEEE Std. 1003.1 - 2017, 2018 Edition/POSIX.1-2017
The Unreliable Guide to Locking [in the Linux Kernel] — a decent resource, despite its name
Kernel Korner: System Calls — old and outdated, but still demonstrates the concepts used for system calls

If in doubt, the Kernel API and Linux Cross Reference should be your ultimate guides.

What to do if you want to lose points on this project

Not pushing changes to your GitHub repository by the project due date.
Excessive unnecessary changes made to the kernel sources.
Extraneous files are included.
Files are missing that needed to be modified.
Hello World system call included, or the system calls required are otherwise out of the order specified.
Failure to follow the requirements in the "Incremental Development" section of the assignment.

Please do not make us take off points for any of these things!

Last modified Tuesday, 01-Sep-2020 18:43:34 EDT