Principles of Operating Systems

CMSC 421 — Spring 2021


Project 1

100 points

Due by 11:59:00 PM EST on Sunday, February 28, 2021

CHANGELOG

February 16: Fix typos and slightly clarify examples/unescaping

February 13: Added incremental development requirements

February 13: Initial version

Introduction

In part 1 of this assignment, you will be writing a C program that works as a simple *nix shell program. This part of the assignment only requires a few very basic features of a shell, and leaves out much of the functionality that more advanced shells such as Bash include. That is not to say that this assignment will be easy — this is a 400-level course, after all. There are still several parts of this assignment that could trip you up, especially if you are not comfortable with lower-level C programming.

In part 2 of this assignment, you will be adding functionality to explore information from the /proc filesystem on Linux and to organize it in a way to present it to the user.

This project is designed to help you get a bit of a warm-up with C, including topics that will be of extreme importance going forward with your programming projects. This assignment is to be completed entirely in user space: it does not take place within the Linux kernel source code and does not involve recompiling the kernel. You may complete this assignment outside of your VM for the class if you wish; however, the TAs will be using a setup like your VM to grade the assignment, so you should at least compile and run your submitted assignment once in your VM to ensure it works before submitting it.

You are, however, not allowed to use the GL system (or any other shared environment) at UMBC to complete this assignment! You can get into real trouble for this!

This assignment must be completed in the C programming language. You can choose to use C89/C90/ANSI C, C99, or C11 (with POSIX extensions) as you see fit; please don't try to torture yourself with pre-ANSI/traditional/K&R C.

What is a shell?

At its core, a shell is a piece of software that will get a string like ls -la from the user, format it, then attempt to execute it. ls is an actual program located in /bin, for instance. When you type ls in your shell, it finds that program and executes it. The shell itself does not execute any logic that says: for each file in directory: print filename or something like that [1]. It just invokes ls to do its own thing. Yes, modern shells have a ton of extra bells and whistles, but you should not worry about them in this assignment. You do not have to create any GUIs, windows, or anything of the sort.

[1]: While some shells do have such functionality built-in, your shell for this assignment will not have such functionality built in, except for two very specific things that are described below.

Beginning the project

To begin this project, you must click the appropriate link for your section on Piazza to create and populate your private git repository for the project. Once the repository has been created, you can then clone the repository onto your VM and begin your work with the command shown below, replacing username with your GitHub username as appropriate:

git clone git@github.com:umbc-cmsc421-sp2021/project1-username.git

This repository will come pre-populated with the template from the course website, an outline of a suggested structure for part 1 of the assignment, and some helpful code to get you started.

Incremental development

One of the nice things about using GitHub for submitting assignments is that it lends itself nicely to an incremental development process. As they say, Rome wasn’t built in a day — nor is most software. Part of our goal in using GitHub for assignment submission is to give all of the students in the class experience with using a source control system for incremental development.

You are required in this project to plan out an incremental development process for yourself — one that works for you. There is no one-size-fits-all approach here. One suggested option is to break the assignment down into steps and implement things as you go. You may also seek out the TAs during office hours to ask if your approach seems feasible. We are here to help.

You should not attempt to complete this entire project in one sitting. Also, we don’t want you all waiting until the last minute to even start on the assignment. Students who do either of these tend to get poor grades on the assignment. To this end, we are requiring you to make at least four non-trivial commits to your GitHub repository for the assignment. These four commits must be made on at least two different dates (you can do two on one day and two on another, in other words) and at least one must be done before Sunday, February 21st at 11:59:00 PM EST. You may make more than four commits during the timeline of the project; four is simply the minimum number required for full credit.

A non-trivial commit is defined for this assignment as one that meets all of these requirements:

Failure to adhere to these requirements will result in a significant deduction in your score for the assignment. This deduction will be applied after the rest of your score is calculated, much like a deduction for turning in the assignment with a late penalty.

Part 1

Requirements:

For this assignment, you will only need to support a few very basic features of a full-fledged *nix shell. Specifically, you will need to have support for all of the following:

  1. If run with no arguments, the shell shall present the user with a prompt at which they can enter commands. Upon the completion of a command, it shall prompt the user for a new command.
  2. If run with any arguments, the shell shall print an error message to stderr explaining that it does not accept any command line arguments and exit with an error return code (1).
  3. The shell shall accept command input of arbitrary length (meaning you cannot set a hard limit on command length).
  4. The shell shall parse command-line arguments from the user's input and pass them along to the program that the user requests to be started. The command to be called will be specified either as an absolute path to a program binary (like /bin/ls), as the name of a program that resides in a directory in the user's $PATH environment variable (like gcc), or as a relative path (for instance, if we are in the /usr directory, we could type bin/gcc as a command to run /usr/bin/gcc). In addition, your argument parsing code must properly handle escape sequences and quoting for all arguments you have parsed (including the program name). That is to say that the input /bin/echo Hello\nWorld should be parsed into two pieces: the program name /bin/echo and one argument to that program containing the string "Hello World" with an actual newline character in place of the space (and no quotes). If the command entered has an invalid escape sequence, has unmatched quotes, or triggers some other error condition that the unescape function detects, ignore the whole command input and prompt again.
  5. The shell shall support a built-in exit command. This command shall accept zero or one arguments. If provided with zero arguments, the shell shall exit with a normal exit status (that is to say, it will exit with a status of 0). If provided with one argument, it shall attempt to parse that argument as an integer. If this parsing fails, the command must be ignored and the shell must prompt for another command as normal. If the parsing succeeds, the shell shall exit with a status of whatever integer the argument parses as. In either case of the shell exiting, it MUST clean up all memory it has allocated before exiting, along with ensuring that any child processes it has created have exited.
  6. The shell shall not leak memory: anything it allocates must be freed once it is no longer needed. The valgrind program can be your friend while debugging this program (unlike projects that are done in the kernel). We will also be using valgrind to test your implementation.
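
Requirement 5 asks the shell to decide whether the argument to exit is an integer. One possible way to make that decision is sketched below with strtol. The helper name parse_exit_status and its return convention are assumptions made for this illustration, not part of the provided template.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse the single argument of the built-in exit.
 * Returns 0 and stores the value in *status on success, or -1 if the
 * text is not a complete integer that fits in an int. */
int parse_exit_status(const char *arg, int *status)
{
    char *end = NULL;
    long value;

    if (arg == NULL || *arg == '\0')
        return -1;

    errno = 0;
    value = strtol(arg, &end, 10);

    /* No digits at all ("abc") or trailing junk ("12abc"). */
    if (end == arg || *end != '\0')
        return -1;

    /* Out of range for long, or for int on platforms where they differ. */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -1;

    *status = (int)value;
    return 0;
}
```

If this helper returns -1, the shell would ignore the command and prompt again; on success it would free its memory and then exit with the parsed value. Keep in mind that the parent process only sees the low 8 bits of whatever status is passed to exit.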

You can find examples of multiple commands that may be used to test your code at the bottom of this page.

NOTE: This is NOT an exhaustive list of commands that your shell must support. They are only given as examples. Any program installed on the VM will be runnable with your shell.

You are not expected to support any of the following features:

Part 2

/proc is a virtual file system in Linux. Most of the files appear to have zero length; however, you can view them using the cat command or an editor and see that there is data in many of the files. The reason is that they are not real files, but pointers into the kernel, so they do not take up real space on disk. Most files are read-only, but some can be modified, which lets you change the kernel's characteristics on the fly.

Read the article at the following URL to learn more about the /proc virtual filesystem: https://www.linux.com/news/discover-possibilities-proc-directory/.

In part 2 of this assignment, you will explore some of the information available in the /proc filesystem. To create a process to use in this assignment, from another terminal run the top command in the background (type: "top &" without the quotes to do so) and use the pid from this process for your exploration.

You are required to add functionality to your shell program that reads information from the /proc filesystem and displays it on the normal stdout of the shell. This functionality takes the form of a built-in proc command, which shall accept a single command line argument that will be the name of the file from the /proc filesystem that the user wishes to read information from. Only a small subset of all of the files in /proc shall be available for use with this command, as detailed below.

Files from the /proc filesystem that your code shall support reading information from are as follows:

You may support more files in the /proc filesystem than those we have required here, but you MUST at least support the ones we have required.

The proc command shall be implemented by opening the file specified within the /proc filesystem, reading all input from the file, and printing all of that information to stdout, followed by a newline character. Your shell then shall prompt for a new command, as usual.

You are not required to parse the information you have read from the files in /proc at all prior to presenting it to the user. For instance, if you run the command proc cpuinfo, your output may look identical to running the command cat /proc/cpuinfo; however, you may not implement your code by actually running that command. You must implement the proc command by opening the appropriate file in the /proc filesystem, reading from it, and printing out its output to stdout.
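
As a rough illustration of the open, read, and print steps described above, here is a minimal sketch. The function name print_proc_file and its base parameter are assumptions made for this example (the shell would pass "/proc" as the base); checking the requested name against the supported files is left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy everything in the file base/name to out,
 * then write a newline. base would be "/proc" in the shell; it is a
 * parameter here only so the function can be tried on ordinary files.
 * Returns 0 on success, -1 on failure. */
int print_proc_file(const char *base, const char *name, FILE *out)
{
    char buf[4096];
    size_t n;
    size_t len = strlen(base) + 1 + strlen(name) + 1;
    char *path = malloc(len);
    FILE *in;

    if (path == NULL)
        return -1;
    snprintf(path, len, "%s/%s", base, name);   /* snprintf needs C99 */

    in = fopen(path, "r");
    free(path);
    if (in == NULL) {
        perror("proc");
        return -1;
    }

    /* Files in /proc report a size of zero, so read until EOF instead
     * of asking for the length up front. fread and fwrite also pass
     * embedded NUL bytes through unchanged. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        fwrite(buf, 1, n, out);

    fclose(in);
    fputc('\n', out);
    return 0;
}
```

With this sketch, handling proc cpuinfo would amount to calling print_proc_file("/proc", "cpuinfo", stdout). The path buffer is freed as soon as the file is open, so neither the success nor the failure path leaks it.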

To sum up what you are expected to implement in this project:

Dos and Don'ts

Dos

Here is a list of functions that you might find it worthwhile to take a look at from the C library. You don't necessarily have to use all of them, but you may find several of them useful for this assignment:

Additionally, you will want to make use of the utility functions that we have provided to you in the utils.c and utils.h files that are in your repository. Particularly useful in this code are the functions unescape (which removes escape sequences and quotes from strings) and first_unquoted_space which will tell you the location of the next space in the string that is not quoted or part of an escape sequence. You are not required to use this code in your shell if you would rather implement this part yourself (but, trust me, you do not want to write this yourself in all likelihood).

Don'ts

Your shell program is not allowed to use any external libraries other than the system’s C library. Do not try to use libraries like Readline. You will lose points for using external libraries to implement shell functionality! In addition, you are not allowed to use any of the following functions in the C library to implement your shell:

You are not allowed to implement any of your shell’s functionality by calling on another shell to do the work. You must do the argument parsing and calling of programs in your own code!
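
Calling a program from your own code usually comes down to the fork, exec, and wait pattern. The sketch below assumes that execvp and waitpid are acceptable for this assignment, so check it against the list of disallowed functions before borrowing from it. The helper name run_command is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the NULL-terminated argument
 * vector argv and wait for it to finish. Returns the child's exit
 * status, or -1 if the child could not be created or did not exit
 * normally (for example, it was killed by a signal). */
int run_command(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: execvp searches $PATH only when argv[0] contains no
         * slash, so absolute and relative paths are used as given. */
        execvp(argv[0], argv);
        perror(argv[0]);        /* reached only if the exec failed */
        _exit(127);
    }

    /* Parent: block until this particular child has finished. */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

No other shell is involved here: the argument vector built by your own parser goes straight to the kernel. One detail this sketch skips is that a child whose exec fails still holds copies of the shell's allocations, so a real implementation would probably free them before exiting to keep valgrind quiet.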

A header file is NOT a library. In order to add an external library, you have to link against it. To link with the threading library, for example, you would have to add -lpthread to your build command in the Makefile. As long as you are not adding an -lsomething in your build or copying code from an external library into your code, you should be OK.

#include <stdio.h> is not using an external library!

Submitting the project

When submitting your shell program, please be sure to include the source code of the shell program (in one or more C source code files), as well as a Makefile that can be used to build the shell (we have already given you this Makefile in your repository). Your shell must be able to be built and run on a VM as has been set up for this course in your projects. Also, you must include a README.md file describing your approach to each of the requirements outlined above. Additionally, your program must be compiled to a binary called simple_shell with the Makefile you provide (again, the Makefile we have provided does this).

When you wish to commit files to your repository, you must add them with the git add command, commit them with the git commit command, and push them to GitHub with the git push origin main command. You may add/commit/push the same files many times over the course of the project (in fact, you will have to do so multiple times to avoid losing points on the assignment). For instance, if I were to modify the main.c file that is included in the repository as we gave it to you, I would run the following commands from the directory that the repository was cloned into:

git add src/main.c
git commit
# Give a good commit message explaining what was changed
git push origin main

You must not submit any build products to your repository. That means you must never run git add build from the root of your repository, or any other command that would add build products to your repository. Also, do not add swap files created by editors or other files of that sort.

Hints

The code that is provided to you is very useful. It is highly suggested that you use it in your shell.

The first_unquoted_space function provided can greatly ease the work of parsing a string into arguments. For instance, on the input /bin/echo "Hello World", the function would return 9, which is the index of the first space. If run on the remainder of the string after that space, it would return -1, telling you that there are no further spaces in the string that are not quoted.
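
To make the index arithmetic above concrete, here is a simplified stand-in that scans for an unquoted, unescaped space and a small loop that walks a command line with it. demo_unquoted_space is written only for this illustration: it is not the provided function, its int return type is an assumption, and it may differ from first_unquoted_space in edge cases.

```c
/* Simplified stand-in written for this illustration; it is NOT the
 * provided first_unquoted_space. Returns the index of the first space
 * that is outside quotes and not preceded by a backslash, or -1 if
 * there is no such space. */
int demo_unquoted_space(const char *s)
{
    char quote = '\0';          /* active quote character, if any */
    int i;

    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] == '\\' && s[i + 1] != '\0') {
            i++;                /* skip the escaped character */
        } else if (quote != '\0') {
            if (s[i] == quote)
                quote = '\0';   /* closing quote */
        } else if (s[i] == '"' || s[i] == '\'') {
            quote = s[i];       /* opening quote */
        } else if (s[i] == ' ') {
            return i;
        }
    }
    return -1;
}

/* Count the arguments in a command line by hopping from one unquoted
 * space to the next. A real parser would copy out each piece instead
 * of only counting it. */
int demo_count_args(const char *line)
{
    int count = 0;
    int off;

    while (*line != '\0') {
        off = demo_unquoted_space(line);
        if (off != 0)
            count++;            /* a non-empty piece precedes the space */
        if (off < 0)
            break;              /* that piece was the last one */
        line += off + 1;        /* continue just past the space */
    }
    return count;
}
```

The same loop shape should carry over to the provided function: each piece between two returned positions is one argument, and each piece is what you would hand to unescape.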

The unescape function allocates memory. If it returns non-NULL, you must free the value that it returns when you no longer need it. Additionally, the second argument to unescape should probably always be stderr.

Yes, you can and will lose points on this assignment for memory leaks. A shell is intended to be a long-running program and thus it is very important not to leak memory. Also, this is meant to provide practice for your later projects in the course, where memory leaks can be very problematic.
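
The input loop is one place where leaks tend to creep in, since it runs for the whole life of the shell. The sketch below assumes POSIX getline is available and acceptable for this assignment; it reads lines of any length while reusing a single buffer. The names command_loop and print_length are hypothetical, and printing the prompt is left out.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes getline */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical sketch: read commands from in until EOF, reusing one
 * getline buffer. handle is a placeholder for whatever parses and
 * runs a command. Returns the number of lines read. */
int command_loop(FILE *in, void (*handle)(const char *line))
{
    char *line = NULL;      /* getline allocates and grows this */
    size_t cap = 0;
    ssize_t len;
    int count = 0;

    while ((len = getline(&line, &cap, in)) != -1) {
        if (len > 0 && line[len - 1] == '\n')
            line[len - 1] = '\0';       /* strip the newline */
        handle(line);
        count++;
    }

    free(line);             /* one free covers every iteration */
    return count;
}

/* Example handler that only reports the length of each command. */
void print_length(const char *line)
{
    printf("%lu characters\n", (unsigned long)strlen(line));
}
```

A call such as command_loop(stdin, print_length) would exercise it. Because getline grows the buffer as needed, there is no fixed limit on command length. Note that a built-in exit that leaves the program from inside the handler would bypass the final free, so the buffer must be reachable from wherever the shell does its cleanup.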

Examples

Here are a few commands that you can use and expand on for testing your shell. Please note that this is obviously not an exhaustive list and does not test every edge case.

ls
/bin/ls -la
ps -el
proc 1/status
echo \x48\151\x20\157\165\164\040\x74\x68\x65\x72\x65\041
echo Goodbye, \'World\'\a
exit 0

Here is an example of what the output might look like on your terminal from running those commands (lines starting with $ show what the user has input at the prompt):

$ ls
CMakeLists.txt  Makefile  README.md  src
$ /bin/ls -la
total 60
drwxr-xr-x 6 lj lj  4096 Feb 13 17:34 .
drwxr-xr-x 3 lj lj  4096 Feb 13 17:13 ..
-rw-r--r-- 1 lj lj 15314 Feb 13 17:13 .clang-format
-rw-r--r-- 1 lj lj   958 Feb 13 17:30 CMakeLists.txt
-rw-r--r-- 1 lj lj   293 Feb 13 17:13 .editorconfig
drwxr-xr-x 8 lj lj  4096 Feb 13 17:35 .git
-rw-r--r-- 1 lj lj    19 Feb 13 17:13 .gitignore
drwxr-xr-x 3 lj lj  4096 Feb 13 17:13 .idea
-rw-r--r-- 1 lj lj  1236 Feb 13 17:31 Makefile
-rw-r--r-- 1 lj lj  2829 Feb 13 17:13 README.md
drwxr-xr-x 2 lj lj  4096 Feb 13 17:34 src
drwxr-xr-x 2 lj lj  4096 Feb 13 17:13 .vscode
$ ps -el
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
4 S     0     1     0  0  80   0 -   222 -      ?        00:00:00 init
5 S     0    99     1  0  80   0 -   222 -      ?        00:00:00 init
1 S     0   100    99  0  80   0 -   222 -      ?        00:00:00 init
4 S  1000   101   100  0  80   0 - 190697 futex_ pts/0   00:00:02 docker
1 Z     0   102    99  0  80   0 -     0 -      ?        00:00:00 init <defunct>
1 S     0   113    99  0  80   0 -   222 -      ?        00:00:00 init
4 S     0   114   113  0  80   0 - 329161 -     pts/1    00:00:00 docker-desktop-
5 S     0   140     1  0  80   0 -   222 -      ?        00:00:00 init
1 S     0   141   140  0  80   0 -   222 -      ?        00:00:01 init
4 S  1000   142   141  0  80   0 -  1780 do_wai pts/2    00:00:00 bash
0 S  1000   299   296  0  80   0 -  2753 -      pts/3    00:00:00 top
0 R  1000   350   142  0  80   0 -  2636 -      pts/2    00:00:00 ps
$ proc 299/environ
SHELL=/bin/bashWSL_DISTRO_NAME=DebianWT_SESSION=7db2b336-f609-4f16-b8d3-97cf50e3b759
NAME=akagiPWD=/mnt/c/Users/Lawrence SebaldLOGNAME=ljHOME=/home/ljLANG=en_US.UTF-8TER
TERM=xterm-256colorUSER=ljPATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sb
in:/bin:/usr/games:/usr/local/games=/usr/bin/top
$ echo \x48\151\x20\157\165\164\040\x74\x68\x65\x72\x65\041
Hi out there!
$ echo Goodbye, \'World\'\a
Goodbye, 'World'
$ exit 0