Creating a UNIX shell
Due by 11:59 PM on Sunday, Sep 27
Changelog
- September 13, 2020: Added clarification for the Makefile
- September 11, 2020: Added examples at the end of the page
- September 11, 2020: Added clarification for getenv
Introduction
In this assignment, you will be producing a simple *nix shell program. This assignment only requires a few very basic features of a shell, and leaves out much of the functionality that more advanced shells such as Bash include. That is not to say that this assignment will be easy — this is a 400-level course, after all. There are still several parts of this assignment that could trip you up, especially if you are not comfortable with lower-level C programming.
This project is designed to give you a warm-up with C, including topics that will be extremely important in your later programming projects. The assignment is to be completed entirely in user space: it does not touch the Linux kernel source code and does not involve recompiling the kernel. You may complete this assignment outside of your VM for the class if you wish; however, the TAs will use a setup like your VM to grade it, so you should compile and run your submission in your VM at least once to ensure it works before submitting it.
What is a shell?
At its core, a shell is a piece of software that gets a string like ls -la from the user, parses it, then attempts to execute it. ls, for instance, is an actual program located in /bin. When you type ls in your shell, the shell finds that program and executes it. The shell itself does not execute any logic that says "for each file in directory: print filename" or anything like that; it just invokes ls to do its own thing. Yes, modern shells have a ton of extra bells and whistles, but you should not worry about them in this assignment. You do not have to create any GUIs, windows, or anything of the sort.
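The "find the program and execute it" step described above can be sketched roughly as follows. This is a minimal illustration, not a reference solution: the helper name run_program is hypothetical, and it assumes the arguments have already been split into a NULL-terminated array. It relies on execvp, which searches $PATH when the program name contains no slash.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given NULL-terminated argument
 * array. Returns the child's exit status, or -1 if the child could not be
 * created, could not be waited for, or did not exit normally. */
int run_program(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: replace this process image with the requested program.
         * execvp searches $PATH when argv[0] contains no slash. */
        execvp(argv[0], argv);
        /* Only reached if the exec call failed. */
        perror(argv[0]);
        _exit(127);
    }

    /* Parent: block until the child finishes before prompting again. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling run_program with the array {"ls", "-la", NULL} would run /bin/ls -la and return once it finishes. The exit code 127 for a failed exec is a common shell convention, not something this assignment specifies. A full solution may also need to release any memory the child inherited before it exits.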
Requirements
For this assignment, you will only need to support a few very basic features of a full-fledged *nix shell. Specifically, you will need to have support for all of the following:
- If run with no arguments, the shell shall present the user with a prompt at which they can enter commands. Upon the completion of a command, it should prompt the user for a new command.
- If run with any arguments, the shell shall print an error message to stderr explaining that it does not accept any command line arguments and exit with an error return code (1).
- The shell shall accept command input of arbitrary length (meaning you cannot set a hard limit on command length).
- Parse command-line arguments from the user's input and pass them along to the program that the user requests to be started. The command to be called will be specified either as an absolute path to a program binary (like /bin/ls), as the name of a program that resides in a directory in the user's $PATH environment variable (like gcc), or as a relative path (for instance, if we are in the /usr directory, we could type bin/gcc as a command to run /usr/bin/gcc). In addition, your argument parsing code must properly handle escape sequences and quoting. That is to say that the input /bin/echo Hello\nWorld should be parsed into two pieces: the program name /bin/echo and one argument to that program containing the string Hello World with an actual newline character in place of the space (and no quotes).
- The shell shall support reading environment variables with a built-in getenv command. This command shall accept a single command line argument, which shall be the name of the environment variable that the user wishes to read. If more than one argument is provided, the command shall print an error message to stderr. Similarly, if no arguments are provided, you should print an error to stderr. If a single argument is provided that names an existing environment variable, the content of that variable shall be printed on a new line on stdout in the shell before returning to the normal prompt. If an argument is provided that does not name an existing environment variable, then a blank line shall be printed to stdout before returning to the normal prompt.
- The shell shall support setting environment variables with a built-in setenv command. This command shall accept two command line arguments. The first argument shall be the name of the variable to set and the second shall be the value to set that variable to. The second argument shall properly handle escape sequences and quoting as needed. No escape parsing shall be done on the first argument. If a number of arguments other than two is provided to the setenv command, an error shall be printed on stderr. Otherwise, the command shall produce no output and continue with normal command parsing.
- The shell shall support a built-in exit command. This command shall accept zero or one arguments. If provided with zero arguments, the shell shall exit with a normal exit status (that is to say, it shall exit with a status of 0). If provided with one argument, it shall attempt to parse that argument as an integer. If this parsing fails, the command shall be ignored and the shell shall prompt for another command as normal. If the parsing succeeds, the shell shall exit with a status of whatever integer the argument parses as. Whenever the shell exits, it MUST clean up all memory it has allocated before exiting, and ensure that any child processes it has created have exited.
- The shell shall not leak memory: it must free each allocation once it is done with it. The valgrind program can be your friend while debugging this program (unlike projects that are done in the kernel). We will also be using valgrind to test your implementation.
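One way to meet the arbitrary-length input requirement is to read one character at a time and grow a heap buffer as needed. The sketch below is only an illustration under stated assumptions: the name read_line, the starting capacity of 64, and the doubling strategy are all choices made here, not part of the assignment.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line from `in`, growing the buffer as needed.
 * Returns a malloc'd string without the trailing newline (the caller must
 * free it), or NULL on end-of-file with nothing read or on allocation
 * failure. */
char *read_line(FILE *in)
{
    assert(in != NULL);

    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int c;
    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (len + 1 >= cap) {
            /* Double the capacity; keep the old pointer until realloc
             * succeeds so it can still be freed on failure. */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {
        /* Nothing left to read: signal end of input. */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Because every returned line is heap-allocated, the main loop would need to free it after each command so that valgrind reports no leaks.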
You are not expected to support any of the following features:
- Scripting control features (like if statements or loops)
- Use of environment variables in commands (other than getting and setting them as described above)
- Support for pipes (including stdin/stdout redirection)
- Built-in functionality that is often part of a *nix shell (such as implementations of common utilities like cd), other than what has been outlined above
- The ability to change directories or anything else of the like
- Running programs in the background or resuming backgrounded programs
To sum up what you are expected to implement in this project:
- Present the user with some sort of prompt at which the user may enter a command to execute
- Parse out the program the user is attempting to call from its arguments and build an appropriate argument array which can be used to execute the program
- Determine if the program specified is a built-in (getenv, setenv, or exit) and handle those functions without creating a new process or attempting to execute another program
- If the program specified is not a built in, your shell must create a new process to execute the new program in, and pass in the correct arguments to one of the exec family of functions to execute the program with the arguments provided. Your shell then must wait for the newly created process to finish executing. Your shell must also handle the case in which a program cannot be executed properly and print out an appropriate error message on the stdout I/O stream
- Once the specified built-in or program has been executed (or failed executing), your shell should prompt the user for another command to run (unless the shell has exited from the exit built-in command)
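The built-in handling summarized above might look something like the following sketch. All function names here are hypothetical, and it assumes the input has already been split and unescaped into an argv-style array whose first entry is the command name. Whether an argument such as 12abc should count as a failed integer parse is an interpretation made here, not something the requirements spell out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if `name` is one of the three built-ins, 0 otherwise. */
int is_builtin(const char *name)
{
    return strcmp(name, "getenv") == 0 ||
           strcmp(name, "setenv") == 0 ||
           strcmp(name, "exit") == 0;
}

/* Returns 1 and stores the value in *out if `arg` is entirely a decimal
 * integer that fits in an int; returns 0 otherwise. */
int parse_exit_status(const char *arg, int *out)
{
    assert(arg != NULL && out != NULL);

    char *end;
    errno = 0;
    long value = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || errno == ERANGE ||
        value < INT_MIN || value > INT_MAX)
        return 0;
    *out = (int)value;
    return 1;
}

/* getenv built-in: argv[0] is "getenv"; argc counts the entries in argv.
 * Returns 0 on success, 1 on a usage error. */
int builtin_getenv(int argc, char *const argv[])
{
    if (argc != 2) {
        fprintf(stderr, "getenv: expected exactly one argument\n");
        return 1;
    }
    const char *value = getenv(argv[1]);
    /* An unset variable is printed as a blank line. */
    printf("%s\n", value != NULL ? value : "");
    return 0;
}

/* setenv built-in: argv[1] is the name, argv[2] the (already unescaped)
 * value. Returns 0 on success, 1 on error. */
int builtin_setenv(int argc, char *const argv[])
{
    if (argc != 3) {
        fprintf(stderr, "setenv: expected exactly two arguments\n");
        return 1;
    }
    if (setenv(argv[1], argv[2], 1) != 0) {
        perror("setenv");
        return 1;
    }
    return 0;
}
```

A dispatcher would check is_builtin first and only fork a new process when it returns 0. For exit, the shell would call parse_exit_status, free everything it has allocated, and then call exit with the parsed value.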
Dos and Don’ts
Dos
Here is a list of functions that are worth taking a look at. You don’t necessarily have to use all of them, depending on how you implement your shell:
- fgetc
- malloc
- realloc
- free
- strtok_r
- strchr
- isspace
- fork
- exec (this is a whole family of functions)
- fprintf
- getenv
- setenv
Additionally, you will probably want to use some functions that we have provided for this assignment. There are two files: utils.c and utils.h. Particularly useful in this code are the functions unescape (which removes escape sequences and quotes from strings) and first_unquoted_space (which tells you the location of the next space in the string that is not quoted or part of an escape sequence). You are not required to use this code in your shell if you would rather implement this part yourself. If you do use this code in your shell, be sure to add the files to your git repository, just like any other source code you write.
Don’ts
Your shell program is not allowed to use any external libraries other than the system’s C library. Do not try to use libraries like Readline. You will lose points for using external libraries to implement shell functionality! You are not allowed to use any of the following functions in the C library to implement your shell:
- system (insecure and can lead to major problems)
- scanf (this one is largely to save you trouble)
- fscanf (ditto)
- popen (there is no reason you should need this, since pipes aren't to be supported)
- readline (in case this wasn't obvious from the above ban on external libraries)
You are not allowed to implement any of your shell’s functionality by calling on another shell to do the work. You must do the argument parsing and calling of programs in your own code!
Submission
When submitting your shell program, please be sure to include the source code of the shell program (in one or more C source code files), as well as a Makefile that can be used to build the shell. Your shell must build and run on a VM set up as described for this course. You should also include a README file describing your approach to each of the requirements outlined above. Additionally, the Makefile you provide must compile your program to a binary called simple_shell.
If you would like a template for use as a Makefile for your shell, we have provided one here: Makefile.
To submit your project, you must first accept the assignment on GitHub. The link to do so is posted on the course Piazza. Once you have done that, make sure that your project files (any source files and your Makefile) are in their own directory, then run the following commands in that directory (substituting the list of files you need to commit for your_files_go_here and your GitHub username for username, of course):
git init
You only do this once, not every time.
git remote add origin git@github.com:umbc-cmsc421-fa2020/project1-username.git
You also do this only once.
git add your_files_go_here
git commit
git push origin master
Hints
The code that is provided to you is very useful. It is highly suggested that you use it in your shell.
The first_unquoted_space function provided can greatly ease the work of parsing a string into arguments. For instance, on the input /bin/echo "Hello World", the function would return 9, which is the index of the first space. If run on the remainder of the string after that space, it would return -1, telling you that there are no further spaces in the string that are not quoted.
The unescape function allocates memory. If it returns non-NULL, you must free the value that it returns when you no longer need it. Additionally, the second argument to unescape should probably always be stderr.
Yes, you can and will lose points on this assignment for memory leaks. A shell is intended to be a long-running program and thus it is very important not to leak memory. Also, this is meant to provide practice for your later projects in the course, where memory leaks can be very problematic.
Examples
Here are a few commands that you can use and expand on. Please note that these do not test every edge case:
ls
/bin/ls -la
setenv MESSAGE "Hello, world"
getenv MESSAGE
setenv MESSAGE Hello,\ \"Lawrence\",\ How" are you today?"
getenv MESSAGE
getenv PWD
echo \x48\151\x20\157\165\164\040\x74\x68\x65\x72\x65\041
echo Goodbye, \'World\'\a
And these are the expected results of running those in simple_shell. Obviously, there will be some differences when you run them on your machines, such as usernames and directories: