UMBC CSEE

Computer Science & Electrical Engineering
University of Maryland Baltimore County
Baltimore Maryland 21250 USA
voice: 410-455-3500 fax: -3969


Project 1: Implementing a Simple Shell

Assigned: 2 June; Due: 28 Oct

Project Goals

The goals of this project are to learn the basics of how a "shell" or command interpreter works, and to gain experience programming with UNIX processes and system calls.

The Project

You are to design and implement a simple shell, "mysh". Some of the commands of this shell are to be internal and some are to be external. The internal commands for this project will be the command to list the directory (the UNIX "ls" command), the change directory command ("cd"), and the "exit" command.

The following is a prototype of what your main might look like:

while ( TRUE )                              /* repeat forever           */
{
    read_command( command, parameters );    /* read input from terminal */

    if ( fork( ) != 0 )                     /* fork off child process   */
    {
        /* Parent code */
        waitpid( -1, &status, 0 );          /* wait for child to exit   */
    }
    else
    {
        /* Child code */
        execve( command, parameters, 0 );   /* execute the command      */
    }
}
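
The prototype forks for every command, but "cd" and "exit" have to run in the shell process itself: a child that calls chdir() changes only its own working directory. The following is a minimal sketch of checking for internal commands before forking. The name try_builtin and the fallback to $HOME are assumptions for illustration, not part of the assignment, and "ls" would be dispatched the same way.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * try_builtin is a hypothetical helper. args is a NULL-terminated
 * argument array. Returns 1 if the command was handled here in the
 * shell process, 0 if the caller should fork and exec it.
 */
int try_builtin(char **args)
{
    if (args[0] == NULL)
        return 1;                        /* empty line: nothing to do */

    if (strcmp(args[0], "exit") == 0)
        exit(0);                         /* terminate the shell itself */

    if (strcmp(args[0], "cd") == 0) {
        /* Assumption: with no argument, fall back to $HOME. */
        const char *dir = args[1] ? args[1] : getenv("HOME");

        if (dir == NULL)
            fprintf(stderr, "mysh: cd: HOME not set\n");
        else if (chdir(dir) == -1)
            perror("mysh: cd");
        return 1;
    }
    return 0;                            /* not an internal command */
}
```

In the main loop, the call to fork() would then be made only when try_builtin() returns 0.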

Specifications

Your shell should be able to parse and execute command lines of the following form

cmd
cmd > file
cmd < file
cmd < file > file
cmd | cmd
cmd ; cmd
!!
!nr

where:

"file" is any valid UNIX filename.
"cmd" is a program name followed by zero or more arguments.
There will be no wildcards, no in-line command evaluation, and no macro substitutions, and commands and filenames will not include the symbols ">", "<", ";" or "|".
There can be any number of commands separated by the "|" or ";" tokens.
For the special case when "cmd" is "exit", your shell should terminate.
For the special case when "cmd" is "ls", "ls -i", "ls -l", or "ls -li", your shell will list the files in the directory given in the "file" argument. When no "file" is specified, you are to list the current working directory. For the -i option, you are to list the filename(s) and the inode number from the directory:
1267 foo
For the -l option, you are to display the filename(s) and the permissions of the file(s) in the format that is used on UNIX: -rwxrw-r--
-rwxrw-r-- foo
The combined version would be:
-rwxrw-r-- 1267 foo
You will find this information in the directory entry and the associated inode structure.
Your shell should also be able to parse and execute any of the command lines above ending with an "&", which puts the specified process or processes into the background. (With multiple commands, put only the last command into the background.)
You can assume that all command lines are of the form described here, and you do not have to check for any other sorts of input. In particular, you do not have to expand filename wildcards.
Use fork() and either execl(), execve() or execvp() to execute the commands. The main difference is that execvp() will search your PATH for the specified executable, and doesn't have an explicit environment argument. Additional information about these calls is available from the man pages.
Your shell should prompt the user for each command line with the string "mysh[n>". The number n is the number of the command (for the history version).
Your shell will allow two forms of the history command. The first is the double exclamation point ("bang-bang"), which repeats the last line of input that was typed in. The second is the exclamation point followed by a number ("bang-n"). The number identifies a command in the history buffer. You will be required to keep the last fifty commands in the buffer, and know the history number of each command.
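
One possible way to keep the last fifty commands is a circular buffer indexed by command number. This is only a sketch: the names hist_add, hist_get and hist_expand, the 1024-byte line limit, and numbering that starts at 1 are all assumptions.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HIST_SIZE    50
#define LINE_MAX_LEN 1024               /* assumed maximum line length */

static char hist[HIST_SIZE][LINE_MAX_LEN];
static int  hist_count = 0;             /* total commands recorded so far */

/* Record a line; the first line recorded is command number 1. */
void hist_add(const char *line)
{
    char *slot = hist[hist_count % HIST_SIZE];

    strncpy(slot, line, LINE_MAX_LEN - 1);
    slot[LINE_MAX_LEN - 1] = '\0';
    hist_count++;
}

/* Return command number n, or NULL if it is not in the buffer. */
const char *hist_get(int n)
{
    if (n < 1 || n > hist_count || n <= hist_count - HIST_SIZE)
        return NULL;
    return hist[(n - 1) % HIST_SIZE];
}

/* Expand "!!" or "!n"; any other line is returned unchanged. */
const char *hist_expand(const char *line)
{
    if (line[0] != '!')
        return line;
    if (line[1] == '!')
        return hist_get(hist_count);    /* most recent command */
    return hist_get(atoi(line + 1));
}
```

A NULL result from hist_expand() means the requested number is not in the buffer, so the shell can report an error instead of running anything.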
The final version you submit should not print any test messages or other extraneous jabber. You should print errors returned by execvp() or execve(); you can use strerror(errno) to make your error reporting more informative; errno is an external variable set to the most recent error, and strerror() translates this to a human-readable string. You may find the UNIX string manipulation library routines (such as strtok) convenient for parsing the command line; do a "man string" to learn more about these functions.
To redirect input from a file to a program, so the file appears as "standard input" to the program (e.g., as in the command "wc < test.dat"), you will have to manipulate file descriptors. To connect a file to the UNIX "standard input" (by convention file descriptor 0), you can use the dup2() system call, as in the following example.
```
if (( fd = open ( fname, O_RDONLY )) == -1 )
{
    fprintf( stderr, "mysh error: can't open %s\n", fname);
    exit(1);
}

dup2(fd, 0);
close(fd);

if (execvp(cmd, args) == -1)
{
    fprintf(stderr, "mysh error: %s\n", strerror(errno));
    exit(1);    /* exec failed: do not let the child fall back into the shell loop */
}
```
If we wanted to execute the command "wc -l", "cmd" in the above example would be the string "wc", and the string array "args" would have its first element pointing to "wc", its second element pointing to "-l", and its third element the null pointer 0, indicating the end of the arg array.
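
Output redirection ("cmd > file") follows the same pattern with file descriptor 1. The following sketch handles both directions in the child before the exec. The helper name run_redirected, the 0644 permissions, and the choice to truncate an existing output file are assumptions.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * run_redirected is a hypothetical helper. infile and outfile may be
 * NULL, meaning "leave that descriptor alone". Returns the child's
 * wait status, or -1 if fork() fails.
 */
int run_redirected(char **args, const char *infile, const char *outfile)
{
    int fd, status = 0;
    pid_t pid = fork();

    if (pid == -1) {
        perror("mysh: fork");
        return -1;
    }
    if (pid == 0) {                                  /* child */
        if (infile != NULL) {
            if ((fd = open(infile, O_RDONLY)) == -1) {
                perror(infile);
                _exit(1);
            }
            dup2(fd, 0);                             /* file becomes stdin */
            close(fd);
        }
        if (outfile != NULL) {
            /* Assumption: create the file, or truncate an existing one. */
            fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (fd == -1) {
                perror(outfile);
                _exit(1);
            }
            dup2(fd, 1);                             /* file becomes stdout */
            close(fd);
        }
        execvp(args[0], args);
        perror(args[0]);                             /* reached only on failure */
        _exit(127);
    }
    waitpid(pid, &status, 0);                        /* parent */
    return status;
}
```

For "cmd < file > file", both filename arguments would be non-NULL.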
The "&" function is easy to implement--without the "&", the shell waits for the child process; with the "&" it does not wait.
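
A sketch of that idea, with hypothetical helper names launch and reap_background. It waits on the specific child PID rather than on -1, so that a background child that happens to finish first is not mistaken for the foreground command. It also collects finished background children with WNOHANG so they do not linger as zombies.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * launch is a hypothetical helper. It forks, execs args, and returns
 * the child's PID (or -1 if fork fails). When background is zero it
 * waits for that child before returning.
 */
pid_t launch(char **args, int background)
{
    int status;
    pid_t pid = fork();

    if (pid == -1) {
        perror("mysh: fork");
        return -1;
    }
    if (pid == 0) {                     /* child */
        execvp(args[0], args);
        perror(args[0]);                /* reached only on failure */
        _exit(127);
    }
    if (!background)
        waitpid(pid, &status, 0);       /* foreground: block until it exits */
    return pid;
}

/* Collect any background children that have already finished. */
void reap_background(void)
{
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
}
```

Calling reap_background() once per prompt is one simple way to clean up.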
The pipe function "|" is a little more difficult, and can be implemented with either a temporary file, or preferably with a real UNIX pipe.
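
For the real-pipe approach, a minimal sketch for two commands might look like the following. The helper name run_pipe is hypothetical, and the sketch does not handle longer pipelines or redirection.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * run_pipe is a hypothetical helper for "left | right". It returns the
 * wait status of the right-hand command, or -1 on a setup error.
 */
int run_pipe(char **left, char **right)
{
    int fds[2], status = 0;
    pid_t lpid, rpid;

    if (pipe(fds) == -1) {
        perror("mysh: pipe");
        return -1;
    }

    lpid = fork();
    if (lpid == 0) {                    /* writer: stdout -> pipe */
        dup2(fds[1], 1);
        close(fds[0]);
        close(fds[1]);
        execvp(left[0], left);
        perror(left[0]);
        _exit(127);
    }

    rpid = (lpid == -1) ? -1 : fork();
    if (rpid == 0) {                    /* reader: stdin <- pipe */
        dup2(fds[0], 0);
        close(fds[0]);
        close(fds[1]);
        execvp(right[0], right);
        perror(right[0]);
        _exit(127);
    }

    /* The parent must close both ends, or the reader never sees EOF. */
    close(fds[0]);
    close(fds[1]);

    if (lpid > 0)
        waitpid(lpid, NULL, 0);
    if (rpid > 0)
        waitpid(rpid, &status, 0);
    return (lpid == -1 || rpid == -1) ? -1 : status;
}
```

Both children run at the same time, so no temporary file is needed.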
The ";" is used to separate commands on the command line. Process the first part independently and when it is finished, then process the second part.

Grading

Make sure you have read the general information on programming projects. Approximately 20% of your grade is for documentation, and the remainder is based on how well your project works. You should describe your design and implementation at the beginning of the project; this initial description is worth 15% of your grade. Describe any non-trivial data structures you have used, and briefly say how each relevant routine acts on those data structures. Do not simply echo the specifications given here.

Do not use the system() system call or invoke the UNIX shell to implement your shell!

You must do the project described here. Doing some other similar or dissimilar project, no matter how difficult or clever, may be worth 0 points. This is not a group project; please do your own work, and be careful about sharing your code. It is OK to discuss design issues, but in your documentation, you should give credit to your sources.

What to turn in

When your project is absolutely finished and you have completely and totally tested your work, you are to create a script file (the output is "typescript"), as in the following example (where "$" is the UNIX prompt):

$ script
$ cat mysh.c
$ cc -o mysh mysh.c
$ mysh
mysh  /* run examples to prove all internal commands and
         all forms given above will work. */
mysh exit
$ exit

Mail the file typescript to the TA for grading. If your code is in more than one file, you will have to "cat" each file, and then "cat" the makefile, if you are using one.

Hints and Tips

To get started, you should read the man pages for fork, execl, execve, execvp, dup, wait, pipe, getpid and any string functions you might want to use. ("man string" will bring up a summary page on the basic string functions.)
The functions execve and execvp take an array of strings as their arguments. This array is similar to the argv[] array passed to C programs. When you create this array, it is important that the end of the array is marked with a null pointer (not a null string!); for example, if there are k string pointers in your array args, you would set args[k] = NULL.
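
As an illustration, a simple command with no redirection or pipe symbols could be split into such an array with strtok(). The function name split_args and the limit of 64 arguments are assumptions.

```c
#include <string.h>

#define MAX_ARGS 64    /* assumed limit, not from the assignment */

/*
 * Split line in place on spaces, tabs, and newlines. args must have
 * room for MAX_ARGS pointers. Returns the number of arguments k and
 * sets args[k] to NULL. Note that strtok() modifies line.
 */
int split_args(char *line, char **args)
{
    int k = 0;
    char *tok = strtok(line, " \t\n");

    while (tok != NULL && k < MAX_ARGS - 1) {
        args[k++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    args[k] = NULL;    /* a null pointer, not an empty string */
    return k;
}
```

For the input "wc -l", this gives args[0] = "wc", args[1] = "-l", and args[2] = NULL, which is the layout execvp() expects.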

You will probably need the following "include" files.

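        #include <stdio.h>
        #include <unistd.h>
        #include <sys/types.h>
        #include <sys/stat.h>
        #include <sys/wait.h>   /* waitpid() */
        #include <fcntl.h>
        #include <dirent.h>     /* opendir(), readdir() for "ls" */
        #include <errno.h>      /* errno, used with strerror() */
        #include <string.h>
        #include <stdlib.h>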

You should develop your code in a series of stages:

write and test a parser to read the command lines;
get the simple commands to work;
get the history portion to work;
get "&" to work;
get I/O redirection to work;
get the cd command to work;
get pipes to work.

Pipes are probably the most difficult part of this project. Remember that the output from one command is the input to the next command. Because time is compressed for the summer semester, let me give you a hint. You will have to capture the output from the first command in a temporary file. You can store the file in the directory /tmp, which is for that purpose. However, you will have to make sure that your filename is unique: you might have a file in that directory left over from an earlier session, so you need a way to guarantee a unique filename. You can use your login ID plus the PID of the current process. Also remember to remove the file when it is no longer needed!
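
A small sketch of building such a filename follows. The function name tmp_name and the exact "/tmp/login.pid" layout are assumptions; the hint above only asks for login ID plus PID.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Build "/tmp/<login>.<pid>" in buf.
 * Returns 0 on success, -1 if buf is too small.
 */
int tmp_name(char *buf, size_t size, const char *login, pid_t pid)
{
    int n = snprintf(buf, size, "/tmp/%s.%ld", login, (long)pid);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The caller could pass the result of getlogin() (after checking it for NULL) and getpid(), and call unlink() on the name once the second command has finished reading the file.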
You should be able to do this project on any system that is vaguely POSIX compatible, including LINUX, SGIs and Sun workstations. However, test it on UMBC9, which is where it will be graded.
This does not need to be a long project; you might want to think carefully about your overall design.
