CMSC-421 Operating Systems

CMSC-421 Operating Systems

Spring 98

Project II

Adapted from Dr. Howard Motteler
Assigned:

Assigned: 8 April 98 Due: 11 May 98

Purpose

To gain experience with process management and data structures.

Background

A time sharing system, such as UNIXX has multiple users running multiple processes concurrently. A simplified process control system can be seen as a state diagram with three states. A process is either running, ready to run, or waiting for an I/O request to be filled. This makes sense. Even though there are multiple jobs, a CPU can only ever be doing one thing at one time, so it can only run one process. Since it can only run one and this is a time sharing system, there must be a number of processes that are ready to run. Since we want to utilize the CPU to the fullest extent we can and it takes time to fulfill an I/O request, we want to put currently running processes that request I/O to sleep until the I/O completes. These states are usually called "running", "ready to run", and "sleeping on I/O". The simple state diagram is shown below.

It is important to note that process scheduling gets done between chunks of "real work". The CPU happily chugs away running a single process for a specific amount of time. This is "real work". The system keeps time with a hardware clock that interrupts the CPU at a fixed, hardware-dependent rate, typically between 50 and 100 times a second. Each occurrence of a clock interrupt is called a clock tick. The system allows a process to execute on the specific number of clock ticks before it gets moved to another state. This number is called the system’s timeslice. After each clock tick, the CPU begins to execute operating system code, part of which is job scheduling. In the diagram above each circle contains none, one, or more processes and the processes are transitioned between circles during the operating system code that does job scheduling. The transitions are explained as follows:

Fork (new job): A new job is created by a process calling the fork() system call and is put in the ready to run state.

Schedule Process: A lucky process has been chosen to execute for the next timeslice.

Preempt: An unlucky process has run for it’s entire allocated timeslice, so it is put back into the ready to run state.

Sleep: The currently running process needs to do I/O before it can get any more work done. So the process is put to sleep until the I/O subsystem can retrieve the requested data chunk from an external source. (disk drive, keyboard, etc)

Wakeup: The data chunk the process was waiting for has arrived; the process can now continue to work, so it is moved into the ready to run state.

Exit: The process has finished its job.

Project

Your project is to implement this simplified process management scheme using 2 different scheduling techniques: prioritized round robin and shortest remaining time. You also need to keep track of various statistical data for each scheduling technique and job: total time each process was in each state, number of jobs run, total elapsed time, total elapsed time per job (throughput), and longest and shortest time taken for any job in the system.

The system will have the following behavior. You must have a way to keep time intervals (clock ticks) during the execution of the system. A simple approach to this problem might be to use a loop and, for each iteration, increment the clock. Seven should be done in the scheduling loop during each cycle, the first six of which pretty much map directly into the transitions of the diagram above. Your main loop will do something like the following:

If there are new jobs, then put them in "ready to run" state.
If the job currently running has used up its timeslice, then put it back into the "ready to run" state.
If the job currently running has requested I/O, then move it to "sleeping on I/O" state.
If a job was just swapped out, then choose a new job to swap in.
For all jobs waiting for I/O, check if the request has been satisfied.
If a job’s I/O has completed, move it to "ready to run" state.
Keep track of statistics.

For this simulation, jobs will do no "real" work; i.e., your simulation will implement what happens when the CPU invokes the operating system process scheduler. You can think of this project as a function call inside the operating system. A function call that updates all job states and sets up the next job to run, so that when the function call completes, the CPU has the next job and can begin to run it.

Since our simulated jobs are doing no "real work", we have a little problem. If they are not actually executing code, they can not request I/O. If they cannot request I/O, we can’t transition from running to sleeping on I/O. And since there is no real I/O going on, I/O can not complete, so we cannot transition from sleeping on I/O to ready to run. This is a problem. But we have a solution. When a job is in the running state you will have a function that randomly decides whether the job has requested I/O. When a job is in the "waiting for I/O" state, you will have a function that randomly decides whether or not I/O has completed for that job. These functions are given below.

#define CHANCE_OF_IO_REQUEST 10

#define CHANCE_0F_IO_COMPLETE 4

int IO_request() {

if( rand() % CHANCE_OF_IO_REQUEST == 0 )

return 1;

else

return 0;

}

int IO_complete() {

if( rand() % CHANCE_OF_IO_COMPLETE == 0 )

return 1;

else

return 0;

}

(Note: In order to use the rand() function, your simulator needs to seed the random number generator. Also, you must use #include <stdlib.h>)

Jobs in this simulation will be simulated using a job data file that contains 4 columns. The format is:

Process Id: Arrival time: Service time: Priority

For example the data file:

123:0:10:1

124:1:20:0

describes a simulation that has two jobs. The first has a PID of 123, an arrival time of zero (when the simulation starts), needs to do 10 clock ticks of work, and has a priority of 1. The second job has a PID of 124, arrives at clock time 1, needs to do 20 clock ticks of work, and has a priority of 0.

The behavior of all jobs in your simulator will handle is described in the data file attached. Associated with every job is a unique process id (PID), the arrival time for the given job, a service time ( how long a job has until completion, i.e., how long the jobs "real work" will take), and a priority rating (for this simulation there will be five priorities, numbered 0-4, with 0 being the highest priority). Please do not change this data file, your grade will be based on specific output generated by the given job attributes.

Implementation Structure

Since jobs can have differing priorities (0, 1, 2, 3, 4,5 ), it is suggested use five queues, numbered by priority. The jobs with priority 0 have higher precedence than those with priority 1, same with the queues. That means in order to start a job in queue Queue 1, there shouldn’t be any jobs currently in Queue 0. Keep in mind that this should be a time sharing system. This means that after a certain time, no matter if the current job has finished or not, that job should be put back on the line and the job next in line should begin. The timeslice for this project is 3 clock ticks. Also, for each job that the scheduler works on, that job’s time remaining should decrease by one. The scheduler should never be "idle" if there is an existing job in line. Whe jobs are completed, update any statistics and get the next one immediately. If a job is swapped back to "ready to run", it should be ut on the queue assigned to its priority.

Your project’s main function could look something like this:

Clock = 0;

Seed the random number generator

Add new incoming jobs to the ready to run state

While (there are jobs in the system ) {

current_job = choose job to execute

While (1) { /* this loop is the main process management loop */

Add new incoming jobs to the ready to run state

While there are jobs waiting for I/O

Status = IO_complete() /* check if this job’s I/O is complete */

If status == 1

Then put the job whose I/O has completed on the
ready to run queue

If this job has finished its work (if time remaining is 1 )

Then mark current_job as swapped out (job complete, exit)

If there are now higher priority jobs on the ready to run state

Then mark current_job as swapped out (preempted)

/* check for I/O request from current_job */

if the job is not complete

then status = IO_request()

if status == 1 /* need to do I/O */

then mar current_job as swapped out (sleeping on I/O)

else if this job has been on the CPU for an entire timeslice

then mark the job as swapped out (end of timeslice )
and move it to the ready to run state

do bookkeeping and statistics

clock++

if current_job was swapped out

then move the job to the appropriate queue and break from this loop

}

Note: The choose job to execute line will be your 2 different scheduling functions. (If the inner loop above is well thought out, it should not have to change when you scheduling algorithm changes.) Also make sure you account for the case that no processes will run. In this case, run an idle process ( no statistics need to be generated for this pseudo process.)

Implementation Notes

The project needs to be written in C.

The timeslice is 3 clock ticks.

The random seed is 1.

Your program must specify which scheduling algorithm to use on the command line: "-P" is prioritized round robin, and "-L" is least time remaining (shortest job first).

The simulation needs to be deterministic so the TA can grade it without having to check everyone’s output by hand. In order for this to happen, the calls to rand() inside IO_request and IO_complete need to be called in the same order in everyone’s implementation. The order of the calls is given in the above pseudocode for your main. I.e., for each iteration of the inner loop you must first call IO_complete, in a FIFO fashion, for all currently waiting jobs in the I/O queue, then call IO_request for the current job. Your implementation must follow this order. Also everyone’s implementation needs to call srand() before the main loop is started. To be deterministic everyone will use "1" as the seed to srand().

The output should be standard and follow the format given below:

| Total time | Total time | Total time

Job# | in ready to run | in sleeping on | in system

| state | I/O state |

========+=================+=================+==============

pid0 | XXX | XXX | XXX

pid1 | XXX | XXX | XXX

pid2 | XXX | XXX | XXX

pid3 | XXX | XXX | XXX

…

pid0 | XXX | XXX | XXX

Total simulation run time: XXX

Total number of jobs: XXX

Shortest job completion time: XXX

Longest job completion time: XXX

Average job completion time: XXX

Your program is required to produce verbose output if given the "-v" flag on the command line. The verbose output will be written to stderr every clock tick and consist of the following one line format.

#while the system has jobs

For example, the job file above would produce the following verbose output to stderr:

0:123:9:false:none:preempted

1:124:19:false:none:still running

2:124:18:false:none:still running

3:124:17:false:none:still running

4:124:16:false:true:none:sleeping

5:123:8:false:none:still running

6:123:7:false:0:preempted

7:124:15:true:none:sleeping

…

For example, assume the following things happen during clock tick 45: job 123 is running, it does not request I/O, it has 15 clock ticks until it exits, it has not used its entire timeslice, and the I/O completes for jobs 234, 345, and 456. The verbose output for that cycle would be:

45:123:15:false:234,345,456:still running

If no I/O had been completed, it would be

45:123:15:false:none:still running

Your simulator must read the input file from stdin and write its output to stdout. So the program will be run as follows:

% my_simulator < input_file.txt > output_file.txt

It would be faster and easier to test your simulator, if all the inputs can be changed from the command line, i.e.,

% my_simulator –v –ts 3 –s 1 –r 10 –c 4 < input_file.txt > output_file.txt

Where ts is the time slice, s is seed value, r is I/O request chance, c is I/O complete chance, and verbose output is on. But by no means is this required.

If you are interested in the browsing the source code for the LINUX process scheduler, it can be found here in /kernel/sched.c.

Good luck!!

Submitting the Project

To be announced.

Due Date

The project is due on Monday, 11 May. Projects must be submitted before midnight of the due date. (That is, the project must be submitted by 11:59:59 p.m. on 11 May.)