This algorithm is so complicated that if you were to implement it in real life it would be really slow. Even though the theoretical analysis gives it an O(n) running time, it will probably be faster to just use Quicksort.

The purpose of your project is to either support or refute Professor X's statement. To do so, you will implement and compare the following three methods for finding the median of a set of numbers:

- The deterministic linear time Select algorithm given in Chapter 10 of the textbook.
- The randomized selection algorithm given in Chapter 10 of the textbook.
- Using randomized Quicksort to first sort the set of numbers, then picking the middle element of the sorted array.
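To make the second and third approaches concrete, here is a minimal Python sketch. It is not the textbook's pseudocode: the names `randomized_select` and `median_by_sorting` are hypothetical, the partition is a plain Lomuto scheme, and the built-in `sorted` stands in for the randomized Quicksort you would write yourself. The median is taken to be the lower median, which is also an assumption.

```python
import random


def randomized_select(a, k):
    """Return the k-th smallest item (0-based) of list a.

    Rearranges a in place. Assumes 0 <= k < len(a).
    """
    lo, hi = 0, len(a) - 1
    while lo < hi:
        # Move a uniformly random pivot to the end of the active range.
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        pivot = a[hi]
        # Lomuto partition: items smaller than the pivot go to the front.
        store = lo
        for i in range(lo, hi):
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]
        # Keep only the side that contains position k.
        if k == store:
            return a[store]
        if k < store:
            hi = store - 1
        else:
            lo = store + 1
    return a[lo]


def median_by_sorting(a):
    """Baseline: sort a copy and pick the middle (lower median) item."""
    s = sorted(a)  # stand-in for your own randomized Quicksort
    return s[(len(s) - 1) // 2]
```

Both functions should agree on any input, which makes the sort-based version a convenient correctness check for the faster ones.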

In order for this to be a fair comparison, you must make each algorithm as fast as you can. The following is a minimal list of issues that you should address to make your implementation run as fast as possible:

- The Select algorithm in the textbook groups the items into segments of 5 items each. The number 5 was chosen purely for analysis purposes. Segments of 7, 9, 11, ... would also work. You should run experiments to determine the optimal segment size.
- The textbook does not describe exactly what happens when you recursively call Select to find the median of the medians (Step 3, page 190). You should **not** copy the medians of the segments into a new array. This is unnecessary and wastes a lot of time. Your implementation of Select should be flexible enough that you can use it to find the median of an array from one index to another considering only every k-th item.
- For small arrays it will be faster to use Insertion Sort to sort the items. You should run some experiments to find the optimal size to switch over to Insertion Sort and use this in your running time trials.
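The three issues above can be combined in one routine. The following Python sketch is one possible arrangement, not the textbook's: the names `select`, `insertion_sort`, `median`, `group`, and `cutoff` are hypothetical, the default `cutoff=16` is an arbitrary placeholder to be tuned by experiment, and leftover items that do not fill a whole segment are simply ignored when choosing the pivot. Each segment is sorted in place so that its median sits in the middle slot, which means the medians already form a strided view of the same array and nothing is copied.

```python
def insertion_sort(a, base, step, n):
    """Sort the n items a[base], a[base+step], ... in place."""
    for i in range(1, n):
        x = a[base + i * step]
        j = i - 1
        while j >= 0 and a[base + j * step] > x:
            a[base + (j + 1) * step] = a[base + j * step]
            j -= 1
        a[base + (j + 1) * step] = x


def select(a, base, step, n, k, group=5, cutoff=16):
    """Return the k-th smallest (0-based) of the n items a[base + i*step].

    Works in place on a strided view of a; no auxiliary array is built.
    """
    while True:
        # Small inputs: sort directly and read off the answer.
        if n <= max(cutoff, group):
            insertion_sort(a, base, step, n)
            return a[base + k * step]
        # Sort each full segment so its median lands in the middle slot.
        full = n // group
        for g in range(full):
            insertion_sort(a, base + g * group * step, step, group)
        # The medians now sit at stride group*step; recurse on that view.
        pivot = select(a, base + (group // 2) * step, group * step,
                       full, (full - 1) // 2, group, cutoff)
        # Three-way partition of the view around the pivot value.
        lt, i, gt = 0, 0, n - 1
        while i <= gt:
            pi = base + i * step
            x = a[pi]
            if x < pivot:
                pl = base + lt * step
                a[pl], a[pi] = a[pi], a[pl]
                lt += 1
                i += 1
            elif x > pivot:
                pg = base + gt * step
                a[pg], a[pi] = a[pi], a[pg]
                gt -= 1
            else:
                i += 1
        # View positions [0, lt) < pivot, [lt, gt] == pivot, (gt, n) > pivot.
        if k < lt:
            n = lt
        elif k > gt:
            base += (gt + 1) * step
            k -= gt + 1
            n -= gt + 1
        else:
            return pivot


def median(a, group=5, cutoff=16):
    """Lower median of list a, rearranging a in place."""
    return select(a, 0, 1, len(a), (len(a) - 1) // 2, group, cutoff)
```

Because `group` and `cutoff` are ordinary parameters, the segment-size and switch-over experiments reduce to timing `median` over a grid of values. Python is used here for readability only; a compiled language would give more meaningful timings.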

The internet is a useful source of "random" data. For example, you can download text or images and treat every four bytes as a 32-bit integer. Alternatively you can treat the binary code of an application (e.g., Microsoft Word) as a sequence of integers. Your report should fully document how you obtained the data for testing and give some indication of why you think this is a fair way to test your implementations.
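As an illustration of treating every four bytes as a 32-bit integer, here is a small Python sketch. The function names `ints_from_bytes` and `load_ints` are hypothetical, and reading the words as little-endian unsigned values is an assumption; the handout does not specify byte order or signedness.

```python
import struct


def ints_from_bytes(raw):
    """Interpret every four bytes of raw as one 32-bit integer.

    Assumption: little-endian, unsigned. A trailing partial word is dropped.
    """
    usable = len(raw) - len(raw) % 4
    return [v for (v,) in struct.iter_unpack("<I", raw[:usable])]


def load_ints(path):
    """Read a whole file (text, image, executable, ...) as 32-bit integers."""
    with open(path, "rb") as f:
        return ints_from_bytes(f.read())
```

Data obtained this way may contain long runs of repeated values (for example, zero padding in executables), so it is worth checking how many distinct integers a file yields before relying on it.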

The report should contain the following parts:

- **Abstract:** An abstract is a short description of the contents of the report. The abstract should be written in the third person and should not be longer than half a page.
- **Introduction:** Describe the project and your approach, and summarize the conclusions. A person who understands computer science but has not read this project description should understand this section.
- **Implementation:** Document how you implemented the algorithms. Report how the implementation issues described above were addressed.
- **Experiments:** Describe how you generated the data, which experiments you performed, how the running times were collected, etc. Report the timing results of your experiments in tables and graphs. The purpose of this section is to give enough information for the reader to repeat your experiments.
- **Conclusion:** State and justify your conclusions. Between Select and Quicksort, which algorithm is faster? Do you think your conclusions are valid in general or just for the data and systems that you used?
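For the Experiments part, one simple way to collect running times is sketched below in Python. The names `time_once` and `best_time` are hypothetical, and taking the minimum over a few repeats is just one common convention, not a requirement of this project. Each trial works on a fresh copy of the data so that an in-place algorithm never sees input already rearranged by an earlier run.

```python
import time


def time_once(fn, data):
    """Time a single call fn(copy of data) in seconds."""
    work = list(data)  # fresh copy, made outside the timed region
    start = time.perf_counter()
    fn(work)
    return time.perf_counter() - start


def best_time(fn, data, repeats=5):
    """Smallest of several timings, to reduce noise from other processes."""
    return min(time_once(fn, data) for _ in range(repeats))
```

For example, `best_time(sorted, data)` times the built-in sort on `data`; the same call with your own median functions gives directly comparable numbers.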

The project will be graded on the following:

- **Report:** The report will be graded according to presentation, as described above.
- **Implementation:** This portion of the project grade depends on how well you implemented each algorithm. See the implementation issues discussed above.
- **Data:** The quality of the data used in your experiments counts for 20% of your grade. If you have doubts about your approach in collecting data, ask your instructor well before the project deadline.
- **Correctness:** The implementation, reporting and experimental methodology should conform to this project description.
- **Conclusion:** You will not be graded on the statement of your conclusion, but on how convincing your conclusions are.

Last Modified: 1 Nov 1999 22:23:36 EST by Richard Chang, Fall 1999 CMSC 441