We are back to timing for this assignment. The memory bandwidth graph shown in class came from a paper by Michael Thomadakis: The Architecture of the Nehalem Processor and Nehalem-EP SMP Platforms (Figure 17). You should repeat that experiment using the processor and language of your choice.


Have one thread write some data, then time how long it takes to read that data back, either from the same thread or from a different thread. To see the effect in the original graph, you will need your threads to run on different cores. Unfortunately, this is one of those things that many thread libraries like to pretend doesn't matter. You will need a library that lets you set the thread affinity, which specifies which processor or core a thread should run on. External packages for this do exist for Java and Python (search online to find one). For C++, solutions include SetThreadAffinityMask on Windows or the pthread library on Linux.
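On Linux, one way to set affinity through the pthread library is the GNU extension pthread_setaffinity_np. The following is a minimal sketch, assuming g++ on Linux (which defines _GNU_SOURCE by default). The helper name pin_current_thread is hypothetical and not part of any library.

```cpp
// Linux-only sketch: pin the calling thread to a single logical CPU.
#include <cassert>
#include <pthread.h>
#include <sched.h>

// Hypothetical helper (name is an assumption). Returns true on success.
bool pin_current_thread(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);      // start with an empty CPU mask
    CPU_SET(cpu, &set);  // allow exactly one logical CPU
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}
```

Call it as the first statement in each thread's function. Note that logical CPU numbers do not always map one-to-one to physical cores (for example, with hyperthreading enabled), so check your machine's topology before choosing which numbers count as "different cores".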

You will also need to use a lock, semaphore, or other thread synchronization primitive to make sure the first thread is done writing before the second thread starts to read. Convert the elapsed time and the amount of data read to a bandwidth in MB/s, and repeat the measurement for successively larger data sizes. Plot two lines: writing and reading on the same core, and writing and reading on different cores.
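One possible shape for a single measurement is sketched below, assuming C++11 std::thread with a mutex and condition variable as the synchronization primitive. The names ReadResult and measure_read_bandwidth are hypothetical, 1 MB is taken as 10^6 bytes, and thread pinning is left out (it would go at the marked comments).

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

struct ReadResult {          // hypothetical result type
    double mb_per_s;         // read bandwidth, 1 MB = 1e6 bytes
    std::uint64_t checksum;  // sum of the data, so the read loop is not optimized away
};

ReadResult measure_read_bandwidth(std::size_t bytes) {
    assert(bytes >= sizeof(std::uint64_t));
    std::vector<std::uint64_t> buf(bytes / sizeof(std::uint64_t));
    std::mutex m;
    std::condition_variable cv;
    bool written = false;
    double seconds = 0.0;
    std::uint64_t checksum = 0;

    std::thread writer([&] {
        // (pin the writer to its core here)
        for (std::size_t i = 0; i < buf.size(); ++i) buf[i] = i;
        { std::lock_guard<std::mutex> lock(m); written = true; }
        cv.notify_one();
    });

    std::thread reader([&] {
        // (pin the reader to the same or a different core here)
        {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [&] { return written; });  // block until the write is done
        }
        auto t0 = std::chrono::steady_clock::now();
        std::uint64_t sum = 0;
        for (std::uint64_t v : buf) sum += v;  // the timed read
        auto t1 = std::chrono::steady_clock::now();
        seconds = std::chrono::duration<double>(t1 - t0).count();
        checksum = sum;
    });

    writer.join();
    reader.join();
    double read_bytes = static_cast<double>(buf.size() * sizeof(std::uint64_t));
    return ReadResult{read_bytes / seconds / 1e6, checksum};
}
```

Calling this in a loop over doubling sizes (say 4 KB up to a few hundred MB) gives the points for one line of the plot. Very small sizes finish close to the timer's resolution, so expect noisy numbers there. If you repeat a size to average it, rewrite the data from the writer thread before each timed read, since otherwise the data is already in the reader's cache.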

691 Students

Perform a second set of experiments comparing the lock-free hash table described in class and in this blog post by Jeff Preshing to a standard hash table (e.g. unordered_map in C++) that uses a mutex to ensure thread safety. For timing, have two threads simultaneously insert a number of key/value pairs into the same map.
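The mutex-protected baseline could look something like the sketch below. LockedMap and time_two_thread_inserts are hypothetical names, and the choice of 32-bit integer keys and values that start at 1 is an assumption, made so the same workload could be reused with a table that reserves 0 as its "empty" marker.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical wrapper: a std::unordered_map guarded by one mutex.
class LockedMap {
public:
    void insert(std::uint32_t key, std::uint32_t value) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[key] = value;
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.size();
    }
private:
    mutable std::mutex mutex_;
    std::unordered_map<std::uint32_t, std::uint32_t> map_;
};

// Two threads insert per_thread pairs each into the same map, using
// disjoint key ranges. Returns the elapsed wall-clock time in seconds,
// which includes thread start-up and join.
double time_two_thread_inserts(LockedMap& map, std::uint32_t per_thread) {
    assert(per_thread > 0);
    auto worker = [&map, per_thread](std::uint32_t first_key) {
        for (std::uint32_t i = 0; i < per_thread; ++i)
            map.insert(first_key + i, i + 1);  // keys and values are never 0
    };
    auto t0 = std::chrono::steady_clock::now();
    std::thread a(worker, 1u);
    std::thread b(worker, 1u + per_thread);
    a.join();
    b.join();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

Running the same two-thread workload against the lock-free table, with the same number of pairs, gives the other side of the comparison. Use enough pairs that the run time is well above the cost of starting the threads.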


Edit "assn5/README.txt" to describe your test computer (at least OS and CPU), how many cores your CPU has, what its cache sizes are, what language and compiler you used, what threading library or package you used, and how to build and run your project. Also include a link to a shared Google spreadsheet containing a plot of your results.