MS Thesis Defense

A Hybrid CPU/GPU Pipeline Workflow System

Tim Blattner

11:45am Thursday, 25 April 2013, ITE 325b, UMBC

Heterogeneous architectures can be difficult to program, particularly when trying to schedule tasks on all available compute resources, overlap PCI Express transfers, and manage the limited memory available on these architectures. In this thesis we propose a workflow system that schedules tasks on all available compute resources, overlaps PCI Express transfers, and manages the limited memory. A procedure for creating the workflow system is described and two case studies are analyzed.

  • Image Stitching, which implements the workflow system and achieves two orders of magnitude speedup over an image stitching plugin found in the popular Fiji ImageJ application. Implementing the image stitching algorithm without the workflow system yielded only one order of magnitude speedup over the image stitching plugin.
  • Out of Core LU Decomposition, which does not implement the workflow system. This case study demonstrates the impact of PCI Express transfers on a problem with a large number of dependencies. A proposed workflow system for this algorithm is provided in Future Work.

Using the workflow system, programmers have a method for scheduling any algorithm on all available compute resources while hiding the impact of I/O by overlapping it with computation.

Committee Members: Milton Halem, Yelena Yesha, Shujia Zhou, John Dorband, Walid Keyrouz