User Space GPU Resource Allocation Mechanism

Project Description

Emulating the fly brain at near real-time speeds with increasing levels of biological accuracy requires effective use of multiple GPUs. This is complicated by the fact that neural circuit emulations are not embarrassingly parallel: the different parts of an emulated circuit must exchange data throughout the emulation. We aim to design and implement a software mechanism for managing and allocating multiple GPU resources to a neural circuit emulation. Because NVIDIA's GPU architecture and CUDA programming environment provide virtually no native resource management features, our system must provide a means of quantifying the computational power of available GPUs and tracking their usage. To use resources efficiently, the system must take into account both the hardware configuration (which may comprise heterogeneous assemblies of GPUs and combinations of local and remote GPUs) and the structure of the neural circuits that must be mapped onto it.

Possible Project Goals

Provide a means of automatically discovering (or manually specifying in a configuration) available local and remote GPUs.
Provide a means of quantifying how factors such as hardware version and network connectivity determine the computational resources available to an application that uses the mechanism.
Create a GPU code execution mechanism that determines which of the available resources to use for a given piece of code.
Enable the use of different resource allocation policies.
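
To make the goals above more concrete, the following is a minimal Python sketch of how they might fit together. It is an illustration under stated assumptions, not a proposed design: the `GPU` descriptor, the `capacity` estimate, the `least_loaded` and `prefer_local` policies, and the `Allocator` class are all hypothetical names, and the capacity formula (cores times clock, discounted for remote GPUs relative to a nominal 100 Gb/s link) is a placeholder for whatever metric the project settles on. GPUs are specified manually here; automatic discovery is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class GPU:
    """Hypothetical descriptor for one GPU listed in a manual configuration."""
    name: str
    host: str          # "localhost" for a local GPU, otherwise a remote host name
    cores: int         # number of GPU cores
    clock_mhz: float   # core clock in MHz
    link_gbps: float   # bandwidth of the link to a remote GPU (unused when local)
    load: float = 0.0  # work currently assigned, in arbitrary units


def capacity(gpu: GPU) -> float:
    """Toy capacity estimate; the formula is an assumption, not a measured metric."""
    raw = gpu.cores * gpu.clock_mhz
    if gpu.host == "localhost":
        return raw
    # Discount a remote GPU by its link bandwidth relative to a nominal 100 Gb/s.
    return raw * min(1.0, gpu.link_gbps / 100.0)


# A policy takes the candidate GPUs and the size of the new work item
# and returns the GPU that should run it.
Policy = Callable[[List[GPU], float], GPU]


def least_loaded(gpus: List[GPU], work: float) -> GPU:
    """Choose the GPU whose load-to-capacity ratio is lowest after adding the work."""
    return min(gpus, key=lambda g: (g.load + work) / capacity(g))


def prefer_local(gpus: List[GPU], work: float) -> GPU:
    """Use local GPUs whenever any exist; fall back to remote ones otherwise."""
    local = [g for g in gpus if g.host == "localhost"]
    return least_loaded(local or gpus, work)


class Allocator:
    """Tracks GPU usage and delegates placement decisions to a pluggable policy."""

    def __init__(self, gpus: List[GPU], policy: Policy = least_loaded) -> None:
        self.gpus = list(gpus)
        self.policy = policy
        self.assignments: Dict[str, Tuple[GPU, float]] = {}

    def allocate(self, task: str, work: float) -> GPU:
        gpu = self.policy(self.gpus, work)
        gpu.load += work
        self.assignments[task] = (gpu, work)
        return gpu

    def release(self, task: str) -> None:
        gpu, work = self.assignments.pop(task)
        gpu.load -= work


if __name__ == "__main__":
    gpus = [
        GPU("gpu0", "localhost", cores=2048, clock_mhz=1000.0, link_gbps=0.0),
        GPU("gpu1", "localhost", cores=1536, clock_mhz=1000.0, link_gbps=0.0),
        GPU("gpu2", "node2", cores=4096, clock_mhz=1000.0, link_gbps=10.0),
    ]
    allocator = Allocator(gpus, policy=least_loaded)
    for task in ("circuit_a", "circuit_b", "circuit_c"):
        print(task, "->", allocator.allocate(task, work=100.0).name)
```

Keeping the policy as a plain function passed to the allocator is one simple way to let different allocation policies be swapped in without touching the usage tracking. A fuller version would also need to account for the communication between circuit parts placed on different GPUs, which this sketch ignores.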

Skills Gained

Familiarity with cutting-edge features of GPU programming platforms.
Experience developing parallel software for platforms comprising both CPUs and GPUs.