Blue Screen Appears on a Node Running a GPGPU Job
Updated: May 18, 2011
Applies To: Windows HPC Server 2008, Windows HPC Server 2008 R2
If a blue screen occurs on a compute node that is running a long-running general-purpose computation job on a graphics processing unit (GPU) that uses a Windows Display Driver Model (WDDM) driver, you may need to modify or disable the timeout detection and recovery (TDR) registry setting for the GPU on each compute node. TDR treats a GPU that stays busy beyond its time limit as unresponsive, and a long-running compute job can trigger that condition.
To disable timeout detection and recovery, under HKLM\System\CurrentControlSet\Control\GraphicsDrivers, set the TdrLevel value (REG_DWORD) to 0. For more information, see Timeout Detection and Recovery of GPUs through WDDM (http://go.microsoft.com/fwlink/?LinkId=196045).
Caution: Incorrectly editing the registry may severely damage your system. Before making changes to the registry, you should back up any valued data on the computer.
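The registry change described above can be scripted so that it is applied the same way on every compute node. The following is a minimal sketch, not a supported tool. It only builds the reg.exe command line, so the result can be reviewed before anything is written. The function name build_tdr_command and the node name NODE01 are hypothetical. The sketch assumes the TDR values live under the GraphicsDrivers key, and that remote registry access is available if a node name is supplied.

```python
# Registry key that holds the TDR values (assumed here to be "GraphicsDrivers").
TDR_KEY = r"HKLM\System\CurrentControlSet\Control\GraphicsDrivers"


def build_tdr_command(level=0, node=None):
    """Return the reg.exe argument list that sets TdrLevel to the given level.

    level: 0 disables timeout detection; 1 to 3 are the other documented levels.
    node:  optional computer name (hypothetical); when given, the key is
           addressed remotely using the reg.exe remote key syntax.
    """
    if level not in (0, 1, 2, 3):
        raise ValueError("TdrLevel must be 0, 1, 2, or 3")
    # A remote key is written as two backslashes, the computer name, then the key.
    key = TDR_KEY if node is None else "\\\\" + node + "\\" + TDR_KEY
    # /v names the value, /t its type, /d its data, /f overwrites without a prompt.
    return ["reg", "add", key, "/v", "TdrLevel",
            "/t", "REG_DWORD", "/d", str(level), "/f"]


# Print the command for review instead of running it.
print(" ".join(build_tdr_command(0)))
print(" ".join(build_tdr_command(0, node="NODE01")))
```

To apply the change, run the printed command from an elevated command prompt on each node, or pass the argument list to subprocess.run on Windows. A restart of the node is typically needed before the new setting takes effect.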