Frequently asked questions and common pitfalls
1. I want to use clsim on a CPU-only system, but my administrator is giving me a hard time and does not want to install OpenCL on this system. I don’t have super-user privileges. Is there anything I can do to install OpenCL myself?
Yes, there is! You can install the AMD Accelerated Parallel Processing (APP) SDK in a user directory and configure your system to make IceTray pick up the library from your custom path.
To do this, you first need to create a directory into which you are going to install OpenCL:
$ mkdir /home/username/OpenCL
$ cd /home/username/OpenCL

Download the AMD APP SDK from here: https://sourceforge.net/projects/nicehashsgminerv5viptools/files/APP%20SDK%20A%20Complete%20Development%20Platform/ (You should download a file named similarly to AMD-APP-SDK-v2.8-lnx64.tgz or AMD-APP-SDK-v2.8-lnx32.tgz, depending on whether you are on a 32bit or 64bit system. Depending on your release of the APP SDK, your file and directory names will be slightly different; just use ls to see what you've got. This example is for 64bit systems.) At the time of writing, the current file can be downloaded like this:

$ wget 'http://developer.amd.com/wordpress/media/2012/11/AMD-APP-SDK-v2.8-lnx64.tgz'

Now unpack what you just downloaded:
$ tar xzf AMD-APP-SDK-v2.8-lnx64.tgz

That should give you two new archive files. Unpack those, too:

$ tar xzf icd-registration.tgz
$ tar xzf AMD-APP-SDK-v2.8-RC-lnx64.tgz

The last step is to tell IceTray where to look for OpenCL:
$ export OPENCL_VENDOR_PATH=/home/username/OpenCL/etc/OpenCL/vendors
$ export LD_LIBRARY_PATH=/home/username/OpenCL/AMD-APP-SDK-v2.8-RC-lnx64/lib/x86_64:$LD_LIBRARY_PATH

(This step would not be necessary if your administrator installed these things into system paths. Specifically, $OPENCL_VENDOR_PATH has a default of /etc/OpenCL/vendors.)

Now, you should be able to run cmake as usual:
$ cd $I3_BUILD
$ cmake ../src

You should see it pick up OpenCL:

-- opencl
--   + CL/cl.h found at /scratch/ckopper/AMD-APP-SDK-v2.8-RC-lnx64/include
--   + /scratch/ckopper/AMD-APP-SDK-v2.8-RC-lnx64/lib/x86_64/libOpenCL.so

If you don't see this, make sure libOpenCL.so can be found in the path you added to $LD_LIBRARY_PATH. As always, you may have to delete $I3_BUILD/CMakeCache.txt to force cmake to look for new tool locations.
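Before rebuilding, you can do a quick sanity check that the library is actually loadable. The following is a minimal sketch (not part of clsim; it assumes the two export lines above are set in your current shell) that loads libOpenCL.so through Python's ctypes and counts the registered platforms:

# minimal sketch, not part of clsim: verifies that libOpenCL.so
# resolves via LD_LIBRARY_PATH and that at least one OpenCL
# platform is registered under OPENCL_VENDOR_PATH.
import ctypes

lib = ctypes.CDLL("libOpenCL.so")  # raises OSError if not found

num_platforms = ctypes.c_uint(0)
# clGetPlatformIDs(num_entries, platforms, num_platforms)
err = lib.clGetPlatformIDs(0, None, ctypes.byref(num_platforms))

print("clGetPlatformIDs returned %d; %u platform(s) found"
      % (err, num_platforms.value))  # err == 0 means CL_SUCCESS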
2. Why do I get angry emails from my cluster administrator about using too many threads or causing a really high load when submitting clsim jobs to a cluster (when running on CPUs)?
All existing OpenCL implementations for CPUs (either Intel’s or AMD’s version at the moment) will parallelize photon propagation, which means that they will run multiple threads for a single clsim process. On multi-core systems, CPU usages of 1000% or more have been observed. This is great for getting results quickly on an interactive system, but it is a bad idea when using it on cluster nodes where multiple processes are being run at the same time. (Almost all of the existing batch schedulers assume that a process will only ever use a single CPU core/thread.)
In order to configure clsim to only use a single core for photon propagation, the standard tray segment provides the option DoNotParallelize, which you should set to True when submitting to a cluster:

tray.AddSegment(clsim.I3CLSimMakeHits, "makeCLSimHits",
                DoNotParallelize=True,
                UseGPUs=False,
                UseCPUs=True)
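If you want to double-check that the option took effect, here is a rough Linux-only sketch (the /proc interface is an assumption about your OS, not clsim API) that counts the threads of the current process; run it from your script while the tray is propagating photons:

# rough Linux-only sketch: count the threads of this process.
# With DoNotParallelize=True the count should stay small;
# without it, expect roughly one worker thread per CPU core.
import os
num_threads = len(os.listdir("/proc/self/task"))
print("clsim process is currently using %d thread(s)" % num_threads)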
3. How can I make sure that my clsim instance is using the correct GPU in a multi-GPU system?
By default, clsim will take over all available GPUs whenever the UseGPUs option is set to True. On cluster nodes that have multiple GPUs available and rely on the submitting users to restrict their jobs to a certain GPU, the easiest way to select a specific GPU is the CUDA_VISIBLE_DEVICES environment variable. (This is specific to nVidia and will not work with AMD cards.) You just set:

$ export CUDA_VISIBLE_DEVICES="0,2,3"

This example will make clsim use devices number 0, 2 and 3 and will skip device number 1. (The numbering starts from 0.) The process will actually only see three of the four hypothetical devices.
On a well-configured cluster, this variable should already be set for each running job.
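To see this renumbering in action, here is an optional sketch (it assumes you have pycuda installed, which clsim does not require) that enumerates the devices visible to the current process:

# optional sketch, assumes pycuda is installed (not required by clsim):
# with CUDA_VISIBLE_DEVICES="0,2,3", the three visible devices are
# renumbered 0..2 inside the process, and physical device 1 is hidden.
import pycuda.driver as cuda

cuda.init()
for i in range(cuda.Device.count()):
    print("visible device %d: %s" % (i, cuda.Device(i).name()))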
4. So I'm running on a condor cluster that does not set CUDA_VISIBLE_DEVICES automatically (npx3 in Madison). What's a good way to set it? I do get something called a "_CONDOR_SLOT", which is a number starting at 1 that I am supposed to use to select the GPU.
You can try adding the following block of Python code to your script before any of the IceTray stuff. It will subtract 1 from the slot id and set CUDA_VISIBLE_DEVICES to use the appropriate GPU:
import os

if "_CONDOR_SLOT" in os.environ:  # running in condor?
    if "CUDA_VISIBLE_DEVICES" in os.environ:
        print "running in CONDOR, but CUDA_VISIBLE_DEVICES is already set. no further configuration necessary."
    else:
        condorSlotNumber = int(os.environ["_CONDOR_SLOT"])
        print "script seems to be running in condor (slot %u). auto-configuring CUDA_VISIBLE_DEVICES!" % condorSlotNumber
        os.environ["CUDA_VISIBLE_DEVICES"] = str(condorSlotNumber - 1)
5. I am running clsim on an nVidia GPU, but it seems to hang and is not generating photons. When looking at the GPU utilization using nvidia-smi, I do see 100%, but nothing seems to be happening.
This is a known bug in very old OpenCL libraries supplied by nVidia. clsim is known not to work with driver versions 260.19.21 and older. Versions 270.x.x and newer have been tested and are known to work, so just update to the most recent version available and clsim should work.
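If you are not sure which driver you are running, one way to check from Python is the sketch below (it assumes your nvidia-smi is recent enough to support the --query-gpu option):

# sketch: print the installed nVidia driver version via nvidia-smi
# (assumes --query-gpu is supported; very old drivers may lack it)
import subprocess
out = subprocess.check_output(["nvidia-smi",
                               "--query-gpu=driver_version",
                               "--format=csv,noheader"])
print("driver version(s): %s" % out.decode().strip())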