techstaff:slurm

Differences

This shows you the differences between two versions of the page.

techstaff:slurm [2018/12/07 12:35] – [Using the GPU] kauffman
techstaff:slurm [2019/10/11 16:50] kauffman
Line 1: Line 1:
 ===== Notice =====
-**2017-08-31**: Configuration change to allow allocation on CPUs and RAM. Please read the 'Default Quota' section under https://howto.cs.uchicago.edu/techstaff:slurm#usage
+**2019-10-08**: New computer nodes added under the partition ''%%genfast%%''
 ====== Peanut Job Submission Cluster ======
  
-We are currently **alpha** testing and gauging user interest in a cluster of machines that allows for the submission of long running compute jobs. Think of these machines as a dumping ground for discrete computing tasks that might be rude or disruptive to execute on the main (shared) shell servers (i.e., linux1, linux2, linux3).
+Think of these machines as a dumping ground for discrete computing tasks that might be rude or disruptive to execute on the main (shared) shell servers (i.e., linux1, linux2, linux3).
  
 For job submission we will be using a piece of software called [[http://slurm.schedmd.com|SLURM]]. Simply put, SLURM is a queue management system and stands for **S**imple **L**inux **U**tility for **R**esource **M**anagement; it was developed at the Lawrence Livermore National Lab. It currently supports some of the largest compute clusters in the world. The best description of SLURM can be found on its homepage:
Line 246: Line 245:
  
 ====== Using the GPU ======
-[[ techstaff:cuda | Environment Variables and more ]] 
-===== CUDA_VISIBLE_DEVICES ===== 
-Do not set this variable. It will be set for you by SLURM. 
- 
-The variable name is misleading: it does NOT give the number of devices, but rather the physical device numbers assigned by the kernel (e.g. /dev/nvidia2).
-
-For example, if you requested multiple GPUs from SLURM (--gres=gpu:2), the CUDA_VISIBLE_DEVICES variable should contain two numbers (each in the range 0-3 in this case) separated by a comma (e.g. 1,3).
- 
  
 ===== GRES Multiple GPU's on one system =====
Line 340: Line 331:
  
  
-===== Paths =====
-You will need to add the following to your ''%%$PATH%%'' and ''%%$LD_LIBRARY_PATH%%''.
+===== Environment Variables =====
+
 +==== CUDA_HOME, LD_LIBRARY_PATH ==== 
 + 
 +Make sure you set $CUDA_HOME. If you want to use the cuDNN libraries, you also need to append /usr/local/cuda-x.x/lib64 to the $LD_LIBRARY_PATH environment variable.
 + 
 +  cuda_version=9.2 
 +  export CUDA_HOME=/usr/local/cuda-${cuda_version} 
 +  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64 
 + 
 +Currently we support the same versions of CUDA that the latest version of cuDNN supports. This is not set in stone, and we can accommodate most other versions if required; just let techstaff know what you need.
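
The export lines above can be wrapped in a small function that guards against two common pitfalls: a CUDA version that is not installed on the node, and an empty ''%%$LD_LIBRARY_PATH%%'' (which would otherwise leave a stray leading colon). This is only a sketch; the function name ''%%setup_cuda_env%%'' and its optional prefix argument are hypothetical and not something techstaff provides, and the versions actually installed may differ from node to node.

```shell
# Hypothetical helper: set CUDA_HOME and extend LD_LIBRARY_PATH for one CUDA version.
# Usage: setup_cuda_env <version> [prefix]   (prefix defaults to /usr/local)
setup_cuda_env() {
  cuda_version="$1"
  prefix="${2:-/usr/local}"
  cuda_dir="${prefix}/cuda-${cuda_version}"

  # Refuse to export paths that do not exist on this node.
  if [ ! -d "${cuda_dir}/lib64" ]; then
    echo "CUDA ${cuda_version} not found under ${prefix}" >&2
    return 1
  fi

  export CUDA_HOME="${cuda_dir}"
  # Append lib64; avoid a leading ':' when LD_LIBRARY_PATH is empty or unset.
  export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${CUDA_HOME}/lib64"
}
```

Calling ''%%setup_cuda_env 9.2%%'' near the top of a job script would then either set both variables or fail early with a message on STDERR.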
 + 
 +==== PATH ==== 
 +You may also need to add the following to your ''%%$PATH%%'':
  
   export PATH=$PATH:/usr/local/cuda/bin
-  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH=/usr/local/cuda/lib
+
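
If the export above is placed in a shell startup file, the same directory can end up on ''%%$PATH%%'' several times. A minimal sketch of an idempotent variant follows; the helper name ''%%path_append%%'' is hypothetical, and the final check only reports whether ''%%nvcc%%'' is reachable on the current machine.

```shell
# Hypothetical helper: append a directory to PATH only if it is not already listed.
path_append() {
  case ":${PATH}:" in
    *":$1:"*) ;;                        # already present, nothing to do
    *) PATH="${PATH:+${PATH}:}$1" ;;    # append, avoiding a leading ':'
  esac
  export PATH
}

path_append /usr/local/cuda/bin

# nvcc is the CUDA compiler driver; warn on STDERR if it is still not reachable.
command -v nvcc >/dev/null 2>&1 || echo "nvcc not found on PATH" >&2
```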
 +==== CUDA_VISIBLE_DEVICES ==== 
 +Do not set this variable. It will be set for you by SLURM. 
 + 
 +The variable name is misleading: it does NOT give the number of devices, but rather the physical device numbers assigned by the kernel (e.g. /dev/nvidia2).
 +
 +For example, if you requested multiple GPUs from SLURM (--gres=gpu:2), the CUDA_VISIBLE_DEVICES variable should contain two numbers (each in the range 0-3 in this case) separated by a comma (e.g. 1,3).
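
Rather than setting the variable, a job can simply read what SLURM assigned. The sketch below counts the entries in the comma-separated list; the function name ''%%count_visible_devices%%'' is hypothetical, and it assumes a POSIX-style shell such as sh or bash.

```shell
# Hypothetical helper: count entries in a comma-separated device list.
# With no argument it reads CUDA_VISIBLE_DEVICES from the environment.
count_visible_devices() {
  list="${1-${CUDA_VISIBLE_DEVICES-}}"
  if [ -z "$list" ]; then
    echo 0
    return
  fi
  # Split on commas and count the resulting fields.
  old_ifs="$IFS"; IFS=','
  set -- $list
  IFS="$old_ifs"
  echo "$#"
}

# Inside a job, report the device list SLURM assigned and how many entries it has.
echo "devices: ${CUDA_VISIBLE_DEVICES:-none} (count: $(count_visible_devices))"
```

In a job submitted with ''%%--gres=gpu:2%%'' the count should match the number of GPUs requested, while the listed numbers themselves depend on which devices were free.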
  
  
Line 402: Line 411:
 STDERR should be blank.
 ====== More ======
-If you feel this documentation is lacking in some way please let techstaff know. Email [[techstaff@cs.uchicago.edu]], call (773-702-1031), or stop by our office (Ryerson 154).
+If you feel this documentation is lacking in some way please let techstaff know. Email [[techstaff@cs.uchicago.edu]], call (773-702-1031), or stop by our office (Crerar 357).
/var/lib/dokuwiki/data/pages/techstaff/slurm.txt · Last modified: 2021/01/06 16:13 by kauffman
