techstaff:slurm

Differences

This shows you the differences between two versions of the page.

techstaff:slurm [2018/05/03 09:53] – [Using the GPU] kauffman
techstaff:slurm [2018/05/04 11:59] kauffman
Line 247:
  
 ===== GRES Multiple GPU's on one system =====
-Jobs will not be allocated any generic resources unless specifically requested at job submit time using the --gres option supported by the salloc, sbatch and srun commands. The option requires an argument specifying which generic resources are required and how many resources. The resource specification is of the form name[:type:count]. The name is the same name as specified by the GresTypes and Gres configuration parameters. type identifies a specific type of that generic resource (e.g. a specific model of GPU). count specifies how many resources are required and has a default value of 1. For example:
-sbatch --gres=gpu:kepler:2 ....
+GRES: Generic Resource. As of 2018-05-04 these only include GPU's.
+
+Jobs will not be allocated any generic resources unless specifically requested at job submit time using the ''%%--gres%%'' option supported by the ''%%salloc%%'', ''%%sbatch%%'' and ''%%srun%%'' commands. The option requires an argument specifying which generic resources are required and how many resources. The resource specification is of the form ''%%name[:type:count]%%''. The name is the same name as specified by the GresTypes and Gres configuration parameters. type identifies a specific type of that generic resource (e.g. a specific model of GPU). count specifies how many resources are required and has a default value of 1. For example:
+<code>sbatch --gres=gpu:titan:2 ....</code>
  
 Jobs will be allocated specific generic resources as needed to satisfy the request. If the job is suspended, those resources do not become available for use by other jobs.
  
-Job steps can be allocated generic resources from those allocated to the job using the --gres option with the srun command as described above. By default, a job step will be allocated all of the generic resources allocated to the job. If desired, the job step may explicitly specify a different generic resource count than the job. This design choice was based upon a scenario where each job executes many job steps. If job steps were granted access to all generic resources by default, some job steps would need to explicitly specify zero generic resource counts, which we considered more confusing. The job step can be allocated specific generic resources and those resources will not be available to other job steps. A simple example is shown below.
+Job steps can be allocated generic resources from those allocated to the job using the ''%%--gres%%'' option with the ''%%srun%%'' command as described above. By default, a job step will be allocated all of the generic resources allocated to the job. If desired, the job step may explicitly specify a different generic resource count than the job. This design choice was based upon a scenario where each job executes many job steps. If job steps were granted access to all generic resources by default, some job steps would need to explicitly specify zero generic resource counts, which we considered more confusing. The job step can be allocated specific generic resources and those resources will not be available to other job steps. A simple example is shown below.
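
As a rough sketch of the job-step behavior described above (not taken from this revision of the page), a batch script might split a two-GPU job allocation between two concurrent steps. The script name ''gres_steps.sh'', the ''launch'' helper and ''./show_device.sh'' are hypothetical; the latter stands in for any program that reports which GPU it was given, for instance by printing ''CUDA_VISIBLE_DEVICES''. The ''-n'' and ''--exclusive'' options are assumed to suit the installed Slurm version.

```shell
#!/bin/bash
# gres_steps.sh (hypothetical name). Submit with, for example:
#   sbatch --gres=gpu:2 -n2 gres_steps.sh

# Outside a Slurm allocation there is no job to run steps in, so the
# helper prints the command instead of executing it (a dry run).
launch() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Each step requests one of the job's two GPUs and runs in the background;
# ./show_device.sh is a placeholder for the real workload.
launch srun --gres=gpu:1 -n1 --exclusive ./show_device.sh &
launch srun --gres=gpu:1 -n1 --exclusive ./show_device.sh &

# Keep the batch script alive until both steps have finished.
wait
```

Run directly on a login node, the script only echoes the two ''srun'' commands, which makes it easy to check the requested resources before submitting.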
  
  
/var/lib/dokuwiki/data/pages/techstaff/slurm.txt · Last modified: 2021/01/06 16:13 by kauffman
