techstaff:slurm
Differences
This shows you the differences between two versions of the page.
techstaff:slurm [2018/05/04 11:59] – kauffman | techstaff:slurm [2018/05/04 12:32] – kauffman | ||
---|---|---|---|
Line 255: | Line 255: | ||
Job steps can be allocated generic resources from those allocated to the job using the '' | Job steps can be allocated generic resources from those allocated to the job using the '' | ||
+ | |||
+ | ==== Ok, but I don't want to read the wall of text above ==== | ||
+ | Fine. | ||
+ | |||
+ | The '' | ||
+ | |||
+ | < | ||
+ | --gres=gpu: | ||
+ | # Please try to limit yourself to one GPU per person. | ||
+ | </ | ||
+ | |||
+ | Example when using tensorflow: | ||
+ | |||
+ | Give the file ' | ||
+ | Depends on: | ||
+ | '' | ||
+ | '' | ||
+ | < | ||
+ | # | ||
+ | from tensorflow.python.client import device_lib | ||
+ | print(device_lib.list_local_devices()) | ||
+ | </ | ||
+ | |||
+ | Here we can see that no GPU was allocated to us because we did not specify the '' | ||
+ | < | ||
+ | kauffman3@bulldozer: | ||
+ | kauffman3@gpu3: | ||
+ | kauffman3@gpu3: | ||
+ | </ | ||
+ | |||
+ | If we request only 1 GPU: | ||
+ | < | ||
+ | kauffman3@bulldozer: | ||
+ | kauffman3@gpu3: | ||
+ | physical_device_desc: | ||
+ | </ | ||
+ | |||
+ | If we request 2 GPUs: | ||
+ | < | ||
+ | kauffman3@bulldozer: | ||
+ | kauffman3@gpu3: | ||
+ | physical_device_desc: | ||
+ | physical_device_desc: | ||
+ | </ | ||
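The runs above show TensorFlow reporting a different number of GPUs depending on what was requested. A lighter check that does not need TensorFlow is sketched below. It assumes that Slurm exports `CUDA_VISIBLE_DEVICES` for GPU generic resources and that the value is a comma-separated list of numeric indices; both are assumptions about the cluster setup, not something this page confirms. The helper name `visible_gpus` is hypothetical.

```python
import os


def visible_gpus(env=None):
    """Return the GPU indices listed in CUDA_VISIBLE_DEVICES, or [] if none.

    Assumption: the variable holds comma-separated numeric indices.
    Non-numeric tokens (for example GPU UUIDs) are ignored by this sketch.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES", "")
    return [int(tok) for tok in raw.split(",") if tok.strip().isdigit()]


if __name__ == "__main__":
    gpus = visible_gpus()
    print("GPUs visible to this job step: %d %s" % (len(gpus), gpus))
```

Run inside the allocated shell, this should report an empty list when no GPU was requested and one index per GPU otherwise, matching what the TensorFlow device listing shows.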
+ | |||
+ | If we request more GPUs than are available: | ||
+ | < | ||
+ | kauffman3@bulldozer: | ||
+ | srun: error: Unable to allocate resources: Requested node configuration is not available | ||
+ | </ | ||
+ | |||
+ | ==== Cool, but how do I know where and what resources are available? ==== | ||
+ | Turns out the '' | ||
+ | < | ||
+ | $ sinfo -O partition, | ||
+ | PARTITION | ||
+ | debug* | ||
+ | general | ||
+ | pascal | ||
+ | titan | ||
+ | </ | ||
+ | |||
+ | FEATURES: This is just an arbitrary string in the configuration file that defines a node. However, techstaff hopes it provides some useful info. | ||
+ | |||
+ | GRES: Don't depend on this being accurate; however, it will give you a clue as to how many generic resources are in a partition. | ||
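To turn a GRES column value into a number you can compare against your request, a small parser helps. The sketch below assumes the usual Slurm GRES notation of `name[:type]:count`, with several entries separated by commas and an optional parenthesized annotation; the sample strings in the usage note are hypothetical and not taken from this cluster. The function name `parse_gres` is also hypothetical.

```python
import re


def parse_gres(gres):
    """Return {resource_name: total_count} for a GRES string.

    Assumed format: 'name[:type]:count', comma-separated, e.g. 'gpu:titan:4'.
    An entry without a numeric count is counted as 1 (an assumption).
    """
    counts = {}
    if not gres or gres in ("(null)", "N/A"):
        return counts
    # Drop parenthesized annotations such as '(S:0-1)' before splitting.
    cleaned = re.sub(r"\([^)]*\)", "", gres)
    for entry in cleaned.split(","):
        entry = entry.strip()
        if not entry:
            continue
        parts = entry.split(":")
        name = parts[0]
        # The count is the last field when it is numeric.
        count = int(parts[-1]) if len(parts) > 1 and parts[-1].isdigit() else 1
        counts[name] = counts.get(name, 0) + count
    return counts
```

For example, `parse_gres("gpu:titan:4").get("gpu", 0)` gives the GPU count for one row of the listing. As noted above, treat the result as a hint rather than a guarantee.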
+ | |||
/var/lib/dokuwiki/data/pages/techstaff/slurm.txt · Last modified: 2021/01/06 16:13 by kauffman