====== Slurm ======
==== Discord ====
There is a dedicated text channel ''…'' on the Discord server.
===== Clusters =====
==== Peanut Cluster ====
Think of these machines as a dumping ground for discrete computing tasks that might be rude or disruptive to execute on the main (shared) shell servers (i.e., ''linux.cs.uchicago.edu'').
Additionally, …
===== Where to begin =====
Slurm is a set of command line utilities that can be run from **most** any Computer Science system you can log in to. Using our main shell servers (''linux.cs.uchicago.edu'') is expected to be our most common use case, so you should start there.
ssh user@linux.cs.uchicago.edu
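Once logged in, you can check that the Slurm utilities are available and look at the cluster before submitting anything. A minimal sketch (the prompt is illustrative; partition and node names will differ on each cluster):

<code>
user@linux1:~$ sinfo              # list partitions and the state of their nodes
user@linux1:~$ squeue -u $USER    # show your own pending and running jobs
user@linux1:~$ srun hostname      # run a trivial command through Slurm as a quick test
</code>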
=== Default Quotas ===
By default we set a job to run on one CPU and allocate 100MB of RAM. If you require more than that, you should specify what you need using the following options: ''…''
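As a sketch of requesting more than the defaults, standard Slurm batch directives such as ''--cpus-per-task'' and ''--mem'' can be placed at the top of a submission script (the job name, file names, and values below are placeholders, not site-specific recommendations):

<code>
#!/bin/bash
#SBATCH --job-name=bigger-job
#SBATCH --cpus-per-task=4        # request 4 CPUs instead of the default 1
#SBATCH --mem=4G                 # request 4GB of RAM instead of the default 100MB
#SBATCH --output=bigger-job.out

./my_program                     # placeholder for the actual workload
</code>

Submit the script with ''sbatch'' as usual; Slurm will queue the job on a node that can satisfy the requested resources.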
| + | |||
| + | === MPI Usage === | ||
| + | The AI cluster supports the use of MPI. The following example illustrates its basic use. | ||
| + | |||
| + | < | ||
| + | amcguire@fe01: | ||
| + | #include < | ||
| + | #include < | ||
| + | #include < | ||
| + | |||
| + | int main(int argc, char **argv) { | ||
| + | // Initialize MPI | ||
| + | MPI_Init(& | ||
| + | |||
| + | // Get the number of processes in the global communicator | ||
| + | int count; | ||
| + | MPI_Comm_size(MPI_COMM_WORLD, | ||
| + | |||
| + | // Get the rank of the current process | ||
| + | int rank; | ||
| + | MPI_Comm_rank(MPI_COMM_WORLD, | ||
| + | |||
| + | // Get the current hostname | ||
| + | char hostname[1024]; | ||
| + | gethostname(hostname, | ||
| + | |||
| + | // Print a hello world message for this rank | ||
| + | printf(" | ||
| + | |||
| + | // Finalize the MPI environment before exiting | ||
| + | MPI_Finalize(); | ||
| + | } | ||
| + | amcguire@fe01: | ||
| + | #!/bin/bash | ||
| + | #SBATCH -J mpi-hello | ||
| + | #SBATCH -n 2 # Number of processes | ||
| + | #SBATCH -t 0: | ||
| + | #SBATCH -o hello-job.out | ||
| + | |||
| + | # Disable the Infiniband transport for OpenMPI (not present on all clusters) | ||
| + | #export OMPI_MCA_btl=" | ||
| + | |||
| + | # Run the job (assumes the batch script is submitted from the same directory) | ||
| + | mpirun -np 2 ./mpi-hello | ||
| + | |||
| + | amcguire@fe01: | ||
| + | amcguire@fe01: | ||
| + | -rwxrwx--- 1 amcguire amcguire 16992 Jun 30 10:49 mpi-hello | ||
| + | amcguire@fe01: | ||
| + | Submitted batch job 1196702 | ||
| + | amcguire@fe01: | ||
| + | Hello from process 0 of 2 on host p001 | ||
| + | Hello from process 1 of 2 on host p002 | ||
| + | </ | ||
=== Exclusive access to a node ===
STDOUT will look something like this:
<code>
cnetid@focal0:~$ cat $HOME/
Device Number: 0
  Device name: Tesla M2090