slurm:ai
Differences
This shows you the differences between two versions of the page.
slurm:ai [2021/01/06 16:10] – kauffman | slurm:ai [2021/06/29 09:27] – [Login] kauffman | ||
---|---|---|---|
Line 5: | Line 5: | ||
Feedback is requested: | Feedback is requested: | ||
- | | + | |
- | Knowledge of how to use Slurm already is preferred at this stage of testing. | ||
- | The information from the older cluster mostly applies and I suggest you read that documentation: | + | The information from the older cluster mostly applies and I suggest you read that documentation: |
====== Infrastructure ====== | ====== Infrastructure ====== | ||
- | Summary of nodes installed on the cluster | + | Summary of nodes installed on the cluster. |
+ | |||
+ | * [[ http:// | ||
+ | * [[ https:// | ||
+ | * Use '' | ||
===== Computer/ | ===== Computer/ | ||
Line 26: | Line 29: | ||
* 384G RAM | * 384G RAM | ||
* 4x Nvidia Quadro RTX 8000 | * 4x Nvidia Quadro RTX 8000 | ||
+ | |||
+ | * 3x nodes | ||
+ | * 2x AMD EPYC 7302 16-Core Processor | ||
+ | * 512G RAM | ||
+ | * 4x Nvidia A40 | ||
+ | * Note that not all nodes are online yet. | ||
* all: | * all: | ||
Line 39: | Line 48: | ||
* / | * / | ||
* We intend to set user quotas; however, there are no quotas right now. | * We intend to set user quotas; however, there are no quotas right now. | ||
- | * / | + | * / |
* Lives on the home directory server. | * Lives on the home directory server. | ||
* Idea would be to create a dataset with a quota for people to use. | * Idea would be to create a dataset with a quota for people to use. | ||
Line 53: | Line 62: | ||
* zfs mirror with previous snapshots of ' | * zfs mirror with previous snapshots of ' | ||
* NOT a backup. | * NOT a backup. | ||
- | * Not enabled yet. | + | |
====== Login ====== | ====== Login ====== | ||
- | There is a set of front end nodes that give you access to the Slurm cluster. You will connect through these nodes and need to be on these nodes to submit jobs to the cluster. | ||
- | ssh cnetid@fe.ai.cs.uchicago.edu | + | Anyone with a CS account is allowed to log in. |
- | * Requires | + | There is a set of front end nodes that give you access to the Slurm cluster. You will connect through these nodes and need to be on these nodes to submit jobs to the cluster. |
+ | ssh cnetid@fe.ai.cs.uchicago.edu | ||
==== File Transfer ==== | ==== File Transfer ==== | ||
You will use the FE nodes to transfer your files onto the cluster storage infrastructure. The network connections on those nodes are 2x 10G each. | You will use the FE nodes to transfer your files onto the cluster storage infrastructure. The network connections on those nodes are 2x 10G each. | ||
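As a rough illustration of the transfer step, the sketch below prints (but does not run) an rsync command that copies a local directory to your home directory through the front end host. Only the hostname `fe.ai.cs.uchicago.edu` comes from this page; the use of rsync, the `transfer_cmd` helper, the `cnetid` placeholder and the `./my_dataset` path are assumptions made for the example.

```shell
#!/bin/sh
# Dry-run sketch: prints an rsync command instead of executing it.
# Only FE_HOST is taken from this page; everything else is a placeholder.
FE_HOST="fe.ai.cs.uchicago.edu"

transfer_cmd() {
  # $1 = your CNetID, $2 = local directory to copy
  # An empty remote path after the colon means "the remote home directory".
  printf 'rsync -av --partial %s %s@%s:\n' "$2" "$1" "$FE_HOST"
}

transfer_cmd cnetid ./my_dataset
```

Copy the printed line into your terminal once the placeholders are replaced. `--partial` keeps partially transferred files so an interrupted copy can resume.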
Line 123: | Line 132: | ||
</ | </ | ||
- | ==== Notes on CUDA_VISIBLE_DEVICES ==== | ||
- | CUDA_VISIBLE_DEVICES: | ||
- | * This variable should NOT be modified. Ever. | ||
- | * Relative means that if you requested one gpu it will show up as 0. Even if all other gpus on the server are being used by others. | ||
- | |||
- | ===== Fairshare/ | ||
- | By default all usage is tracked and charged to a user's default account. A fairshare value is computed and used in prioritizing a job on submission.
- | |||
- | Details are being worked out for anyone that donates to the cluster. This will be some sort of tiered system where you get to use a higher priority when you need it. | ||
- | You will need to charge an account on job submission '' | ||
Line 177: | Line 176: | ||
==== Interactive ==== | ==== Interactive ==== | ||
- '' | - '' | ||
- | - '' | + | - '' |
- '' | - '' | ||
- '' | - '' | ||
- '' | - '' | ||
- | - '' | + | - '' |
- Make a new ssh connection with a tunnel to access your notebook | - Make a new ssh connection with a tunnel to access your notebook | ||
- '' | - '' | ||
- This will make an ssh tunnel on your local machine that forwards traffic sent to '' | - This will make an ssh tunnel on your local machine that forwards traffic sent to '' | ||
- Open your local browser and visit: '' | - Open your local browser and visit: '' | ||
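The exact commands in the steps above are truncated in this diff view, so the following is only a sketch of the general tunnel pattern, not the page's own command. It prints an `ssh -L` invocation through the front end host. The port `8888`, the `NODE_NAME` placeholder for the machine running the notebook, and the `tunnel_cmd` helper are all assumptions; substitute the host and port your own session reports.

```shell
#!/bin/sh
# Dry-run sketch: prints the ssh tunnel command instead of opening a connection.
# FE_HOST comes from this page; ports and NODE_NAME are hypothetical placeholders.
FE_HOST="fe.ai.cs.uchicago.edu"

tunnel_cmd() {
  # $1 = your CNetID, $2 = local port, $3 = notebook host, $4 = notebook port
  # -N: do not run a remote command, only forward ports
  # -L: forward local_port to notebook_host:notebook_port via the front end
  printf 'ssh -N -L %s:%s:%s %s@%s\n' "$2" "$3" "$4" "$1" "$FE_HOST"
}

tunnel_cmd cnetid 8888 NODE_NAME 8888
```

With a tunnel of this shape running, traffic sent to the chosen local port on your machine is forwarded to the notebook's host and port.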
+ | |||
+ | |||
+ | ====== Contribution Policy ====== 
+ | This section can be ignored by most people. [[techstaff: | ||
/var/lib/dokuwiki/data/pages/slurm/ai.txt · Last modified: 2022/04/04 10:58 by chaochunh