techstaff:aicluster-admin
Differences
This shows you the differences between two versions of the page.
techstaff:aicluster-admin [2020/11/30 20:17] – kauffman | techstaff:aicluster-admin [2021/02/23 19:58] (current) – kauffman | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== AI Cluster | + | ====== AI Cluster |
===== TODO ===== | ===== TODO ===== | ||
+ | - There are multiple methods used to calculate priority reflected on the spreadsheet. | ||
+ | - Double check the math for correctness. | ||
+ | - Does the math reflect our intent? I think it does. | ||
+ | - Multiple methods in calculating priority are reflected (blue to purple cells: '' | ||
+ | - Choose one calculation to use. This decision doesn' | ||
+ | - If you have a suggestion: Please show us the work in a new column or by cloning the sheet. | ||
- | Since I'm still working on it, I don't guarantee any uptime yet. Mainly I need to make sure TRES tracking is working like we want. This will involve restarting slurmd | + | ===== Contribution Tracking |
+ | [[https:// | ||
- | * < | + | AI Cluster committee members have access. |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * figure why summary view is no longer a thing. | + | |
- | * < | + | |
- | * < | + | |
- | * < | + | |
- | * home directory | + | |
- | * setup backups for home dirs | + | |
- | * default quota | + | |
- | * home directory usage report | + | |
- | * monitoring | + | |
- | * basic node monitor | + | |
- | * nfs or bandwidth monitoring | + | |
+ | ==== Sheet usage ==== | ||
+ | * Red: Do not edit | ||
+ | * Green: user input (This will be Techstaff 95% of the time) | ||
+ | * `groups` sheet: | ||
+ | * Each contribution gets assigned a POSIX group. The group must have a primary contact, who then sets the members of that group. | ||
+ | * calculates contribution amount for use in `contrib-priority`. | ||
+ | * tracks group name and primary owner | ||
+ | * `log` sheet | ||
+ | * All contributions will get entered here. | ||
+ | * Hardware contributions get converted to USD by Techstaff. A receipt of the purchase is a good starting point. | ||
+ | * The group ' | ||
+ | * `contrib-priority` calculation references contrib amounts calculated in `groups`. | ||
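The relationship between the `log` and `groups` sheets can be sketched in a few lines of Python. This is only an illustration of the bookkeeping described above, not the spreadsheet's actual formulas: the function name, the row layout, the `examplelab` group and the dollar amounts are all hypothetical, and hardware is assumed to have been converted to USD before it is logged.

```python
from collections import defaultdict


def contribution_totals(log_rows):
    """Sum USD contributions per POSIX group from (group, amount_usd) rows."""
    totals = defaultdict(float)
    for group, amount_usd in log_rows:
        totals[group] += amount_usd
    return dict(totals)


# Hypothetical log entries; every amount is already expressed in USD.
example_log = [
    ("cdac", 10000.0),
    ("examplelab", 2500.0),
    ("cdac", 5000.0),
]

if __name__ == "__main__":
    print(contribution_totals(example_log))
```

The per-group totals are what a `contrib-priority` style calculation would then read.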
- | ===== Fairshare ===== | ||
- | # Check out the fairshare values | + | ===== Understanding Slurm Fairshare and Priority/ |
+ | Slurm comes with built-in tools to calculate fair-share priorities without anyone needing to do anything special. The cluster uses a [[https:// | ||
+ | |||
+ | ==== How Slurm calculates Job priority ==== | ||
+ | Generally, Slurm uses this formula to determine a job's priority. | ||
< | < | ||
- | kauffman3@fe01: | + | Job_priority = | ||
- | Account | + | site_factor + |
- | -------------------- ---------- ---------- ----------- ----------- ----------- ------------- ---------- ---------- ------------------------------ ------------------------------ | + | (PriorityWeightAge) * (age_factor) + |
- | kauffman3 | + | (PriorityWeightAssoc) * (assoc_factor) + |
- | kauffman3 | + | (PriorityWeightFairshare) * (fair-share_factor) + |
- | kauffman4 | + | (PriorityWeightJobSize) * (job_size_factor) + | ||
- | kauffman4 | + | (PriorityWeightPartition) * (partition_factor) + | ||
+ | (PriorityWeightQOS) * (QOS_factor) + | ||
+ | SUM(TRES_weight_cpu * TRES_factor_cpu, | ||
+ | TRES_weight_< | ||
+ | | ||
+ | - nice_factor | ||
</ | </ | ||
+ | The factors on the left that start with '' | ||
- | We are using the FairTree (fairshare algorithm). This is the default in Slurm these days and from what I can tell probably better suits our needs. It is no big deal to change | + | < |
+ | fe01:~$ cat / | ||
+ | PriorityType=priority/ | ||
+ | PriorityDecayHalfLife=08: | ||
+ | PriorityMaxAge=5-0 | ||
+ | PriorityWeightFairshare=500000 | ||
+ | PriorityWeightPartition=100000 | ||
+ | PriorityWeightQOS=0 | ||
+ | PriorityWeightJobSize=0 | ||
+ | PriorityWeightAge=0 | ||
+ | PriorityFavorSmall=YES | ||
+ | </ | ||
+ | *Note that this example may not be up to date when you read this. | ||
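To see how the weights shown above shape the result, here is a minimal Python sketch of the weighted sum. It is not Slurm's code: the function name and the dictionary keys are made up for illustration, the TRES term is left out, and the factor values (each assumed to lie between 0.0 and 1.0) are hypothetical.

```python
def job_priority(weights, factors, site_factor=0.0, nice_factor=0.0):
    """Weighted sum of priority factors, minus the nice factor.

    The TRES term of the formula is omitted in this sketch.
    """
    total = site_factor
    for name, weight in weights.items():
        # A factor that is not supplied is treated as 0.0.
        total += weight * factors.get(name, 0.0)
    return total - nice_factor


# Weights copied from the configuration example above.
weights = {
    "fairshare": 500000,
    "partition": 100000,
    "qos": 0,
    "jobsize": 0,
    "age": 0,
}

# Hypothetical factor values for a single job.
factors = {"fairshare": 0.5, "partition": 0.25}

if __name__ == "__main__":
    print(job_priority(weights, factors))
```

With these weights only the fair-share and partition factors contribute; age, job size and QOS are multiplied by 0 and drop out.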
+ | |||
+ | ===== How we modify job priority to favor contributors ===== | ||
+ | |||
+ | * We adjust those priorities with partitions for those who have donated either monetarily or with hardware. Hardware donations get converted to a monetary value when logged on the spreadsheet. | ||
+ | * Every contribution gets assigned a POSIX group, hereon referred | ||
- | As the system exists now. One Account per User. | ||
+ | Here is a version of the partition configuration as it stands now (2021-02-10). | ||
< | < | ||
- | | + | PartitionName=general Nodes=a[001-008] |
- | Member: kauffman | + | # |
- | User: kauffman | + | PartitionName=cdac-contrib Nodes=a[001-008] AllowGroups=cdac Priority=5 |
</ | </ | ||
- | We will probably assign fairshare points to accounts, not users. | ||
- | ====== QOS ====== | + | ^Partition^Description^Priority^ |
+ | |general| For all users| 0 | | ||
+ | |${group}-own | Machines $group has donated. Enabled when asked. | 100 | | ||
+ | |${group}-contrib | A method to give slightly higher job priority to groups who have donated but do not own machines.| Variable based on spreadsheet calculation. | | ||
+ | The key thing to notice before you continue reading is that nodes can be added to multiple partitions. '' | ||
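As an illustration of the `${group}-contrib` naming pattern, the following Python sketch builds a partition line in the same shape as the `cdac-contrib` entry shown above. The helper name is hypothetical and this is not how the configuration is actually generated; it only makes the pattern explicit.

```python
def contrib_partition_line(group, nodes, priority):
    """Format a ${group}-contrib partition line like the cdac-contrib example."""
    return (
        f"PartitionName={group}-contrib Nodes={nodes} "
        f"AllowGroups={group} Priority={priority}"
    )


if __name__ == "__main__":
    # Reproduces the cdac-contrib line from the configuration above.
    print(contrib_partition_line("cdac", "a[001-008]", 5))
```

Because the node list is just a parameter, the same nodes can appear in both `general` and any number of `-contrib` partitions.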
- | When submitting jobs users will have to include ' | ||
- | get the priority levels associated with that account. | ||
- | Priority levels: | + | ==== Calculating -contrib partition usage ==== |
- | normal: [default] value=0 | + | |
- | low: value=100 | + | |
- | medium: value=500 | + | |
- | high: value=1000 | + | |
- | groupname is a Slurm 'account', with users of the cluster | + | We do the following calculation to determine the '' |
- | As an admin the following would be created: | + | < |
+ | partition usage total time in seconds for 30 days | ||
+ | ------------------------------------------------------ = percent used | ||
+ | all partition usage total time in seconds for 30 days | ||
+ | </ | ||
- | # create group and set allowed QOS levels. Multiple levels can be specified. | + | The percent will end up as an integer. |
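The ratio above can be sketched in Python as follows. The function name is hypothetical, and truncating toward zero to get the integer is an assumption; the spreadsheet may round differently. The usage figures in the example are invented.

```python
def percent_used(partition_seconds, all_partition_seconds):
    """Integer percent of total usage attributable to one partition.

    Both arguments are usage totals in seconds over the same 30-day window.
    """
    if all_partition_seconds <= 0:
        # No recorded usage at all: report 0 rather than divide by zero.
        return 0
    # Integer arithmetic; truncation is an assumption of this sketch.
    return (100 * partition_seconds) // all_partition_seconds


if __name__ == "__main__":
    # Hypothetical totals: 648000 s on one partition out of 2592000 s overall.
    print(percent_used(648000, 2592000))
```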
- | # Meaning you can set ' | + | |
- | # sacctmgr create account jonaslab | + | |
- | # sacctmgr -i modify account jonaslab set qos=low | + | |
- | # sacctmgr -i modify account jonaslab set defaultqos=low | + | |
- | # Now add ' | + | There is a [[https:// |
- | # sacctmgr create user kauffman3 account=jonaslab | + | |
- | These values get used in the multifactor calculation to set the total | + | You'll see on the spreadsheet we subtract '' | ||
- | priority on any given job. | + | |
- | The math/ | + | Total amount of money contributed |
- | with something optimal. I've guessed at values that seem reasonable | + | |
- | should do what we want. | + | |
- | https:// | + | |
- | The values on the left side of the + signs are values we can set. | + | |
+ | This calculation will be run once a month, and the relevant group's ${group}-contrib priority will be updated to reflect the past month's usage. | ||
- | It will be up to us to know when to remove any groups access to higher | ||
- | priorities. I imagine some sort of boolean in a spreadsheet or database. | ||
- | If you do not use the '--account=<groupname>' | + | Note that the term " |
- | default account which has the default priority (normal) | + | |
+ | |||
+ | |||
+ | |||
+ | ====== AI Cluster Admin ====== | ||
+ | |||
+ | ===== TODO ===== | ||
+ | |||
+ | Since I'm still working on it, I don't guarantee any uptime yet. Mainly I need to make sure TRES tracking is working the way we want. This will involve restarting slurmd and slurmctld, which will kill running jobs. | ||
+ | |||
+ | |||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * home directory | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * < | ||
+ | * monitoring | ||
+ | * < | ||
+ | * nfs or bandwidth monitoring | ||
+ | * gpu | ||
+ | * sync script | ||
+ | * < | ||
- | Anyways... a more readable version of the policy would be helpful for me to | ||
- | try to match what we think we want to what we can do. |
/var/lib/dokuwiki/data/attic/techstaff/aicluster-admin.1606789058.txt.gz · Last modified: 2020/11/30 20:17 by kauffman