
Slurm memory specification - Main Thread #2198

@rexcsn

Description


Opening this issue as the main thread to collect Slurm memory-related info, concerns, and workarounds.

Issue

As mentioned in previously opened issues such as #1517 and #1714, due to changes in Slurm, nodes in pcluster >= v2.5.0 are not configured with RealMemory information.
As a result, ParallelCluster does not currently support scheduling with memory options in Slurm.
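
For context, the typical symptom (with RealMemory left at Slurm's 1 MB default) is that any job submitted with a memory request is rejected. The exact messages vary by Slurm version, but they look similar to:
$ sbatch --mem=4G --wrap "sleep 60"
sbatch: error: Memory specification can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available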

Workarounds

For pcluster >= v2.5.0 and < v2.9.0, the workaround outlined here can be used to configure memory for a cluster containing only 1 compute instance type.

For pcluster >= v2.9.0, multiple queue mode was introduced, and a cluster can now have multiple compute instance types.
The old workaround can still be used for a cluster with only 1 compute instance type.
Here are the updated instructions on how to configure memory for multiple instance types in pcluster >= v2.9.0:

  • Determine the RealMemory available on the compute instance. We can get this by SSHing into an available/online compute node and running /opt/slurm/sbin/slurmd -C; we should see something like RealMemory=<SOME_NUMBER> in the output (a sample of this output is shown after this list).
  • Note that since we have multiple compute instance types, we will need to repeat step 1 for every instance type to get the RealMemory information for each one.
  • Once we have the RealMemory information, we need to add it to the corresponding nodes in each queue/partition. We can do this by modifying the partition configuration file, located at /opt/slurm/etc/pcluster/slurm_parallelcluster_<PARTITION_NAME>_partition.conf.
  • Append RealMemory=<CHOSEN_MEMORY> to the NodeName=<YOUR_NODE_NAME> ... entry for each instance type in each queue/partition.
  • For example, if I want to configure RealMemory=60000 for my nodes queue1-dy-m54xlarge-[1-10], I would modify /opt/slurm/etc/pcluster/slurm_parallelcluster_queue1_partition.conf, and the modified file should look like:
$ cat /opt/slurm/etc/pcluster/slurm_parallelcluster_queue1_partition.conf 
# This file is automatically generated by pcluster

NodeName=queue1-dy-m54xlarge-[1-10] CPUs=16 State=CLOUD Feature=dynamic,m5.4xlarge RealMemory=60000
...
  • Note that ideally we should just use the RealMemory value reported by /opt/slurm/sbin/slurmd -C, but RealMemory might differ between individual machines of the same type. If the configured RealMemory is larger than the actual value seen by /opt/slurm/sbin/slurmd -C when a new node launches, Slurm will automatically place the node into DRAIN state. To be safe, round the value down.
  • In /opt/slurm/etc/slurm.conf, change SelectTypeParameters from CR_CPU to CR_CPU_Memory (see the snippet after this list).
  • [Optional] pcluster's clustermgtd process will replace/terminate DRAINED nodes automatically. To disable this functionality and avoid nodes getting terminated automatically while setting up memory, add terminate_drain_nodes = False to the clustermgtd configuration file at /etc/parallelcluster/slurm_plugin/parallelcluster_clustermgtd.conf (see the snippet after this list). Once setup is finished, remove the option or set terminate_drain_nodes = True to restore full clustermgtd functionality.
  • Restart slurmd on the compute nodes and slurmctld on the head node, and we should see that memory is configured in scontrol show nodes (example commands after this list).
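
For reference, step 1 on a compute node looks similar to the following (node name and values are illustrative; the exact fields printed by slurmd -C vary by instance type and Slurm version):
$ ssh queue1-dy-m54xlarge-1
$ /opt/slurm/sbin/slurmd -C
NodeName=queue1-dy-m54xlarge-1 CPUs=16 Boards=1 SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=63466
UpTime=0-00:12:34
In this example we would round down and use something like RealMemory=60000 in the partition configuration.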
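
The two configuration changes above end up looking like this (only the changed lines are shown; the [clustermgtd] section name is based on the pcluster 2.9 layout of that file and may differ in other versions):
$ grep SelectTypeParameters /opt/slurm/etc/slurm.conf
SelectTypeParameters=CR_CPU_Memory

$ cat /etc/parallelcluster/slurm_plugin/parallelcluster_clustermgtd.conf
[clustermgtd]
...
terminate_drain_nodes = False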
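
Finally, a sketch of the restart and verification step (assuming the Slurm daemons are managed by systemd, which may not be the case on every pcluster version/OS; adjust to however the daemons are run on your cluster, and note the sample outputs below are illustrative):
# On each running compute node
$ sudo systemctl restart slurmd
# On the head node
$ sudo systemctl restart slurmctld
# Verify that memory is now configured
$ /opt/slurm/bin/scontrol show nodes queue1-dy-m54xlarge-1 | grep RealMemory
   RealMemory=60000 AllocMem=0 FreeMem=58356 Sockets=16 Boards=1
# Memory-based scheduling should now work, e.g.:
$ sbatch --mem=32G --wrap "sleep 60"
Submitted batch job 2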

Further discussion

We understand that the workarounds for this feature may be difficult to set up manually.
Official support for this feature is not currently planned because there is no good way to retrieve RealMemory from nodes and configure this information prior to launching the cluster. In addition, there is currently no way for Slurm to configure this information for nodes automatically.
We will continue to evaluate ways to add support for this feature.

Thank you!
