Description
Environment:
- AWS ParallelCluster / CfnCluster version [e.g. aws-parallelcluster-2.5.0]
- OS: [ubuntu1804]
- Scheduler: [e.g. slurm 19.05.3-2]
- Master instance type: [e.g. md5.2xlarge]
- Compute instance type: [e.g. md5.2xlarge]
Hi,
I modified slurm.conf to add SelectType=select/cons_res, SelectTypeParameters=CR_CPU_Memory and DefMemPerCPU=3000, but the same error persists whether I submit from the command line or from a script file.
From the script:
sbatch: error: Memory specification can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available
From the command line:
Found 1 host with 8 cores and 0 GB memory under Slurm control.
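The slurm.conf change described above might look like the following. This is only a sketch: placing each parameter on its own line is an assumption about the layout, and the commented-out NodeName line is hypothetical (its RealMemory value is a placeholder, not taken from the attached file). It is included because the "0 GB memory" message suggests that no memory may be declared for the node, while CR_CPU_Memory makes memory a consumable resource that Slurm has to match against each request.

```
# Lines added to slurm.conf (values as reported above)
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory
DefMemPerCPU=3000   # MB per allocated CPU

# Hypothetical: declare node memory (in MB) so memory requests can be matched.
# The node name comes from the partition output; 30000 is a placeholder value.
# NodeName=ip-172-31-83-72 RealMemory=30000
```

Changes to SelectType generally need a restart of slurmctld to take effect, so the running configuration may still differ from the edited file until that is done.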
$ scontrol show partition
PartitionName=compute
AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=YES QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=dummy-compute[1-19],ip-172-31-83-72
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=160 TotalNodes=20 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
ParallelCluster configuration file: pcluster_config.txt
Slurm configuration file: slurm.conf.txt
Slurm controller log: slurmctld.log
I am new to ParallelCluster and Slurm, so any suggestions or help would be much appreciated. If any other logs or details are needed to understand the issue, I will provide them.
Thanks.