sbatch: error: Memory specification can not be satisfied. Found 1 host with 8 cores and 0 GB memory under Slurm control. #1517

@chandanbfx

Environment:

  • AWS ParallelCluster / CfnCluster version [e.g. aws-parallelcluster-2.5.0]
  • OS: [ubuntu1804]
  • Scheduler: [e.g. slurm 19.05.3-2]
  • Master instance type: [e.g. md5.2xlarge]
  • Compute instance type: [e.g. md5.2xlarge]

Hi,

I tried modifying slurm.conf by adding SelectType=select/cons_res, SelectTypeParameters=CR_CPU_Memory, and DefMemPerCPU=3000, but the same error persists whether I submit from the command line or from a script file.
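
One thing that may be worth ruling out first is whether the running controller actually picked up these settings. As far as I know, a change to SelectType only takes effect after slurmctld is restarted, not just reconfigured. A minimal sketch for checking, assuming the "Key = value" layout that scontrol show config prints (the helper name config_value is made up for illustration):

```shell
# Print the value of one key from `scontrol show config` output read on stdin.
# Assumes lines shaped like "DefMemPerCPU            = 3000".
config_value() {
    awk -F' *= *' -v key="$1" '$1 == key { print $2; exit }'
}

# Usage on the head node (requires a running slurmctld):
#   scontrol show config | config_value SelectType
#   scontrol show config | config_value SelectTypeParameters
#   scontrol show config | config_value DefMemPerCPU
```

If these print the old values, or nothing, the edited slurm.conf is probably not the one the controller is reading, or the daemon has not been restarted since the change.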

From the script:
sbatch: error: Memory specification can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available

From the command line:
Found 1 host with 8 cores and 0 GB memory under Slurm control.
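
The "0 GB memory" part suggests that Slurm has no usable memory size recorded for the node. A possible cause, stated here as an assumption rather than a confirmed diagnosis, is that the node definitions in slurm.conf carry no RealMemory value, in which case Slurm falls back to a default of 1 MB. A small sketch for checking this, where real_memory_mb is a hypothetical helper and the node name is the one from the partition listing in this report:

```shell
# Print the RealMemory value (in MB) found in `scontrol show node` output on stdin.
real_memory_mb() {
    grep -o 'RealMemory=[0-9]*' | head -n 1 | cut -d= -f2
}

# Usage on the head node:
#   scontrol show node ip-172-31-83-72 | real_memory_mb
# Usage on a compute node, to print the hardware that slurmd itself detects
# (the output includes a RealMemory figure that can be compared with the above):
#   slurmd -C
```

A very small number from the first command, next to a realistic figure from slurmd -C, would point toward the node definition rather than the job request.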

$ scontrol show partition
PartitionName=compute
AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=YES QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=dummy-compute[1-19],ip-172-31-83-72
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=160 TotalNodes=20 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED

ParallelCluster configuration file: pcluster_config.txt
Slurm configuration file: slurm.conf.txt
Slurm controller log: slurmctld.log

I am new to ParallelCluster and Slurm, so any suggestions or help would be very much appreciated. If any other logs or details are needed to understand the issue, I will provide them.

Thanks
