
Commit 852d884

Added "Max Submit" column to queue tables
1 parent ec9b033 commit 852d884

13 files changed (+108, -139 lines)

docs/hpc/3stampede/notices.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
  # Stampede3 User Guide
- *Last update: October 20, 2025*
+ *Last update: November 3, 2025*

  ## Notices { #notices }

docs/hpc/3stampede/running.md

Lines changed: 12 additions & 12 deletions
@@ -7,9 +7,9 @@
  Stampede3's job scheduler is the Slurm Workload Manager. Slurm commands enable you to submit, manage, monitor, and control your jobs. See the [Job Management](#jobs) section below for further information.

  !!! important
-     **Queue limits are subject to change without notice.**
-     TACC Staff will occasionally adjust the QOS settings in order to ensure fair scheduling for the entire user community.
-     Use TACC's `qlimits` utility to see the latest queue configurations.
+     **Queue limits are subject to change without notice.**
+     TACC staff may occasionally adjust queue <!--the QOS--> settings in order to ensure fair scheduling for the entire user community.
+     TACC's `qlimits` utility will display the latest queue configurations.

  <!--
  10/20/2025
@@ -36,15 +36,15 @@ spr:2.0

  #### Table 8. Production Queues { #table8 }

- Queue Name | Node Type | Max Nodes per Job<br>(assoc'd cores) | Max Job<br>Duration | Max Nodes<br>per User | Max Jobs<br>per User | Charge Rate<br>(per node-hour)
- -- | -- | -- | -- | -- | -- | --
- h100 | H100 | 4 nodes<br>(384 cores) | 48 hrs | 4 | 2 | 4 SUs
- icx | ICX | 32 nodes<br>(2560 cores) | 48 hrs | 48 | 12 | 1.5 SUs
- nvdimm | ICX | 1 node<br>(80 cores) | 48 hrs | 1 | 2 | 4 SUs
- pvc | PVC | 4 nodes<br>(384 cores) | 48 hrs | 4 | 2 | 3 SUs
- skx | SKX | 256 nodes<br>(12288 cores) | 48 hrs | 256 | 40 | 1 SU
- skx-dev | SKX | 16 nodes<br>(768 cores) | 2 hrs | 16 | 2 | 1 SU
- spr | SPR | 32 nodes<br>(3584 cores) | 48 hrs | 40 | 24 | 2 SUs
+ Queue Name | Node Type | Max Nodes per Job<br>(assoc'd cores) | Max Job<br>Duration | Max Nodes<br>per User | Max Jobs<br>per User | Max Submit | Charge Rate<br>(per node-hour)
+ -- | -- | -- | -- | -- | -- | -- | --
+ h100 | H100 | 4 nodes<br>(384 cores) | 48 hrs | 4 | 2 | 4 | 4 SUs
+ icx | ICX | 32 nodes<br>(2560 cores) | 48 hrs | 48 | 12 | 20 | 1.5 SUs
+ nvdimm | ICX | 1 node<br>(80 cores) | 48 hrs | 1 | 2 | 4 | 4 SUs
+ pvc | PVC | 4 nodes<br>(384 cores) | 48 hrs | 4 | 2 | 4 | 3 SUs
+ skx | SKX | 256 nodes<br>(12288 cores) | 48 hrs | 256 | 40 | 60 | 1 SU
+ skx-dev | SKX | 16 nodes<br>(768 cores) | 2 hrs | 16 | 2 | 4 | 1 SU
+ spr | SPR | 32 nodes<br>(3584 cores) | 48 hrs | 40 | 24 | 36 | 2 SUs


  ### Submitting Batch Jobs with `sbatch` { #running-sbatch }
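
For reference, jobs are submitted against these limits with an ordinary Slurm batch script. The sketch below is illustrative only and not part of the commit: `myprogram` and `myproject` are placeholder names for an MPI executable and an allocation, and the script targets the `skx` queue from Table 8.

```bash
#!/bin/bash
#SBATCH -J skx-test          # job name (placeholder)
#SBATCH -p skx               # SKX production queue from Table 8
#SBATCH -N 2                 # nodes requested; skx allows up to 256 nodes per job
#SBATCH -n 96                # total MPI tasks (48 cores per SKX node)
#SBATCH -t 01:00:00          # wall-clock request, well under the 48 hr queue limit
#SBATCH -A myproject         # allocation to charge (placeholder)

ibrun ./myprogram            # ibrun is TACC's MPI launch wrapper; myprogram is a placeholder
```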

docs/hpc/6lonestar/notices.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
  # Lonestar6 User Guide
- *Last update: October 20, 2025*
+ *Last update: November 3, 2025*

  <!-- ## Notices { #notices } -->

docs/hpc/6lonestar/running.md

Lines changed: 13 additions & 18 deletions
@@ -14,9 +14,9 @@ The jobs in this queue consume 1/7 the resources of a full node. Jobs are charg
  #### Table 5. Production Queues { #table5 }

  !!! important
-     **Queue limits are subject to change without notice.**
-     TACC Staff will occasionally adjust the QOS settings in order to ensure fair scheduling for the entire user community.
-     Use TACC's `qlimits` utility to see the latest queue configurations.
+     **Queue limits are subject to change without notice.**
+     TACC staff may occasionally adjust queue <!--the QOS--> settings in order to ensure fair scheduling for the entire user community.
+     TACC's `qlimits` utility will display the latest queue configurations.

  <!--
  10/20/2025
@@ -29,8 +29,6 @@ gpu-a100 1 8 2-00:00:00 12 8 32
  gpu-a100-dev 1 2 02:00:00 2 1 3
  gpu-a100-small 1 1 2-00:00:00 3 3 12
  gpu-h100 1 1 2-00:00:00 1 1 4
- grace 1 64 2-00:00:00 75 20 100
- grace-serial 1 64 5-00:00:00 75 20 80
  large 65 256 2-00:00:00 256 1 4
  normal 1 64 2-00:00:00 75 20 100
  vm-small 1 1 2-00:00:00 4 4 16
@@ -45,19 +43,16 @@ large:1.0
  normal:1.0
  vm-small:0.143
  -->
-
-
-
- Queue Name | Min/Max Nodes per Job<br>(assoc'd cores)&#42; | Max Job<br>Duration | Max Nodes<br>per User | Max Jobs<br>per User | Charge Rate<br>(per node-hour)
- --- | --- | --- | --- | --- | ---
- <code>development</code> | 8 nodes<br>(1024 cores) | 2 hours | 8 | 1 | 1 SU
- <code>gpu-a100</code> | 8 nodes<br>(1024 cores) | 48 hours | 12 | 8 | 3 SUs
- <code>gpu-a100-dev</code> | 2 nodes<br>(256 cores) | 2 hours | 2 | 1 | 3 SUs
- <code>gpu-a100-small</code><sup>&#42;&#42;</sup> | 1 node | 48 hours | 3 | 3 | 1.5 SUs
- <code>gpu-h100</code> | 1 node | 48 hours | 1 | 1 | 6 SUs | (96 cores)
- <code>large</code><sup>&#42;</sup> | 65/256 nodes<br>(65536 cores) | 48 hours | 256 | 1 | 1 SU
- <code>normal</code> | 1/64 nodes<br>(8192 cores) | 48 hours | 75 | 20 | 1 SU
- <code>vm-small</code><sup>&#42;&#42;</sup> | 1 node<br>(16 cores) | 48 hours | 4 | 4 | 0.143 SU
+ Queue Name | Min/Max Nodes per Job<br>(assoc'd cores)&#42; | Max Job<br>Duration | Max Nodes<br>per User | Max Jobs<br>per User | Max Submit | Charge Rate<br>(per node-hour)
+ --- | --- | --- | --- | --- | --- | ---
+ <code>development</code> | 8 nodes<br>(1024 cores) | 2 hours | 8 | 1 | 3 | 1 SU
+ <code>gpu-a100</code> | 8 nodes<br>(1024 cores) | 48 hours | 12 | 8 | 32 | 3 SUs
+ <code>gpu-a100-dev</code> | 2 nodes<br>(256 cores) | 2 hours | 2 | 1 | 3 | 3 SUs
+ <code>gpu-a100-small</code><sup>&#42;&#42;</sup> | 1 node | 48 hours | 3 | 3 | 12 | 1.5 SUs
+ <code>gpu-h100</code> | 1 node<br>(96 cores) | 48 hours | 1 | 1 | 4 | 6 SUs
+ <code>large</code><sup>&#42;</sup> | 65/256 nodes<br>(65536 cores) | 48 hours | 256 | 1 | 4 | 1 SU
+ <code>normal</code> | 1/64 nodes<br>(8192 cores) | 48 hours | 75 | 20 | 100 | 1 SU
+ <code>vm-small</code><sup>&#42;&#42;</sup> | 1 node<br>(16 cores) | 48 hours | 4 | 4 | 16 | 0.143 SU


  &#42; Access to the `large` queue is restricted. To request more nodes than are available in the normal queue, submit a consulting (help desk) ticket through the TACC User Portal. Include in your request reasonable evidence of your readiness to run under the conditions you're requesting. In most cases this should include your own strong or weak scaling results from Lonestar6.
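
Because these limits are subject to change, it can help to compare the published table with the live configuration and with your own current usage before submitting. The snippet below is an illustrative sketch, not part of the commit, using TACC's `qlimits` utility and standard Slurm commands.

```bash
qlimits                                          # TACC utility: print the current queue limits
# Jobs you already have pending or running count toward the
# "Max Jobs per User" and "Max Submit" columns above:
squeue -u "$USER" -t PENDING,RUNNING -h | wc -l
```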

docs/hpc/frontera.md

Lines changed: 18 additions & 28 deletions
@@ -1,5 +1,5 @@
  # Frontera User Guide
- *Last update: October 22, 2025*
+ *Last update: November 3, 2025*

  <!-- **Important**: (10-15-2024) Please note [TACC's new SU charge policy](#sunotice). -->

@@ -732,16 +732,18 @@ Frontera's `flex` queue offers users a low cost queue for lower priority/node co

  !!! important
      **Queue limits are subject to change without notice.**
-     Frontera admins may occasionally adjust the QOS settings in order to ensure fair scheduling for the entire user community.
-     Use TACC's `qlimits` utility to see the latest queue configurations.
-
- Users are limited to a maximum of 50 running and 200 pending jobs in all queues at one time.
+     Frontera admins may occasionally adjust queue <!--the QOS--> settings in order to ensure fair scheduling for the entire user community.
+     TACC's `qlimits` utility will display the latest queue configurations.

  <!--
  10/20/2025
+ /usr/local/etc/queue.map
  frontera4(1)$ qlimits
  Current queue/partition limits on TACC's Frontera system:

+ The "running job limit" is the MaxJobsPU column. MaxJobsPU is the maximum number of jobs a user can have running simultaneously.
+ The "job submission limit" is the MaxSubmit column. The MaxSubmit limit is the maximum number of jobs a user can have in the queue.
+
  Name MinNode MaxNode PreemptExemptTime MaxWall MaxNodePU MaxJobsPU MaxSubmit
  flex 1 128 01:00:00 2-00:00:00 2048 15 60
  development 40 02:00:00 40 2 2
@@ -755,31 +757,19 @@ Current queue/partition limits on TACC's Frontera system:
  grace 30 5-00:00:00 30 30 100
  corral 512 2-00:00:00 2048 100 200
  gh 1 02:00:00 1 2 2
-
- /usr/local/etc/queue.map
-
- flex:0.8
- development:1.0
- normal:1.0
- large:1.0
- rtx:3.0
- rtx-dev:3.0
- nvdimm:2.0
- small:1.0
- rtx-corralextra:3.0
- gh:0.0
  -->

- | Queue Name | Min-Max Nodes per Job<br>(assoc'd cores) | Pre-empt<br>Exempt Time | Max Job Duration | Max Nodes per User | Max Jobs per User | Charge Rate<br>per node-hour
- | ------ | ----- | ---- | ---- | ---- | ---- | ----
- | <code>flex&#42;</code> | 1-128 nodes<br>(7,168 cores) | 1 hour | 48 hrs | 6400 nodes | 15 | .8 Service Units (SUs)
- | <code>development</code> | 1-40 nodes<br>(2,240 cores) | N/A | 2 hrs | 40 nodes | 1 | 1 SU
- | <code>normal</code> | 3-512 nodes<br>(28,672 cores) | N/A | 48 hrs | 1024 nodes | 75 | 1 SU
- | <code>large&#42;&#42;</code> | 513-2048 nodes<br>(114,688 cores) | N/A | 48 hrs | 3072 nodes | 1 | 1 SU
- | <code>rtx</code> | 16 nodes | N/A | 48 hrs | 32 nodes | 12 | 3 SUs
- | <code>rtx-dev</code> | 2 nodes | N/A | 2 hrs | 2 nodes | 1 | 3 SUs
- | <code>nvdimm</code> | 4 nodes | N/A | 48 hrs | 6 nodes | 3 | 2 SUs
- | <code>small</code> | 1-2 nodes | N/A | 48 hrs | 25 nodes | 15 | 1 SU
+ | Queue Name | Min-Max Nodes per Job<br>(assoc'd cores) | Pre-empt<br>Exempt Time | Max Job Duration | Max Nodes per User | Max Jobs per User | Max Submit | Charge Rate<br>per node-hour
+ | ------ | ----- | ---- | ---- | ---- | ---- | ---- | ----
+ | <code>flex&#42;</code> | 1-128 nodes<br>(7,168 cores) | 1 hour | 48 hrs | 6400 nodes | 15 | 60 | .8 Service Units (SUs)
+ | <code>development</code> | 1-40 nodes<br>(2,240 cores) | N/A | 2 hrs | 40 nodes | 1 | 2 | 1 SU
+ | <code>normal</code> | 3-512 nodes<br>(28,672 cores) | N/A | 48 hrs | 1024 nodes | 75 | 200 | 1 SU
+ | <code>large&#42;&#42;</code> | 513-2048 nodes<br>(114,688 cores) | N/A | 48 hrs | 3072 nodes | 1 | 8 | 1 SU
+ | <code>rtx</code> | 16 nodes | N/A | 48 hrs | 32 nodes | 12 | 36 | 3 SUs
+ | <code>rtx-dev</code> | 2 nodes | N/A | 2 hrs | 2 nodes | 1 | 2 | 3 SUs
+ | <code>nvdimm</code> | 4 nodes | N/A | 48 hrs | 6 nodes | 3 | 8 | 2 SUs
+ | <code>small</code> | 1-2 nodes | N/A | 48 hrs | 25 nodes | 15 | 80 | 1 SU
+


  &#42; **Jobs in the `flex` queue are charged less than jobs in other queues but are eligible for preemption after running for more than one hour.**
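
The charge-rate column is unchanged by this commit; a job's cost is still nodes × wall-clock hours × the queue's rate. A worked example with purely illustrative numbers (a 16-node, 10-hour run) is sketched below.

```bash
# Rough SU estimate: nodes x hours x charge rate (rates from the table above).
nodes=16; hours=10
echo "flex:   $(echo "$nodes * $hours * 0.8" | bc) SUs"   # flex is charged at 0.8 SU per node-hour
echo "normal: $(echo "$nodes * $hours * 1.0" | bc) SUs"   # normal is charged at 1 SU per node-hour
```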

docs/hpc/frontera/notices.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
  # Frontera User Guide
- *Last update: October 22, 2025*
+ *Last update: November 3, 2025*

  <!-- **Important**: (10-15-2024) Please note [TACC's new SU charge policy](#sunotice). -->

docs/hpc/frontera/running.md

Lines changed: 17 additions & 27 deletions
@@ -30,16 +30,18 @@ Frontera's `flex` queue offers users a low cost queue for lower priority/node co

  !!! important
      **Queue limits are subject to change without notice.**
-     Frontera admins may occasionally adjust the QOS settings in order to ensure fair scheduling for the entire user community.
-     Use TACC's `qlimits` utility to see the latest queue configurations.
-
- Users are limited to a maximum of 50 running and 200 pending jobs in all queues at one time.
+     Frontera admins may occasionally adjust queue <!--the QOS--> settings in order to ensure fair scheduling for the entire user community.
+     TACC's `qlimits` utility will display the latest queue configurations.

  <!--
  10/20/2025
+ /usr/local/etc/queue.map
  frontera4(1)$ qlimits
  Current queue/partition limits on TACC's Frontera system:

+ The "running job limit" is the MaxJobsPU column. MaxJobsPU is the maximum number of jobs a user can have running simultaneously.
+ The "job submission limit" is the MaxSubmit column. The MaxSubmit limit is the maximum number of jobs a user can have in the queue.
+
  Name MinNode MaxNode PreemptExemptTime MaxWall MaxNodePU MaxJobsPU MaxSubmit
  flex 1 128 01:00:00 2-00:00:00 2048 15 60
  development 40 02:00:00 40 2 2
@@ -53,31 +55,19 @@ Current queue/partition limits on TACC's Frontera system:
  grace 30 5-00:00:00 30 30 100
  corral 512 2-00:00:00 2048 100 200
  gh 1 02:00:00 1 2 2
-
- /usr/local/etc/queue.map
-
- flex:0.8
- development:1.0
- normal:1.0
- large:1.0
- rtx:3.0
- rtx-dev:3.0
- nvdimm:2.0
- small:1.0
- rtx-corralextra:3.0
- gh:0.0
  -->

- | Queue Name | Min-Max Nodes per Job<br>(assoc'd cores) | Pre-empt<br>Exempt Time | Max Job Duration | Max Nodes per User | Max Jobs per User | Charge Rate<br>per node-hour
- | ------ | ----- | ---- | ---- | ---- | ---- | ----
- | <code>flex&#42;</code> | 1-128 nodes<br>(7,168 cores) | 1 hour | 48 hrs | 6400 nodes | 15 | .8 Service Units (SUs)
- | <code>development</code> | 1-40 nodes<br>(2,240 cores) | N/A | 2 hrs | 40 nodes | 1 | 1 SU
- | <code>normal</code> | 3-512 nodes<br>(28,672 cores) | N/A | 48 hrs | 1024 nodes | 75 | 1 SU
- | <code>large&#42;&#42;</code> | 513-2048 nodes<br>(114,688 cores) | N/A | 48 hrs | 3072 nodes | 1 | 1 SU
- | <code>rtx</code> | 16 nodes | N/A | 48 hrs | 32 nodes | 12 | 3 SUs
- | <code>rtx-dev</code> | 2 nodes | N/A | 2 hrs | 2 nodes | 1 | 3 SUs
- | <code>nvdimm</code> | 4 nodes | N/A | 48 hrs | 6 nodes | 3 | 2 SUs
- | <code>small</code> | 1-2 nodes | N/A | 48 hrs | 25 nodes | 15 | 1 SU
+ | Queue Name | Min-Max Nodes per Job<br>(assoc'd cores) | Pre-empt<br>Exempt Time | Max Job Duration | Max Nodes per User | Max Jobs per User | Max Submit | Charge Rate<br>per node-hour
+ | ------ | ----- | ---- | ---- | ---- | ---- | ---- | ----
+ | <code>flex&#42;</code> | 1-128 nodes<br>(7,168 cores) | 1 hour | 48 hrs | 6400 nodes | 15 | 60 | .8 Service Units (SUs)
+ | <code>development</code> | 1-40 nodes<br>(2,240 cores) | N/A | 2 hrs | 40 nodes | 1 | 2 | 1 SU
+ | <code>normal</code> | 3-512 nodes<br>(28,672 cores) | N/A | 48 hrs | 1024 nodes | 75 | 200 | 1 SU
+ | <code>large&#42;&#42;</code> | 513-2048 nodes<br>(114,688 cores) | N/A | 48 hrs | 3072 nodes | 1 | 8 | 1 SU
+ | <code>rtx</code> | 16 nodes | N/A | 48 hrs | 32 nodes | 12 | 36 | 3 SUs
+ | <code>rtx-dev</code> | 2 nodes | N/A | 2 hrs | 2 nodes | 1 | 2 | 3 SUs
+ | <code>nvdimm</code> | 4 nodes | N/A | 48 hrs | 6 nodes | 3 | 8 | 2 SUs
+ | <code>small</code> | 1-2 nodes | N/A | 48 hrs | 25 nodes | 15 | 80 | 1 SU
+


  &#42; **Jobs in the `flex` queue are charged less than jobs in other queues but are eligible for preemption after running for more than one hour.**
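
Since Max Submit caps the number of jobs a user may have in the queue at one time, workflows that generate many jobs may want to throttle their own submissions. The loop below is a minimal sketch only; the limit value and `job.slurm` are placeholders to be replaced with the Max Submit value and batch script for your target queue.

```bash
MAX_SUBMIT=60                                    # placeholder: e.g. the flex queue's Max Submit value
while [ "$(squeue -u "$USER" -h | wc -l)" -ge "$MAX_SUBMIT" ]; do
    sleep 300                                    # wait for earlier jobs to leave the queue
done
sbatch job.slurm                                 # placeholder batch script
```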
