Commit d5ab5dc

Merge pull request #3473 from benoitc/feature/per-app-worker-allocation
feat(dirty): add per-app worker allocation for memory optimization
2 parents 1591a6c + d563a7e commit d5ab5dc

19 files changed: +2202 -33 lines changed

docs/content/2026-news.md

Lines changed: 9 additions & 0 deletions
@@ -16,6 +16,15 @@
   - Lifecycle hooks: `on_dirty_starting`, `dirty_post_fork`,
     `dirty_worker_init`, `dirty_worker_exit`
 
+- **Per-App Worker Allocation for Dirty Arbiters**: Control how many dirty workers
+  load each app for memory optimization with heavy models
+  ([PR #3473](https://github.com/benoitc/gunicorn/pull/3473))
+  - Set `workers` class attribute on DirtyApp (e.g., `workers = 2`)
+  - Or use config format `module:class:N` (e.g., `myapp:HeavyModel:2`)
+  - Requests automatically routed to workers with the target app
+  - New exception `DirtyNoWorkersAvailableError` for graceful error handling
+  - Example: 8 workers × 10GB model = 80GB → with `workers=2`: 20GB (75% savings)
+
 - **HTTP/2 Support (Beta)**: Native HTTP/2 (RFC 7540) support for improved performance
   with modern clients ([PR #3468](https://github.com/benoitc/gunicorn/pull/3468))
   - Multiplexed streams over a single connection

docs/content/dirty.md

Lines changed: 139 additions & 5 deletions
@@ -89,8 +89,10 @@ This makes dirty apps ideal for ML inference, where loading a model once and reu
 | | | | | |
 +---+--------+---+-------+---+
 |
-All workers load all dirty apps
-[MLApp, ImageApp, ...]
+Workers load apps based on allocation
+Worker 1: [MLApp, ImageApp, HeavyApp]
+Worker 2: [MLApp, ImageApp, HeavyApp]
+Worker 3: [MLApp, ImageApp] (HeavyApp workers=2)
 ```
 
 ### Process Relationships
@@ -138,6 +140,133 @@ gunicorn myapp:app \
 | `dirty_threads` | `1` | Threads per dirty worker |
 | `dirty_graceful_timeout` | `30` | Graceful shutdown timeout |
 
+## Per-App Worker Allocation
+
+By default, all dirty workers load all configured apps. For apps that consume significant memory (like large ML models), you can limit how many workers load a specific app.
+
+### Why Per-App Allocation?
+
+Consider a scenario with a 10GB ML model and 8 dirty workers:
+
+- **Default behavior**: 8 workers × 10GB = 80GB RAM
+- **With `workers=2`**: 2 workers × 10GB = 20GB RAM (75% savings)
+
+Requests for the limited app are routed only to workers that have it loaded.
+
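The memory figures quoted above can be checked with a few lines of Python. This is only a sketch: `app_memory_gb` is a hypothetical helper, not part of gunicorn, and clamping the limit to the pool size is an assumption.

```python
def app_memory_gb(model_gb, dirty_workers, app_workers=None):
    """Total RAM one app uses across the dirty worker pool.

    app_workers=None mirrors the documented default (every worker loads the app).
    Clamping app_workers to dirty_workers is an assumption of this sketch.
    """
    loaded_on = dirty_workers if app_workers is None else min(app_workers, dirty_workers)
    return model_gb * loaded_on


default_gb = app_memory_gb(10, dirty_workers=8)                 # 8 workers x 10 GB
limited_gb = app_memory_gb(10, dirty_workers=8, app_workers=2)  # 2 workers x 10 GB
savings = 1 - limited_gb / default_gb

print(default_gb, limited_gb, f"{savings:.0%}")
```

The saving scales with the ratio of limited workers to total workers, so the same limit saves proportionally more in a larger pool.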
+### Configuration Methods
+
+**Method 1: Class Attribute**
+
+Set the `workers` attribute on your DirtyApp class:
+
+```python
+from gunicorn.dirty import DirtyApp
+
+class HeavyModelApp(DirtyApp):
+    workers = 2  # Only 2 workers will load this app
+
+    def init(self):
+        self.model = load_10gb_model()
+
+    def predict(self, data):
+        return self.model.predict(data)
+
+    def close(self):
+        pass
+```
+
+**Method 2: Config Override**
+
+Use the `module:class:N` format in your config:
+
+```python
+# gunicorn.conf.py
+dirty_apps = [
+    "myapp.light:LightApp",  # All workers (default)
+    "myapp.heavy:HeavyModelApp:2",  # Only 2 workers
+    "myapp.single:SingletonApp:1",  # Only 1 worker
+]
+dirty_workers = 4
+```
+
+Config overrides take precedence over class attributes.
+
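A rough sketch of how the two spec formats and the precedence rule could be handled. `parse_dirty_app_spec` and `effective_limit` are hypothetical names used for illustration, not gunicorn's parser, and rejecting negative limits is an assumption.

```python
def parse_dirty_app_spec(spec):
    """Split 'module:Class' or 'module:Class:N' into (import_path, limit).

    Hypothetical helper. limit is None when the spec has no ':N' suffix.
    """
    parts = spec.split(":")
    if len(parts) == 2:
        return spec, None
    if len(parts) == 3:
        limit = int(parts[2])  # raises ValueError for a non-integer suffix
        if limit < 0:          # assumption: negative limits are rejected
            raise ValueError(f"worker limit must not be negative: {spec!r}")
        return f"{parts[0]}:{parts[1]}", limit
    raise ValueError(f"expected 'module:Class' or 'module:Class:N': {spec!r}")


def effective_limit(config_limit, class_limit):
    """Config override wins over the class attribute when both are set."""
    return config_limit if config_limit is not None else class_limit


print(parse_dirty_app_spec("myapp.heavy:HeavyModelApp:2"))
print(effective_limit(config_limit=2, class_limit=4))
```

Keeping parsing separate from precedence means the class attribute is only consulted once the app class has been imported.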
+### Worker Distribution
+
+When workers spawn, apps are assigned based on their limits:
+
+```
+Example with dirty_workers=4:
+- LightApp (workers=None): Loaded on workers 1, 2, 3, 4
+- HeavyModelApp (workers=2): Loaded on workers 1, 2
+- SingletonApp (workers=1): Loaded on worker 1
+
+Worker 1: [LightApp, HeavyModelApp, SingletonApp]
+Worker 2: [LightApp, HeavyModelApp]
+Worker 3: [LightApp]
+Worker 4: [LightApp]
+```
+
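The layout in the example is consistent with a simple "first N workers" assignment. The sketch below reproduces it under that assumption; `assign_apps` is a hypothetical helper, and gunicorn's arbiter may assign apps differently.

```python
def assign_apps(dirty_workers, app_limits):
    """Map worker number (1-based) to the list of app names it loads.

    app_limits maps app name -> limit, where None means all workers.
    Assumes a limited app goes to the first N workers, as in the example.
    """
    assignment = {worker: [] for worker in range(1, dirty_workers + 1)}
    for app, limit in app_limits.items():
        count = dirty_workers if limit is None else min(limit, dirty_workers)
        for worker in range(1, count + 1):
            assignment[worker].append(app)
    return assignment


layout = assign_apps(4, {"LightApp": None, "HeavyModelApp": 2, "SingletonApp": 1})
for worker, apps in layout.items():
    print(f"Worker {worker}: {apps}")
```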
+### Request Routing
+
+Requests are automatically routed to workers that have the target app:
+
+```python
+client = get_dirty_client()
+
+# Goes to any of 4 workers (round-robin)
+client.execute("myapp.light:LightApp", "action")
+
+# Goes to worker 1 or 2 only (round-robin between those)
+client.execute("myapp.heavy:HeavyModelApp", "predict", data)
+
+# Always goes to worker 1
+client.execute("myapp.single:SingletonApp", "process")
+```
+
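A minimal model of per-app round-robin, assuming each app keeps its own rotation over the workers that loaded it. `AppRouter` and `NoWorkersAvailable` are stand-ins for illustration, not gunicorn classes.

```python
import itertools


class NoWorkersAvailable(Exception):
    """Stand-in for DirtyNoWorkersAvailableError; not the gunicorn class."""

    def __init__(self, app_path):
        super().__init__(f"no workers available for {app_path}")
        self.app_path = app_path


class AppRouter:
    """Round-robin over the workers that have a given app loaded."""

    def __init__(self, workers_by_app):
        self._workers_by_app = workers_by_app  # app path -> list of worker ids
        self._cycles = {}                      # app path -> endless iterator

    def pick(self, app_path):
        workers = self._workers_by_app.get(app_path)
        if not workers:
            raise NoWorkersAvailable(app_path)
        if app_path not in self._cycles:
            self._cycles[app_path] = itertools.cycle(workers)
        return next(self._cycles[app_path])


router = AppRouter({
    "myapp.heavy:HeavyModelApp": [1, 2],
    "myapp.single:SingletonApp": [1],
})
picks = [router.pick("myapp.heavy:HeavyModelApp") for _ in range(4)]
print(picks)
```

A separate rotation per app keeps traffic for a lightweight app from skewing the order in which the heavy app's workers are chosen.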
+### Error Handling
+
+If no workers have the requested app loaded, a `DirtyNoWorkersAvailableError` is raised:
+
+```python
+from gunicorn.dirty import get_dirty_client
+from gunicorn.dirty.errors import DirtyNoWorkersAvailableError
+
+def my_view(request):
+    client = get_dirty_client()
+    try:
+        result = client.execute("myapp.heavy:HeavyModelApp", "predict", data)
+    except DirtyNoWorkersAvailableError as e:
+        # All workers with this app are down or app not configured
+        return {"error": "Service temporarily unavailable", "app": e.app_path}
+```
+
+### Worker Crash Recovery
+
+When a worker crashes, its replacement gets the **same apps** as the dead worker:
+
+```
+Timeline:
+t=0: Worker 1 crashes (had HeavyModelApp)
+t=1: Arbiter detects crash, queues respawn
+t=2: New Worker 5 spawns with same apps as Worker 1
+t=3: HeavyModelApp still available on Worker 2 during gap
+```
+
+This ensures:
+
+- No memory redistribution on existing workers
+- Predictable replacement behavior
+- The heavy model is only loaded on the new worker
+
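The replacement rule can be modelled as a pure function over the worker-to-apps mapping. `respawn` is a hypothetical helper, not gunicorn code, and the worker numbers follow the timeline above.

```python
def respawn(assignment, dead_worker, new_worker):
    """Return a new mapping in which new_worker inherits dead_worker's apps.

    Surviving workers keep their app lists unchanged, so nothing is
    reloaded or redistributed on them.
    """
    updated = {w: list(apps) for w, apps in assignment.items() if w != dead_worker}
    updated[new_worker] = list(assignment[dead_worker])
    return updated


before = {
    1: ["LightApp", "HeavyModelApp"],
    2: ["LightApp", "HeavyModelApp"],
    3: ["LightApp"],
    4: ["LightApp"],
}
after = respawn(before, dead_worker=1, new_worker=5)
print(after)
```

In this model the number of workers holding `HeavyModelApp` returns to two after the respawn, and worker 2 carries the app alone while the replacement starts.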
+### Best Practices
+
+1. **Set realistic limits** - Don't set `workers=1` unless truly necessary (single point of failure)
+2. **Monitor memory** - Track per-worker memory to tune allocation
+3. **Handle unavailability** - Catch `DirtyNoWorkersAvailableError` gracefully
+4. **Use class attributes for app-specific limits** - Makes the limit part of the app definition
+5. **Use config for deployment-specific overrides** - Different limits for dev vs prod
+
 ## Creating a Dirty App
 
 Dirty apps inherit from `DirtyApp` and implement three methods:
@@ -190,8 +319,9 @@ class MLApp(DirtyApp):
 
 ### DirtyApp Interface
 
-| Method | Description |
-|--------|-------------|
+| Method/Attribute | Description |
+|------------------|-------------|
+| `workers` | Class attribute. Number of workers to load this app (`None` = all workers). |
 | `init()` | Called once when dirty worker starts, after instantiation. Load resources here. |
 | `__call__(action, *args, **kwargs)` | Handle requests from HTTP workers. |
 | `close()` | Called when dirty worker shuts down. Cleanup resources. |
@@ -604,12 +734,13 @@ watch -n 1 'pstree -p $(cat gunicorn.pid)'
 The dirty client raises specific exceptions:
 
 ```python
-from gunicorn.dirty import (
+from gunicorn.dirty.errors import (
     DirtyError,
     DirtyTimeoutError,
     DirtyConnectionError,
     DirtyAppError,
     DirtyAppNotFoundError,
+    DirtyNoWorkersAvailableError,
 )
 
 try:
@@ -620,6 +751,9 @@ except DirtyTimeoutError:
 except DirtyAppNotFoundError:
     # App not loaded in dirty workers
     pass
+except DirtyNoWorkersAvailableError as e:
+    # No workers have this app (all crashed or app limited to 0 workers)
+    print(f"No workers for app: {e.app_path}")
 except DirtyAppError as e:
     # Error during app execution
     print(f"App error: {e.message}, traceback: {e.traceback}")

docs/content/reference/settings.md

Lines changed: 27 additions & 3 deletions
@@ -208,24 +208,48 @@ DirtyArbiter and the exiting DirtyWorker.
 
 Dirty applications to load in the dirty worker pool.
 
-A list of application paths in pattern ``$(MODULE_NAME):$(CLASS_NAME)``.
+A list of application paths in one of these formats:
+
+- ``$(MODULE_NAME):$(CLASS_NAME)`` - all workers load this app
+- ``$(MODULE_NAME):$(CLASS_NAME):$(N)`` - only N workers load this app
+
 Each dirty app must be a class that inherits from ``DirtyApp`` base class
 and implements the ``init()``, ``__call__()``, and ``close()`` methods.
 
 Example::
 
     dirty_apps = [
-        "myapp.ml:MLApp",
-        "myapp.images:ImageApp",
+        "myapp.ml:MLApp", # All workers load this
+        "myapp.images:ImageApp", # All workers load this
+        "myapp.heavy:HugeModel:2", # Only 2 workers load this
     ]
 
+The per-app worker limit is useful for memory-intensive applications
+like large ML models. Instead of all 8 workers loading a 10GB model
+(80GB total), you can limit it to 2 workers (20GB total).
+
+Alternatively, you can set the ``workers`` class attribute on your
+DirtyApp subclass::
+
+    class HugeModelApp(DirtyApp):
+        workers = 2 # Only 2 workers load this app
+
+        def init(self):
+            self.model = load_10gb_model()
+
+Note: The config format (``module:Class:N``) takes precedence over
+the class attribute if both are specified.
+
 Dirty apps are loaded once when the dirty worker starts and persist
 in memory for the lifetime of the worker. This is ideal for loading
 ML models, database connection pools, or other stateful resources
 that are expensive to initialize.
 
 !!! info "Added in 25.0.0"
 
+!!! info "Changed in 25.1.0"
+    Added per-app worker allocation via ``:N`` format suffix.
+
 ### `dirty_workers`
 
 **Command line:** `--dirty-workers INT`

gunicorn/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 # This file is part of gunicorn released under the MIT license.
 # See the NOTICE for more information.
 
-version_info = (24, 1, 1)
+version_info = (25, 0, 0)
 __version__ = ".".join([str(v) for v in version_info])
 SERVER = "gunicorn"
 SERVER_SOFTWARE = "%s/%s" % (SERVER, __version__)

gunicorn/config.py

Lines changed: 27 additions & 3 deletions
@@ -2885,23 +2885,47 @@ class DirtyApps(Setting):
     desc = """\
         Dirty applications to load in the dirty worker pool.
 
-        A list of application paths in pattern ``$(MODULE_NAME):$(CLASS_NAME)``.
+        A list of application paths in one of these formats:
+
+        - ``$(MODULE_NAME):$(CLASS_NAME)`` - all workers load this app
+        - ``$(MODULE_NAME):$(CLASS_NAME):$(N)`` - only N workers load this app
+
         Each dirty app must be a class that inherits from ``DirtyApp`` base class
         and implements the ``init()``, ``__call__()``, and ``close()`` methods.
 
         Example::
 
             dirty_apps = [
-                "myapp.ml:MLApp",
-                "myapp.images:ImageApp",
+                "myapp.ml:MLApp", # All workers load this
+                "myapp.images:ImageApp", # All workers load this
+                "myapp.heavy:HugeModel:2", # Only 2 workers load this
             ]
 
+        The per-app worker limit is useful for memory-intensive applications
+        like large ML models. Instead of all 8 workers loading a 10GB model
+        (80GB total), you can limit it to 2 workers (20GB total).
+
+        Alternatively, you can set the ``workers`` class attribute on your
+        DirtyApp subclass::
+
+            class HugeModelApp(DirtyApp):
+                workers = 2 # Only 2 workers load this app
+
+                def init(self):
+                    self.model = load_10gb_model()
+
+        Note: The config format (``module:Class:N``) takes precedence over
+        the class attribute if both are specified.
+
         Dirty apps are loaded once when the dirty worker starts and persist
         in memory for the lifetime of the worker. This is ideal for loading
         ML models, database connection pools, or other stateful resources
         that are expensive to initialize.
 
         .. versionadded:: 25.0.0
+
+        .. versionchanged:: 25.1.0
+           Added per-app worker allocation via ``:N`` format suffix.
         """
 
 
