@@ -89,8 +89,10 @@ This makes dirty apps ideal for ML inference, where loading a model once and reu
8989 | | | | | |
9090 +---+--------+---+-------+---+
9191 |
92- All workers load all dirty apps
93- [MLApp, ImageApp, ...]
92+ Workers load apps based on allocation
93+ Worker 1: [MLApp, ImageApp, HeavyApp]
94+ Worker 2: [MLApp, ImageApp, HeavyApp]
95+ Worker 3: [MLApp, ImageApp] (HeavyApp workers=2)
9496```
9597
9698### Process Relationships
@@ -138,6 +140,133 @@ gunicorn myapp:app \
138140| `dirty_threads` | `1` | Threads per dirty worker |
139141| `dirty_graceful_timeout` | `30` | Graceful shutdown timeout |
140142
143+ ## Per-App Worker Allocation
144+
145+ By default, all dirty workers load all configured apps. For apps that consume significant memory (like large ML models), you can limit how many workers load a specific app.
146+
147+ ### Why Per-App Allocation?
148+
149+ Consider a scenario with a 10GB ML model and 8 dirty workers:
150+
151+ - **Default behavior**: 8 workers × 10GB = 80GB RAM
152+ - **With `workers=2`**: 2 workers × 10GB = 20GB RAM (75% savings)
153+
154+ Requests for the limited app are routed only to workers that have it loaded.
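
The memory figures above can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: `dirty_app_memory_gb` is a hypothetical helper, not part of gunicorn, and clamping the limit to the number of dirty workers is an assumption.

```python
def dirty_app_memory_gb(model_gb, dirty_workers, app_workers=None):
    """Estimate total RAM used by one app across dirty workers.

    Hypothetical helper for illustration only. Assumes a limit larger
    than the worker count is clamped to the worker count.
    """
    loaded_on = dirty_workers if app_workers is None else min(app_workers, dirty_workers)
    return model_gb * loaded_on


default_gb = dirty_app_memory_gb(10, 8)                 # every worker loads the model
limited_gb = dirty_app_memory_gb(10, 8, app_workers=2)  # only 2 workers load it
savings = 1 - limited_gb / default_gb

print(default_gb, limited_gb, f"{savings:.0%}")  # 80 20 75%
```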
155+
156+ ### Configuration Methods
157+
158+ **Method 1: Class Attribute**
159+
160+ Set the `workers` attribute on your DirtyApp class:
161+
162+ ```python
163+ from gunicorn.dirty import DirtyApp
164+
165+ class HeavyModelApp(DirtyApp):
166+     workers = 2  # Only 2 workers will load this app
167+
168+     def init(self):
169+         self.model = load_10gb_model()
170+
171+     def predict(self, data):
172+         return self.model.predict(data)
173+
174+     def close(self):
175+         pass
176+ ```
177+
178+ **Method 2: Config Override**
179+
180+ Use the `module:class:N` format in your config:
181+
182+ ```python
183+ # gunicorn.conf.py
184+ dirty_apps = [
185+     "myapp.light:LightApp",         # All workers (default)
186+     "myapp.heavy:HeavyModelApp:2",  # Only 2 workers
187+     "myapp.single:SingletonApp:1",  # Only 1 worker
188+ ]
189+ dirty_workers = 4
190+ ```
191+
192+ Config overrides take precedence over class attributes.
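
To make the `module:class:N` format and the precedence rule concrete, here is a minimal sketch of how such an entry could be split and resolved. Both `parse_dirty_app_spec` and `effective_limit` are hypothetical names chosen for illustration; gunicorn's actual parsing may differ.

```python
def parse_dirty_app_spec(spec):
    """Split 'module:Class' or 'module:Class:N' into (import_path, limit).

    Hypothetical helper; not gunicorn's own parser.
    """
    parts = spec.split(":")
    if len(parts) == 2:
        return spec, None  # no explicit limit in the config entry
    if len(parts) == 3:
        module, cls, count = parts
        return f"{module}:{cls}", int(count)
    raise ValueError(f"unrecognized dirty app spec: {spec!r}")


def effective_limit(config_limit, class_limit):
    """Config override wins; otherwise fall back to the class attribute."""
    return config_limit if config_limit is not None else class_limit


path, config_limit = parse_dirty_app_spec("myapp.heavy:HeavyModelApp:2")
print(path, effective_limit(config_limit, class_limit=4))
```

In this sketch a class declaring `workers = 4` is still limited to 2 workers, because the config entry carries its own count.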
193+
194+ ### Worker Distribution
195+
196+ When workers spawn, apps are assigned based on their limits:
197+
198+ ```
199+ Example with dirty_workers=4:
200+ - LightApp (workers=None): Loaded on workers 1, 2, 3, 4
201+ - HeavyModelApp (workers=2): Loaded on workers 1, 2
202+ - SingletonApp (workers=1): Loaded on worker 1
203+
204+ Worker 1: [LightApp, HeavyModelApp, SingletonApp]
205+ Worker 2: [LightApp, HeavyModelApp]
206+ Worker 3: [LightApp]
207+ Worker 4: [LightApp]
208+ ```
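
The distribution shown above is consistent with a simple rule: each app goes to the first N workers, where N is its limit. The sketch below reproduces that table under this assumption; `assign_apps` is a hypothetical function, and the real arbiter may pick workers differently.

```python
def assign_apps(dirty_workers, app_limits):
    """Map worker number -> list of apps, filling the lowest-numbered workers first.

    Assumption for illustration: a limit of None means "all workers" and
    limits above the worker count are clamped.
    """
    assignment = {worker: [] for worker in range(1, dirty_workers + 1)}
    for app, limit in app_limits.items():
        count = dirty_workers if limit is None else min(limit, dirty_workers)
        for worker in range(1, count + 1):
            assignment[worker].append(app)
    return assignment


layout = assign_apps(4, {"LightApp": None, "HeavyModelApp": 2, "SingletonApp": 1})
for worker, apps in layout.items():
    print(f"Worker {worker}: {apps}")
```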
209+
210+ ### Request Routing
211+
212+ Requests are automatically routed to workers that have the target app:
213+
214+ ```python
215+ client = get_dirty_client()
216+
217+ # Goes to any of 4 workers (round-robin)
218+ client.execute("myapp.light:LightApp", "action")
219+
220+ # Goes to worker 1 or 2 only (round-robin between those)
221+ client.execute("myapp.heavy:HeavyModelApp", "predict", data)
222+
223+ # Always goes to worker 1
224+ client.execute("myapp.single:SingletonApp", "process")
225+ ```
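
The per-app round-robin described above can be pictured as one rotating cursor per app, restricted to the workers that loaded it. The following is a simplified sketch, not gunicorn's router: `AppRouter` is a hypothetical class, and it raises a plain `LookupError` where the real client would raise its own exception.

```python
from itertools import cycle


class AppRouter:
    """Round-robin over the workers that have each app loaded (illustrative only)."""

    def __init__(self, workers_by_app):
        # One independent cycle per app; apps with no workers get no entry.
        self._cycles = {app: cycle(ids) for app, ids in workers_by_app.items() if ids}

    def pick(self, app_path):
        if app_path not in self._cycles:
            raise LookupError(f"no workers available for {app_path}")
        return next(self._cycles[app_path])


router = AppRouter({
    "myapp.light:LightApp": [1, 2, 3, 4],
    "myapp.heavy:HeavyModelApp": [1, 2],
    "myapp.single:SingletonApp": [1],
})
heavy_picks = [router.pick("myapp.heavy:HeavyModelApp") for _ in range(3)]
print(heavy_picks)  # [1, 2, 1]
```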
226+
227+ ### Error Handling
228+
229+ If no workers have the requested app loaded, a `DirtyNoWorkersAvailableError` is raised:
230+
231+ ```python
232+ from gunicorn.dirty import get_dirty_client
233+ from gunicorn.dirty.errors import DirtyNoWorkersAvailableError
234+
235+ def my_view(request):
236+     client = get_dirty_client()
237+     try:
238+         result = client.execute("myapp.heavy:HeavyModelApp", "predict", data)
239+     except DirtyNoWorkersAvailableError as e:
240+         # All workers with this app are down or app not configured
241+         return {"error": "Service temporarily unavailable", "app": e.app_path}
242+ ```
243+
244+ ### Worker Crash Recovery
245+
246+ When a worker crashes, its replacement gets the **same apps** as the dead worker:
247+
248+ ```
249+ Timeline:
250+ t=0: Worker 1 crashes (had HeavyModelApp)
251+ t=1: Arbiter detects crash, queues respawn
252+ t=2: New Worker 5 spawns with same apps as Worker 1
253+ t=0 to t=2: HeavyModelApp remains available on Worker 2 during the gap
254+ ```
255+
256+ This ensures:
257+
258+ - No memory redistribution on existing workers
259+ - Predictable replacement behavior
260+ - Only the replacement worker loads the heavy model; no surviving worker picks it up
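
The recovery behavior can be sketched as a pure bookkeeping step: the new worker takes over the dead worker's app list and every other worker is left alone. `replace_worker` is a hypothetical function written for this illustration, and the worker numbers follow the timeline above.

```python
def replace_worker(assignment, dead_id, new_id):
    """Return a new assignment where new_id inherits dead_id's apps.

    Illustrative only: surviving workers keep exactly the apps they had.
    """
    updated = {wid: list(apps) for wid, apps in assignment.items() if wid != dead_id}
    updated[new_id] = list(assignment[dead_id])
    return updated


before = {
    1: ["LightApp", "HeavyModelApp", "SingletonApp"],
    2: ["LightApp", "HeavyModelApp"],
    3: ["LightApp"],
    4: ["LightApp"],
}
after = replace_worker(before, dead_id=1, new_id=5)
print(after[5])
```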
261+
262+ ### Best Practices
263+
264+ 1. **Set realistic limits** - Don't set `workers=1` unless truly necessary (single point of failure)
265+ 2. **Monitor memory** - Track per-worker memory to tune allocation
266+ 3. **Handle unavailability** - Catch `DirtyNoWorkersAvailableError` gracefully
267+ 4. **Use class attributes for app-specific limits** - Makes the limit part of the app definition
268+ 5. **Use config for deployment-specific overrides** - Different limits for dev vs prod
269+
141270## Creating a Dirty App
142271
143272Dirty apps inherit from `DirtyApp` and implement three methods:
@@ -190,8 +319,9 @@ class MLApp(DirtyApp):
190319
191320### DirtyApp Interface
192321
193- | Method | Description |
194- | --------| -------------|
322+ | Method/Attribute | Description |
323+ | ------------------| -------------|
324+ | `workers` | Class attribute. Number of workers to load this app (`None` = all workers). |
195325| `init()` | Called once when dirty worker starts, after instantiation. Load resources here. |
196326| `__call__(action, *args, **kwargs)` | Handle requests from HTTP workers. |
197327| `close()` | Called when dirty worker shuts down. Cleanup resources. |
@@ -604,12 +734,13 @@ watch -n 1 'pstree -p $(cat gunicorn.pid)'
604734The dirty client raises specific exceptions:
605735
606736```python
607- from gunicorn.dirty import (
737+ from gunicorn.dirty.errors import (
608738    DirtyError,
609739    DirtyTimeoutError,
610740    DirtyConnectionError,
611741    DirtyAppError,
612742    DirtyAppNotFoundError,
743+     DirtyNoWorkersAvailableError,
613744)
614745
615746try:
@@ -620,6 +751,9 @@ except DirtyTimeoutError:
620751except DirtyAppNotFoundError:
621752    # App not loaded in dirty workers
622753    pass
754+ except DirtyNoWorkersAvailableError as e:
755+     # No workers have this app (all crashed or app limited to 0 workers)
756+     print(f"No workers for app: {e.app_path}")
623757except DirtyAppError as e:
624758    # Error during app execution
625759    print(f"App error: {e.message}, traceback: {e.traceback}")