Performance Issues with Server-side Multi-instance Deployment
Summary
When deploying Claude Agent SDK in a server-side environment with multiple instances, we're experiencing significant performance bottlenecks that impact production use cases.
Issues
1. Slow Initialization for Each Instance
Every time we create a new SDK instance, the initialization process is extremely slow (20-30+ seconds based on related issues #2166, #3044). In a server environment handling multiple concurrent requests, this startup time is prohibitive.
Impact:
- High latency for initial requests
- Poor user experience
- Resource waste waiting for initialization
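For anyone who wants to reproduce the startup numbers, here is a rough timing sketch. `time_startup` and `slow_factory` are hypothetical names of ours, not SDK APIs, and `slow_factory` only sleeps as a stand-in; swap in whatever callable actually constructs and connects a client in your setup.

```python
import time


def time_startup(factory, runs=3):
    """Return the wall-clock seconds taken by each call to factory()."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        factory()
        durations.append(time.perf_counter() - start)
    return durations


def slow_factory():
    # Placeholder for real client construction (assumption): sleeps
    # instead of starting the SDK.
    time.sleep(0.05)
    return object()


durations = time_startup(slow_factory)
print(f"min={min(durations):.3f}s max={max(durations):.3f}s")
```

Measuring several runs separates a one-off cold start from a cost that is paid on every instance creation.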
2. Process Resource Consumption with Multiple Sessions
When managing multiple active sessions simultaneously, the resource consumption (memory, CPU) accumulates significantly, leading to:
- Increased server costs
- Potential memory leaks or performance degradation over time (similar to #10881)
- Difficulty in horizontal scaling
3. No Way to Maintain "Warm" Instances
Currently, there's no official mechanism to:
- Keep SDK processes in a "warm" state between requests
- Reuse initialized instances across different sessions
- Dynamically switch tool/working directories for a running instance
Feature Request: Dynamic Tool Directory Switching
To address these performance issues, we would like to request a feature that allows dynamic switching of tool directories for an already-initialized SDK instance.
Proposed Solution:
# Initialize once (warm instance)
client = ClaudeSDKClient()
# Dynamically switch working directory per session
client.set_tool_directory("/path/to/session1/tools")
response1 = client.send_message("Task for session 1")
# Reuse same instance for different session
client.set_tool_directory("/path/to/session2/tools")
response2 = client.send_message("Task for session 2")
Benefits:
- Instance Pooling: Create a pool of pre-warmed SDK instances that can be reused
- Reduced Latency: Eliminate 20-30s initialization delay for each request
- Better Resource Utilization: Maintain fewer processes while handling more sessions
- Improved Scalability: Enable efficient horizontal scaling in server environments
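To make the pooling idea concrete, here is a minimal sketch of what we have in mind. Everything in it is an assumption: `FakeClient` is a stand-in for the SDK client, `set_tool_directory` and `send_message` are the proposed methods from this request (they do not exist today), and `WarmPool` is a hypothetical helper of ours.

```python
import queue
from contextlib import contextmanager


class FakeClient:
    """Stand-in for an SDK client; set_tool_directory is the PROPOSED API."""

    def __init__(self):
        self.tool_directory = None

    def set_tool_directory(self, path):
        self.tool_directory = path

    def send_message(self, text):
        return f"[{self.tool_directory}] {text}"


class WarmPool:
    """Fixed-size pool of pre-initialized clients (hypothetical helper)."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())  # pay the startup cost once, up front

    @contextmanager
    def session(self, tool_directory, timeout=None):
        client = self._idle.get(timeout=timeout)  # blocks until one is free
        try:
            client.set_tool_directory(tool_directory)
            yield client
        finally:
            self._idle.put(client)  # hand the client back for reuse


pool = WarmPool(FakeClient, size=2)
with pool.session("/path/to/session1/tools") as client:
    reply = client.send_message("Task for session 1")
```

With something like this, a request only waits for a free warm client rather than a full initialization. Note that the sketch does not clear conversation state between sessions, which a real implementation would need to handle.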
Current Workarounds Attempted
- Creating instances on-demand: Too slow (20-30s per instance)
- Keeping long-lived instances: Resource consumption becomes problematic
- One instance per session: Not scalable for high-concurrency scenarios
Alternative Suggestions
If dynamic directory switching is not feasible, any of the following would also help:
- Significantly faster initialization (< 1s target)
- Built-in instance pooling/warm-up mechanism
- Lightweight "reset" method to reuse instances for different contexts
- Better process lifecycle management APIs
Use Case
Our server handles multiple users concurrently, each potentially starting new coding tasks. We need to:
- Respond quickly to initial requests (< 2s target)
- Maintain reasonable resource usage
- Scale horizontally as user load increases
Currently, the SDK's architecture seems optimized for single-user CLI usage rather than multi-tenant server deployments.
Question for Maintainers
Is server-side multi-instance deployment a supported use case? If so, what are the recommended patterns for:
- Instance lifecycle management
- Resource optimization
- Minimizing initialization overhead
Related Issues
- #2166 - Initialization extremely slow
- #3044 - SDK mode startup time
- #10881 - Performance degradation in long sessions
We would appreciate any guidance or roadmap information on improving server-side deployment patterns. Thank you!