Python: latency improvements #3014
base: main
Conversation
Pull request overview
This PR introduces performance improvements to the Python Agent Framework based on profiling analysis. The changes focus on reducing computational overhead through strategic optimizations.
Key Changes
- Caching of JSON schema generation from Pydantic models to avoid repeated expensive serialization (see the schema-caching sketch after this list)
- Replacing `isinstance()` checks with faster string attribute comparisons for content type identification
- Reusing OpenTelemetry message representations for logging to eliminate redundant `to_dict()` calls
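A minimal sketch of the schema-caching idea from the first bullet, assuming a hypothetical `_cached_json_schema` helper and `functools.lru_cache` (the actual change in `_tools.py` may cache differently):

```python
from functools import lru_cache
from typing import Any

from pydantic import BaseModel


@lru_cache(maxsize=None)
def _cached_json_schema(model: type[BaseModel]) -> dict[str, Any]:
    """Hypothetical helper: compute model_json_schema() once per model class.

    Generating the JSON schema walks the entire Pydantic model definition,
    which is expensive to repeat on every tool call; since the schema depends
    only on the class, the result can be memoized.
    """
    return model.model_json_schema()
```

Because `lru_cache` hands back the same dict on every call, callers should treat the cached schema as read-only (or copy it before mutating).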
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| python/packages/core/agent_framework/observability.py | Optimizes message logging by reusing the OTEL message representation instead of calling to_dict() separately |
| python/packages/core/agent_framework/_types.py | Improves content type checking performance by using type attribute comparison before falling back to isinstance() |
| python/packages/core/agent_framework/_tools.py | Implements caching for model_json_schema() results and updates exclusion list for proper serialization |
```python
except AdditionItemMismatch:
    # Use type attribute check first (fast string comparison) before isinstance (slower)
    content_type = getattr(content, "type", None)
    if content_type == "function_call":
        ...  # handling continues in the diff
```
Can we use enums at least?
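For illustration, a `str`-backed enum (hypothetical names, not part of this diff) would keep the cheap string comparison while replacing the bare `"function_call"` literal:

```python
from enum import Enum


class ContentType(str, Enum):
    """Hypothetical enum of the content type strings used by the framework."""

    FUNCTION_CALL = "function_call"
    FUNCTION_RESULT = "function_result"
    TEXT = "text"


def is_function_call(content: object) -> bool:
    # Still a plain string comparison at runtime, since ContentType subclasses str,
    # but the magic string now lives behind a named constant.
    return getattr(content, "type", None) == ContentType.FUNCTION_CALL
```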
TaoChenOSU left a comment
nit: please also share the profiling results
9e97938 to f15dcdd
Motivation and Context
Used a profiling script and `cProfile` to figure out the biggest time losses in AF, which led to a few improvements:
- check `type` fields instead of using `isinstance`
- reuse the OTEL message representation instead of making a redundant `to_dict` call

Description
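The profiling script itself is not part of the diff; below is a minimal sketch of the kind of `cProfile` harness described above, with `run_agent_once` as a placeholder workload:

```python
import cProfile
import pstats


def run_agent_once() -> None:
    """Placeholder for the workload being profiled (e.g. one agent invocation)."""
    ...


profiler = cProfile.Profile()
profiler.enable()
run_agent_once()
profiler.disable()

# Sort by cumulative time to surface the biggest end-to-end time losses,
# e.g. repeated model_json_schema() calls or isinstance-heavy hot paths.
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats(25)
```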
Benchmark Comparison: Old vs New (Optimized) Code
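The benchmark script is not included in the PR; as a rough sketch, the import-latency scenarios listed below could be timed like this, each in a fresh interpreter so earlier imports do not warm `sys.modules` (module paths other than `agent_framework` are assumptions):

```python
import subprocess
import sys

MODULES = [
    "agent_framework",
    "agent_framework.azure",   # assumed path for the Azure OpenAI chat client
    "agent_framework.openai",  # assumed path for the OpenAI chat client
]

for module in MODULES:
    # Spawn a fresh interpreter per module so nothing is already imported.
    snippet = (
        "import time; t = time.perf_counter(); "
        f"import {module}; "
        "print(f'{time.perf_counter() - t:.3f}s')"
    )
    result = subprocess.run(
        [sys.executable, "-c", snippet], capture_output=True, text=True, check=True
    )
    print(f"import {module}: {result.stdout.strip()}")
```

The chat-client and agent scenarios would additionally need credentials and a model endpoint, so they are omitted from this sketch.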
*(The comparison table's values are not recoverable here; the benchmarked scenarios were: `import_agent_framework`, `import_azure_openai_chat_client`, `import_openai_chat_client`, `import_chat_agent`, `import_ai_function`, plus chat-client and agent runs across `streaming={False,True}` × `function_call={False,True}`, each listed twice, apparently with and without observability.)*

Key Findings
🎯 Biggest Wins (with function calls + observability):
📊 Observability Overhead Reduced:
⚡ Summary:
- `agent_streaming=True_function_call=False` is slightly slower (likely measurement variance)

The optimizations primarily benefit the hot paths: function calling and observability tracing, which are the most common real-world usage patterns!
Contribution Checklist