feat(openai): support token metrics for completions and chat completions when not provided #4366
Conversation
Overall package size
Self size: 6.6 MB
Dependency sizes
🤖 This report was automatically generated by heaviest-objects-in-the-universe
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@ Coverage Diff @@
## master #4366 +/- ##
==========================================
- Coverage 85.01% 80.42% -4.59%
==========================================
Files 257 3 -254
Lines 11361 373 -10988
Branches 33 33
==========================================
- Hits 9658 300 -9358
+ Misses 1703 73 -1630
☔ View full report in Codecov by Sentry.
Benchmarks
Benchmark execution time: 2024-06-03 14:15:02
Comparing candidate commit f9b981d in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 259 metrics, 7 unstable metrics.
LGTM from the ml-obs team, but open to the Node.js team's thoughts regarding the performance implications of try/catch for tiktoken.
@@ -15,6 +15,14 @@ const RE_TAB = /\t/g
// TODO: In the future we should refactor config.js to make it requirable
let MAX_TEXT_LEN = 128

let encodingForModel
try {
APM folks - let me know if you have thoughts on this implementation. Explained more in the PR description, but TL;DR: in the Python tracer, we do something similar where we try/catch importing this `tiktoken` library. I included it here for consistency, and from a customer ask. I'd prefer to have it as an option, but didn't want to explicitly make it an optional dependency for dd-trace-js, so if there are any objections/other ways of doing this, happy to implement!
This is totally fine.
…ons when not provided (#4366)
* usage is included in the body
* completion tokens
* capture usage from chunk
* impl
* add tests
* simplify count tokens
* cleanup, comments
What does this PR do?
Adds support for tagging token metrics and submitting them as metrics when they are not provided. This only happens for `completions` and `chat.completions` in the case of streamed responses.

This is a bit of a gray area: although the Node.js OpenAI SDK does not provide these usage counts at all for streamed responses, the OpenAI API itself does, through its `stream_options` request option. In the Node.js SDK, this looks like:
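A minimal sketch of what this can look like with the SDK (model, prompt, and the usage handling are illustrative; `stream_options` is simply forwarded in the request body):

```js
const OpenAI = require('openai')

const openai = new OpenAI() // reads OPENAI_API_KEY from the environment

async function main () {
  const stream = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: 'Hello!' }],
    stream: true,
    // Not formally typed by the SDK at the time of this PR, but passed through
    // to the API; the final streamed chunk then carries a `usage` object.
    stream_options: { include_usage: true }
  })

  for await (const chunk of stream) {
    if (chunk.usage) {
      // { prompt_tokens, completion_tokens, total_tokens }
      console.log(chunk.usage)
    }
  }
}

main()
```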
While `stream_options` is not yet a supported option in the Node.js SDK, it is passed through as a part of the request body to the OpenAI API and is reflected in the returned response for streamed cases, where the last chunk includes the usage.

However, in case users do not want to specify this option throughout their code, or are uneasy about using an unsupported option for the SDK, this change also introduces computing these token metrics client-side. This is done either by:

* Using `tiktoken` if it is installed, a popular library for token estimation for a variety of models. This is done by trying to require it in a `try/catch` (sketched below this list), something I don't think is common for the Node.js tracer, but is done in the Python tracer and is implemented here as well for consistency in experience.
* If `tiktoken` isn't installed, or if there's an issue with using `tiktoken`, some basic estimations are used. This, again, follows the standard set in the Python tracer and does not seem to have a noticeable performance impact (at least from tests, but I will try to add some benchmarks).
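A rough sketch of that optional-dependency pattern, assuming illustrative names and a simple characters-per-token fallback heuristic (the actual implementation in the tracer may differ):

```js
// Illustrative sketch only: load tiktoken if the application has it installed,
// otherwise fall back to a rough client-side estimate.
let encodingForModel
try {
  encodingForModel = require('tiktoken').encoding_for_model
} catch {
  encodingForModel = undefined
}

function countTokens (content, model) {
  if (encodingForModel) {
    try {
      const encoder = encodingForModel(model)
      const count = encoder.encode(content).length
      encoder.free()
      return count
    } catch {
      // Unknown model or encoding failure: fall through to the estimate below.
    }
  }
  // Very rough fallback, assuming ~4 characters per token on average.
  return Math.ceil(content.length / 4)
}
```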
Motivation
Expand support for OpenAI integration.
Plugin Checklist
Additional Notes for Reviewers
A big chunk of the LOC is because of test fixtures. Additionally, let me know if there are any concerns with optionally requiring `tiktoken` - I can try an approach that doesn't rely on a `try/catch` block, but instead checks for the dependency in `node_modules` directly (illustrated below). However, I feel it is important to try and leverage `tiktoken` for the sake of consistency between tracers, and for more accurate token counts in the case that they aren't on the returned chunks.
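For illustration, the `node_modules` check mentioned above could look roughly like this (the path resolution from the application's working directory is an assumption, not something the PR implements):

```js
const fs = require('fs')
const path = require('path')

// Hypothetical alternative: probe for the package folder on disk instead of
// wrapping the require of tiktoken itself in a try/catch.
const tiktokenDir = path.join(process.cwd(), 'node_modules', 'tiktoken')
const encodingForModel = fs.existsSync(tiktokenDir)
  ? require('tiktoken').encoding_for_model
  : undefined
```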