fix: make usage chunk in stream mode of gemini compatible with openai #1503

hustxiayang · 2025-11-06T14:21:05Z

Description

Users found that simply reading the "usage" information does not work for streaming responses from Gemini models.

This is because OpenAI models send usage in a separate chunk. For example, here is the end of a streamed response from gpt-4o:

...

chunk=ChatCompletionChunk(id='chatcmpl-CYv6DPkWfT1xrsS2ySOoRztQKnZDg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None, content_filter_result={'error': {'code': 'content_filter_error', 'message': 'The contents are not filtered'}})], created=1762438677, model='azure.gpt-4o', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_4a331a0222', usage=None, obfuscation='2xP')


chunk=ChatCompletionChunk(id='chatcmpl-CYv6DPkWfT1xrsS2ySOoRztQKnZDg', choices=[], created=1762438677, model='azure.gpt-4o', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_4a331a0222', usage=CompletionUsage(completion_tokens=17, prompt_tokens=12, total_tokens=29, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)), obfuscation='xY40HsJr')

The finish_reason chunk arrives first, followed by a separate usage chunk whose choices list is empty.
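
As a rough illustration of why the chunk layout matters to clients, here is a minimal sketch of a consumer loop written against that OpenAI-style ordering. It uses plain dicts instead of the real client objects, and `collect_stream` is a hypothetical helper name, not part of any library or of this PR.

```python
from typing import Iterable, Optional, Tuple


def collect_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[str], Optional[dict]]:
    """Return (text, finish_reason, usage) from OpenAI-style stream chunks.

    Hypothetical helper: chunks are plain dicts shaped like the
    ChatCompletionChunk objects shown above.
    """
    text_parts = []
    finish_reason = None
    usage = None
    for chunk in chunks:
        # The usage chunk has an empty choices list, so iterate instead of indexing [0].
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                text_parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
        # Usage is only present on the final, choices-free chunk.
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
    return "".join(text_parts), finish_reason, usage


# Illustrative data modeled on the gpt-4o chunks above (token counts copied from them).
stream = [
    {"choices": [{"index": 0, "delta": {"content": "Hello"}, "finish_reason": None}], "usage": None},
    {"choices": [{"index": 0, "delta": {"content": None}, "finish_reason": "stop"}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 12, "completion_tokens": 17, "total_tokens": 29}},
]
text, reason, usage = collect_stream(stream)
```

A client that instead expects usage only on a chunk with no choices would miss it if the provider attached usage elsewhere, which is the kind of mismatch described here.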

This PR therefore makes the Gemini streaming translation compatible with OpenAI's behavior. (The Anthropic translation is already compatible.)
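
One way to picture the intended shape is the sketch below. It is not the code in this PR (the actual change is in Go, in internal/extproc/translator/openai_gcpvertexai.go). It assumes, purely for illustration, that a translated chunk may carry both a choice and usage at once, and `split_usage_chunk` is a hypothetical name.

```python
import copy
from typing import List


def split_usage_chunk(chunk: dict) -> List[dict]:
    """Split a chunk carrying both choices and usage into two OpenAI-style chunks.

    Hypothetical helper for illustration only; chunks are plain dicts.
    """
    # Nothing to split when usage is absent or the chunk is already choices-free.
    if chunk.get("usage") is None or not chunk.get("choices"):
        return [chunk]
    # First chunk: keeps the choices (including finish_reason), drops usage.
    content_chunk = copy.deepcopy(chunk)
    content_chunk["usage"] = None
    # Second chunk: keeps usage, with an empty choices list as OpenAI sends it.
    usage_chunk = copy.deepcopy(chunk)
    usage_chunk["choices"] = []
    return [content_chunk, usage_chunk]


# Assumed input shape: finish_reason and usage arriving on the same chunk.
combined = {
    "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 12, "completion_tokens": 17, "total_tokens": 29},
}
parts = split_usage_chunk(combined)
```

Deep copies keep the two output chunks independent, so later mutation of one does not affect the other or the input.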

Signed-off-by: yxia216 <yxia216@bloomberg.net>

…envoyproxy#1494) **Description** - ai gateway mutating webhook should default failurePolicy to Fail **Related Issues/PRs (if applicable)** fixes: envoyproxy#1493 **Special notes for reviewers (if applicable)** Signed-off-by: googs1025 <googs1025@gmail.com> Signed-off-by: yxia216 <yxia216@bloomberg.net>


… a tool call (envoyproxy#1486) **Description** Finish reason should be tool calls if the model returns a tool call response. In vertex api, there is no tool call finish reason, thus need a work around to make it compatible. --------- Signed-off-by: yxia216 <yxia216@bloomberg.net> Co-authored-by: Dan Sun <dsun20@bloomberg.net> Signed-off-by: yxia216 <yxia216@bloomberg.net>

…oxy#1491) **Description** This decouples backendauth & headermutator packages from extproc specifics. As we are looking to migrate to dynamic modules, this is a necessary refactoring work to make the code as reusable as possible. **Related Issues/PRs (if applicable)** Preliminary for envoyproxy#90 --------- Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com> Signed-off-by: yxia216 <yxia216@bloomberg.net>

codecov-commenter · 2025-11-06T14:32:30Z

Codecov Report

❌ Patch coverage is 80.95238% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.89%. Comparing base (a0e4c0e) to head (66eaa8c).

Files with missing lines	Patch %	Lines
internal/extproc/translator/openai_gcpvertexai.go	80.95%	2 Missing and 2 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1503      +/-   ##
==========================================
- Coverage   83.91%   83.89%   -0.03%     
==========================================
  Files         144      144              
  Lines       12659    12670      +11     
==========================================
+ Hits        10623    10629       +6     
- Misses       1419     1423       +4     
- Partials      617      618       +1


Signed-off-by: yxia216 <yxia216@bloomberg.net>

hustxiayang · 2025-11-06T15:43:43Z

/retest

nacx · 2025-11-11T15:40:52Z

@yuzisun can you take a look at this one?

hustxiayang requested a review from a team as a code owner November 6, 2025 14:21

dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Nov 6, 2025

hustxiayang and others added 5 commits November 6, 2025 09:22

fix-usage-chunk

85d9cc3

Signed-off-by: yxia216 <yxia216@bloomberg.net>

docs: fix example in examples/inference-pool (envoyproxy#1488)

a240544

**Description** fix: envoyproxy#1485 **Related Issues/PRs (if applicable)** **Special notes for reviewers (if applicable)** Signed-off-by: googs1025 <googs1025@gmail.com> Signed-off-by: yxia216 <yxia216@bloomberg.net>

hustxiayang force-pushed the gemini-usage branch from 3fc0bf1 to 35bff4c Compare November 6, 2025 14:22

dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Nov 6, 2025

hustxiayang changed the title ~~fix: make usage chunk similar to Openai.~~ fix: make usage chunk in stream mode of gemini compatible with openai Nov 6, 2025

Merge branch 'main' into gemini-usage

d9e5df7

dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Nov 6, 2025

fix-tests

66eaa8c

Signed-off-by: yxia216 <yxia216@bloomberg.net>
