@hustxiayang

Description

Users found that simply reading the "usage" field does not work for streaming responses from Gemini models.

This is because OpenAI models send usage in a separate chunk. For example, here is the tail of a streamed response from gpt-4o:

...

chunk=ChatCompletionChunk(id='chatcmpl-CYv6DPkWfT1xrsS2ySOoRztQKnZDg', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None, content_filter_result={'error': {'code': 'content_filter_error', 'message': 'The contents are not filtered'}})], created=1762438677, model='azure.gpt-4o', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_4a331a0222', usage=None, obfuscation='2xP')


chunk=ChatCompletionChunk(id='chatcmpl-CYv6DPkWfT1xrsS2ySOoRztQKnZDg', choices=[], created=1762438677, model='azure.gpt-4o', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_4a331a0222', usage=CompletionUsage(completion_tokens=17, prompt_tokens=12, total_tokens=29, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)), obfuscation='xY40HsJr')

A chunk carrying the finish_reason arrives first, followed by a separate usage chunk whose choices list is empty.

This PR therefore makes the Gemini streaming translation compatible with OpenAI. (The Anthropic translation is already compatible.)
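To illustrate the intended chunk layout, here is a minimal Python sketch. It is not the code in this PR (the actual change is in the Go translator). `Chunk`, `split_usage`, and `read_usage` are hypothetical names, and the sketch assumes the incompatible case is a single chunk that carries both choices and usage, which the description above does not spell out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Chunk:
    """Hypothetical, simplified stand-in for a streamed chat-completion chunk."""
    choices: list = field(default_factory=list)
    usage: Optional[dict] = None


def split_usage(chunk: Chunk) -> List[Chunk]:
    """Split a chunk that carries both choices and usage into two chunks.

    The first keeps the choices (including finish_reason) with usage unset;
    the second has empty choices and carries only the usage, mirroring the
    gpt-4o layout shown above.
    """
    if chunk.usage is None or not chunk.choices:
        return [chunk]  # nothing to split
    return [
        Chunk(choices=chunk.choices, usage=None),
        Chunk(choices=[], usage=chunk.usage),
    ]


def read_usage(stream) -> Optional[dict]:
    """Client-side pattern: take usage from the chunk whose choices are empty."""
    for chunk in stream:
        if not chunk.choices and chunk.usage is not None:
            return chunk.usage
    return None
```

A client written like `read_usage` finds nothing when usage rides on a chunk that still has choices, and finds it once the chunk has been split.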

@hustxiayang hustxiayang requested a review from a team as a code owner November 6, 2025 14:21
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Nov 6, 2025
hustxiayang and others added 5 commits November 6, 2025 09:22
Signed-off-by: yxia216 <yxia216@bloomberg.net>
…envoyproxy#1494)

**Description**

-  ai gateway mutating webhook should default failurePolicy to Fail

**Related Issues/PRs (if applicable)**

fixes: envoyproxy#1493

**Special notes for reviewers (if applicable)**

Signed-off-by: googs1025 <googs1025@gmail.com>
Signed-off-by: yxia216 <yxia216@bloomberg.net>
**Description**
fix: envoyproxy#1485

**Related Issues/PRs (if applicable)**

**Special notes for reviewers (if applicable)**

Signed-off-by: googs1025 <googs1025@gmail.com>
Signed-off-by: yxia216 <yxia216@bloomberg.net>
… a tool call (envoyproxy#1486)

**Description**

Finish reason should be tool calls if the model returns a tool call
response. In vertex api, there is no tool call finish reason, thus need
a work around to make it compatible.

---------

Signed-off-by: yxia216 <yxia216@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: yxia216 <yxia216@bloomberg.net>
…oxy#1491)

**Description**

This decouples backendauth & headermutator packages from extproc
specifics. As we are looking to migrate to dynamic modules, this is a
necessary refactoring work to make the code as reusable as possible.

**Related Issues/PRs (if applicable)**

Preliminary for envoyproxy#90

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: yxia216 <yxia216@bloomberg.net>
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Nov 6, 2025
@hustxiayang hustxiayang changed the title fix: make usage chunk similar to Openai. fix: make usage chunk in stream mode of gemini compatible with openai Nov 6, 2025
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Nov 6, 2025
@codecov-commenter

codecov-commenter commented Nov 6, 2025

Codecov Report

❌ Patch coverage is 80.95238% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.89%. Comparing base (a0e4c0e) to head (66eaa8c).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/extproc/translator/openai_gcpvertexai.go | 80.95% | 2 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1503      +/-   ##
==========================================
- Coverage   83.91%   83.89%   -0.03%     
==========================================
  Files         144      144              
  Lines       12659    12670      +11     
==========================================
+ Hits        10623    10629       +6     
- Misses       1419     1423       +4     
- Partials      617      618       +1     


Signed-off-by: yxia216 <yxia216@bloomberg.net>
@hustxiayang

/retest

@nacx

nacx commented Nov 11, 2025

@yuzisun can you take a look at this one?


Labels

size:M This PR changes 30-99 lines, ignoring generated files.
