
Conversation

Sameerlite
Collaborator

Title

Add Vertex AI Live API WebSocket Passthrough with Cost Tracking

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory; adding at least 1 test is a hard requirement - see details
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🆕 New Feature

Changes

Overview

This PR adds support for Vertex AI Live API WebSocket passthrough with cost tracking and logging. The implementation includes a dedicated logging handler, WebSocket passthrough routing, and test coverage.

Key Features Added

1. Vertex AI Live Passthrough Logging Handler (vertex_ai_live_passthrough_logging_handler.py)

  • Usage Metadata Extraction: Aggregates token usage across multiple WebSocket messages (see the sketch after this list)
  • Multimodal Support: Handles TEXT, AUDIO, and VIDEO token tracking with separate cost calculations
  • Web Search Integration: Tracks tool-use tokens for web search functionality
  • Cost Calculation: Applies per-modality rates and different pricing models
  • Error Handling: Gracefully handles invalid inputs and missing data
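
As a rough illustration of the aggregation step, here is a minimal sketch assuming messages carry the Gemini-style usageMetadata shape (promptTokenCount, candidatesTokenCount, promptTokensDetails); the function name and field handling are illustrative, not the handler's literal code:

# Illustrative sketch: aggregate usage metadata across Live API WebSocket
# messages. Field names assume the Gemini usageMetadata shape.
from typing import Any, Dict, List

def aggregate_usage_metadata(messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    totals: Dict[str, Any] = {
        "prompt_token_count": 0,
        "candidates_token_count": 0,
        "modality_token_counts": {},  # e.g. {"TEXT": 120, "AUDIO": 480}
    }
    for message in messages:
        usage = message.get("usageMetadata") or {}
        totals["prompt_token_count"] += usage.get("promptTokenCount", 0)
        totals["candidates_token_count"] += usage.get("candidatesTokenCount", 0)
        # Per-modality counts drive the separate TEXT/AUDIO/VIDEO pricing.
        for detail in usage.get("promptTokensDetails", []):
            modality = detail.get("modality", "TEXT")
            counts = totals["modality_token_counts"]
            counts[modality] = counts.get(modality, 0) + detail.get("tokenCount", 0)
    return totals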

2. WebSocket Passthrough Integration (llm_passthrough_endpoints.py)

  • Route Registration: Added the /vertex_ai/live WebSocket endpoint (see the sketch after this list)
  • Handler Integration: Plugs into the existing passthrough infrastructure
  • Credential Management: Resolves Vertex AI credentials for WebSocket connections
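
For orientation, a sketch of what the route registration could look like in FastAPI (the framework LiteLLM's proxy uses); the handler body is a placeholder, not the code in llm_passthrough_endpoints.py:

# Illustrative sketch of the WebSocket route registration; the real
# endpoint also wires in auth, credential resolution, and logging.
from fastapi import APIRouter, WebSocket

router = APIRouter()

@router.websocket("/vertex_ai/live")
async def vertex_ai_live_passthrough(websocket: WebSocket):
    await websocket.accept()
    # Resolve Vertex AI credentials, connect to the upstream Live API,
    # then relay frames in both directions (see section 4 below).
    ...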

3. Success Handler Integration (success_handler.py)

  • Route Detection: Added an is_vertex_ai_live_route() method for route identification (see the sketch after this list)
  • WebSocket Message Processing: Converts WebSocket message formats for logging
  • Logging Integration: Hooks into the existing logging infrastructure
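
The route check itself is presumably a simple path match; a hypothetical sketch:

# Hypothetical shape of the route check added to success_handler.py.
def is_vertex_ai_live_route(url_route: str) -> bool:
    # Live passthrough traffic arrives on the dedicated WebSocket path.
    return "/vertex_ai/live" in url_route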

4. WebSocket Passthrough Infrastructure (pass_through_endpoints.py)

  • WebSocket Support: Extends passthrough functionality to WebSocket connections (see the sketch after this list)
  • Message Processing: Processes WebSocket messages in real time
  • Error Handling: Handles connection and message errors on WebSocket streams
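
As a sketch of the relay idea, assuming the third-party websockets package for the upstream side (LiteLLM's actual client and helper names may differ), with captured messages later fed to the logging handler:

# Illustrative bidirectional relay between the client WebSocket and the
# upstream Live API WebSocket; upstream_url and the capture list are
# assumptions for this sketch.
import asyncio

import websockets
from fastapi import WebSocket

async def relay(client_ws: WebSocket, upstream_url: str, captured: list):
    async with websockets.connect(upstream_url) as upstream:
        async def client_to_upstream():
            while True:
                msg = await client_ws.receive_text()
                await upstream.send(msg)

        async def upstream_to_client():
            async for msg in upstream:
                captured.append(msg)  # retained for cost tracking/logging
                await client_ws.send_text(msg)

        # Run both directions until one side closes the connection.
        await asyncio.gather(client_to_upstream(), upstream_to_client())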

This implementation provides a foundation for Vertex AI Live API WebSocket passthrough with cost tracking, letting users integrate real-time AI capabilities while retaining full visibility into usage and costs.
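
A hypothetical client example against the new endpoint; the proxy host/port and the setup payload are assumptions (the setup message mirrors the Live API's BidiGenerateContent setup shape), while the path comes from this PR:

# Hypothetical usage example for the /vertex_ai/live passthrough.
import asyncio
import json

import websockets

async def main():
    uri = "ws://localhost:4000/vertex_ai/live"  # assumed proxy host/port
    async with websockets.connect(uri) as ws:
        # Live sessions begin with a setup message naming the model.
        await ws.send(json.dumps(
            {"setup": {"model": "gemini-2.0-flash-live-preview-04-09"}}
        ))
        print(await ws.recv())

asyncio.run(main())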


vercel bot commented Sep 26, 2025

The latest updates on your projects.

Project   Deployment   Preview   Comments   Updated (UTC)
litellm   Ready        Preview   Comment    Sep 26, 2025 8:39pm

verbose_proxy_logger.debug(
    f"Vertex AI Live API model info for '{model}': {model_info}"
)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information (High)

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

The best way to fix this issue is to avoid logging sensitive or untrusted information altogether. Specifically:

  • Scrub or redact sensitive fields from model_info before logging. Do not log the full dictionary unless every field is verified to be safe for logging.
  • For debug statements, log only the minimal information needed for troubleshooting—such as model name and key pricing info—while redacting or omitting user secrets, passwords, API keys, or metadata that might be sensitive.
  • Replace the offending debug log (line 158) with one that either omits the sensitive parts or prints only the safe subset of information, ideally using a whitelist of known safe fields.
  • Ensure the fix is localised: Only modify the snippet in vertex_ai_live_passthrough_logging_handler.py, and only the specific debug line(s), leaving the rest of the business logic untouched.
  • You may introduce helper methods to redact sensitive keys in the dict if needed, provided they’re standard and not reliant on external packages.

Suggested changeset 1
litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
--- a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
+++ b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
@@ -154,8 +154,11 @@
                 model=model, custom_llm_provider=custom_llm_provider
             )
 
+            # Avoid logging full model_info as it may contain sensitive info.
+            safe_fields = ["input_cost_per_token", "output_cost_per_token", "model"]
+            safe_model_info = {k: model_info.get(k) for k in safe_fields if k in model_info}
             verbose_proxy_logger.debug(
-                f"Vertex AI Live API model info for '{model}': {model_info}"
+                f"Vertex AI Live API model info for '{model}': {safe_model_info}"
             )
 
             # Check if pricing info is available
EOF

# Check if pricing info is available
if not model_info or not model_info.get("input_cost_per_token"):
    verbose_proxy_logger.error(
        f"No pricing info found for {model} in local model pricing database"
    )

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information (High)

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

To fix the problem, we must ensure that any log line that writes the value of model (which is ultimately derived from untrusted user input via API request metadata) does not emit it in cleartext unless it is guaranteed to be safe. The best approach here is to validate that the model string matches a well-known pattern for safe model names (e.g., only allowing alphanumeric characters, dashes, underscores, dots, and colons), and to redact the logged value otherwise.

The log line in question is:

verbose_proxy_logger.error(
    f"No pricing info found for {model} in local model pricing database"
)

We should add a sanitizing step here similar to what is already present for debug logging in the same file (see lines 376-378): use a regex to check if model matches allowed patterns, and otherwise log [REDACTED].

Specifically, in litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py:

  • In _calculate_live_api_cost, before the error logging on line 164, define a safe variable safe_model (using the same regex and logic as in line 377-378)
  • Replace {model} in the error log with {safe_model}

No additional dependencies are required; use the standard re library (as already imported elsewhere in the same file).


Suggested changeset 1
litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
--- a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
+++ b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
@@ -160,8 +160,11 @@
 
             # Check if pricing info is available
             if not model_info or not model_info.get("input_cost_per_token"):
+                import re
+                allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
+                safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
                 verbose_proxy_logger.error(
-                    f"No pricing info found for {model} in local model pricing database"
+                    f"No pricing info found for {safe_model} in local model pricing database"
                 )
                 return 0.0
 
EOF
Comment on lines +237 to +240
f"Vertex AI Live API cost calculation - Model: {model}, "
f"Prompt tokens: {prompt_token_count}, "
f"Candidate tokens: {candidates_token_count}, "
f"Total cost: ${total_cost:.6f}"

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information (High)

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

To address this, we must ensure that any user-provided data, especially fields like model that could be derived from API keys or user metadata, are not logged directly, or are logged only after ensuring they don't contain sensitive information.

The mechanism is:

  • Redact, filter, or validate any potentially sensitive user-supplied data before logging it.
  • Replace the problematic log statement so that the value logged for model is either strictly validated (against a whitelist/regex of safe values) or replaced with a constant, such as [REDACTED], if it does not match the safe pattern.
  • Implement this fix directly where model is logged, specifically at line 237.
  • If similar logging happens elsewhere (especially elsewhere in this file, or related files, involving model or other tainted values), apply the same validation/redaction logic.

In this case, the best approach is to:

  • Before logging, ensure model only contains non-sensitive, validated data (e.g., matches a pattern for known model names).
  • Use a regex pattern (e.g., allowing only alphanumeric, dashes, underscores, periods, and colons).
  • Replace the value with [REDACTED] if the validation fails.
  • This approach ensures logs are still informative when safe, but never expose sensitive data.

Suggested changeset 1
litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
--- a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
+++ b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
@@ -233,8 +233,12 @@
                     # Fallback to token-based pricing for tool use
                     total_cost += tool_use_prompt_token_count * input_cost_per_token
 
+            # Safely log the model name: only allow known safe formats, redact otherwise.
+            import re
+            allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
+            safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
             verbose_proxy_logger.debug(
-                f"Vertex AI Live API cost calculation - Model: {model}, "
+                f"Vertex AI Live API cost calculation - Model: {safe_model}, "
                 f"Prompt tokens: {prompt_token_count}, "
                 f"Candidate tokens: {candidates_token_count}, "
                 f"Total cost: ${total_cost:.6f}"
EOF

model = kwargs.get("model", "gemini-2.0-flash-live-preview-04-09")
custom_llm_provider = kwargs.get("custom_llm_provider", "vertex_ai")
verbose_proxy_logger.debug(
    f"Vertex AI Live API model: {model}, custom_llm_provider: {custom_llm_provider}"
)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information (High)

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

Sensitive information should never be logged, even if a variable typically contains a non-sensitive value. To fix the problem, update the logging on line 330 to use a sanitized or redacted value for each potentially sensitive field.

Specifically:

  • For model and custom_llm_provider, apply the same safety checks as used later on (the regex pattern to ensure they match only known-safe formats; otherwise, redact them).
  • Move the safety logic (the regex check and redaction) above any logging that refers to these values.
  • Update the log line on line 330 to use safe_model and safe_custom_llm_provider, not the original, possibly tainted values.

No new imports are required beyond import re (already shown in the snippet for the safety logging); just ensure that all log entries referring to potentially tainted/sensitive data use the sanitized version.

Only edit the shown region in litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py.


Suggested changeset 1
litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
--- a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
+++ b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
@@ -326,8 +326,17 @@
             # Extract model from request body or kwargs
             model = kwargs.get("model", "gemini-2.0-flash-live-preview-04-09")
             custom_llm_provider = kwargs.get("custom_llm_provider", "vertex_ai")
+            # Apply safety checks to avoid cleartext logging of sensitive data
+            import re
+            allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
+            safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
+            safe_custom_llm_provider = (
+                custom_llm_provider
+                if isinstance(custom_llm_provider, str) and allowed_pattern.match(custom_llm_provider)
+                else "[REDACTED]"
+            )
             verbose_proxy_logger.debug(
-                f"Vertex AI Live API model: {model}, custom_llm_provider: {custom_llm_provider}"
+                f"Vertex AI Live API model: {safe_model}, custom_llm_provider: {safe_custom_llm_provider}"
             )
 
             # Extract usage metadata from WebSocket messages
EOF
… sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Comment on lines +380 to +383
f"Vertex AI Live API passthrough cost tracking - "
f"Model: {safe_model}, Cost: ${response_cost:.6f}, "
f"Prompt tokens: {usage.prompt_tokens}, "
f"Completion tokens: {usage.completion_tokens}"

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information (High)

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

To fix this issue, we must ensure that no sensitive information from user API keys (or any other secrets) ends up logged in clear text. To do so, sanitize the logging further:

  • Only log model values if they are known safe (as system identifiers, not user-provided secrets).
  • Otherwise, always redact them, even if matching the regex, unless a whitelist of known safe model names is used.
  • Additionally, review the flow: kwargs can be polluted from higher up, so restrict what model can be logged.
  • Prefer conservative handling: Even if a string matches the allowed pattern, if it comes from a potentially sensitive metadata field, redact.

Therefore, modify the logging block (lines 375–384) in vertex_ai_live_passthrough_logging_handler.py:

  • Add a check to ensure that the model is only logged if it is from a well-known, safe set of model names, such as those found in a maintained set/list.
  • If not, log as [REDACTED].
  • Do not log any other fields from user-supplied metadata.
  • Do not log contents of kwargs, request_body, or similar fields.

No extra imports are needed beyond standard library usage. Ensure only system-controlled model names are ever logged.

Suggested changeset 1
litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
--- a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
+++ b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
@@ -372,10 +372,12 @@
             kwargs["model"] = model
             kwargs["custom_llm_provider"] = custom_llm_provider
 
-            # Safely log the model name: only allow known safe formats, redact otherwise.
-            import re
-            allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
-            safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
+            # Only log model names from a known-safe whitelist; redact otherwise.
+            SAFE_MODEL_WHITELIST = {
+                "gemini-2.0-flash-live-preview-04-09",
+                # Add other approved model names here as needed
+            }
+            safe_model = model if isinstance(model, str) and model in SAFE_MODEL_WHITELIST else "[REDACTED]"
             verbose_proxy_logger.debug(
                 f"Vertex AI Live API passthrough cost tracking - "
                 f"Model: {safe_model}, Cost: ${response_cost:.6f}, "
EOF
@krrishdholakia
Contributor

@Sameerlite is this ready to merge?
