
Conversation

Sameerlite
Collaborator

Title

Add Vertex AI Live API WebSocket Passthrough with Cost Tracking

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory; adding at least 1 test is a hard requirement - see details
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🆕 New Feature

Changes

Overview

This PR adds support for Vertex AI Live API WebSocket passthrough with cost tracking and logging. The implementation includes a dedicated logging handler, WebSocket passthrough routing, and test coverage.

Key Features Added

1. Vertex AI Live Passthrough Logging Handler (vertex_ai_live_passthrough_logging_handler.py)

  • Usage Metadata Extraction: Aggregates token usage across multiple WebSocket messages (see the sketch after this list)
  • Multimodal Support: Handles TEXT, AUDIO, and VIDEO token tracking with separate cost calculations
  • Web Search Integration: Tracks tool-use tokens for web search functionality
  • Cost Calculation: Applies per-modality rates and different pricing models
  • Error Handling: Gracefully handles invalid inputs and missing data
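
As a rough illustration of the aggregation step, here is a minimal sketch assuming messages carry the Gemini-style usageMetadata shape (promptTokenCount, candidatesTokenCount, promptTokensDetails); the function name and field handling are illustrative, not the handler's literal code:

# Illustrative sketch: aggregate usage metadata across Live API WebSocket
# messages. Field names assume the Gemini usageMetadata shape.
from typing import Any, Dict, List

def aggregate_usage_metadata(messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    totals: Dict[str, Any] = {
        "prompt_token_count": 0,
        "candidates_token_count": 0,
        "modality_token_counts": {},  # e.g. {"TEXT": 120, "AUDIO": 480}
    }
    for message in messages:
        usage = message.get("usageMetadata") or {}
        totals["prompt_token_count"] += usage.get("promptTokenCount", 0)
        totals["candidates_token_count"] += usage.get("candidatesTokenCount", 0)
        # Per-modality counts drive the separate TEXT/AUDIO/VIDEO pricing.
        for detail in usage.get("promptTokensDetails", []):
            modality = detail.get("modality", "TEXT")
            counts = totals["modality_token_counts"]
            counts[modality] = counts.get(modality, 0) + detail.get("tokenCount", 0)
    return totals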

2. WebSocket Passthrough Integration (llm_passthrough_endpoints.py)

  • Route Registration: Added the /vertex_ai/live WebSocket endpoint (see the sketch after this list)
  • Handler Integration: Plugs into the existing passthrough infrastructure
  • Credential Management: Resolves Vertex AI credentials for WebSocket connections
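
For orientation, a sketch of what the route registration could look like in FastAPI (the framework LiteLLM's proxy uses); the handler body is a placeholder, not the code in llm_passthrough_endpoints.py:

# Illustrative sketch of the WebSocket route registration; the real
# endpoint also wires in auth, credential resolution, and logging.
from fastapi import APIRouter, WebSocket

router = APIRouter()

@router.websocket("/vertex_ai/live")
async def vertex_ai_live_passthrough(websocket: WebSocket):
    await websocket.accept()
    # Resolve Vertex AI credentials, connect to the upstream Live API,
    # then relay frames in both directions (see section 4 below).
    ...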

3. Success Handler Integration (success_handler.py)

  • Route Detection: Added an is_vertex_ai_live_route() method for route identification (see the sketch after this list)
  • WebSocket Message Processing: Converts WebSocket message formats for logging
  • Logging Integration: Hooks into the existing logging infrastructure
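
The route check itself is presumably a simple path match; a hypothetical sketch:

# Hypothetical shape of the route check added to success_handler.py.
def is_vertex_ai_live_route(url_route: str) -> bool:
    # Live passthrough traffic arrives on the dedicated WebSocket path.
    return "/vertex_ai/live" in url_route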

4. WebSocket Passthrough Infrastructure (pass_through_endpoints.py)

  • WebSocket Support: Extends passthrough functionality to WebSocket connections (see the sketch after this list)
  • Message Processing: Processes WebSocket messages in real time
  • Error Handling: Handles connection and message errors on WebSocket streams
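
As a sketch of the relay idea, assuming the third-party websockets package for the upstream side (LiteLLM's actual client and helper names may differ), with captured messages later fed to the logging handler:

# Illustrative bidirectional relay between the client WebSocket and the
# upstream Live API WebSocket; upstream_url and the capture list are
# assumptions for this sketch.
import asyncio

import websockets
from fastapi import WebSocket

async def relay(client_ws: WebSocket, upstream_url: str, captured: list):
    async with websockets.connect(upstream_url) as upstream:
        async def client_to_upstream():
            while True:
                msg = await client_ws.receive_text()
                await upstream.send(msg)

        async def upstream_to_client():
            async for msg in upstream:
                captured.append(msg)  # retained for cost tracking/logging
                await client_ws.send_text(msg)

        # Run both directions until one side closes the connection.
        await asyncio.gather(client_to_upstream(), upstream_to_client())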

This implementation provides a foundation for Vertex AI Live API WebSocket passthrough with cost tracking, letting users integrate real-time AI capabilities while retaining full visibility into usage and costs.
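
A hypothetical client example against the new endpoint; the proxy host/port and the setup payload are assumptions (the setup message mirrors the Live API's BidiGenerateContent setup shape), while the path comes from this PR:

# Hypothetical usage example for the /vertex_ai/live passthrough.
import asyncio
import json

import websockets

async def main():
    uri = "ws://localhost:4000/vertex_ai/live"  # assumed proxy host/port
    async with websockets.connect(uri) as ws:
        # Live sessions begin with a setup message naming the model.
        await ws.send(json.dumps(
            {"setup": {"model": "gemini-2.0-flash-live-preview-04-09"}}
        ))
        print(await ws.recv())

asyncio.run(main())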


vercel bot commented Sep 26, 2025

The latest updates on your projects.

Project   Deployment   Preview   Comments   Updated (UTC)
litellm   Ready        Preview   Comment    Sep 26, 2025 8:39pm

verbose_proxy_logger.debug(
    f"Vertex AI Live API model info for '{model}': {model_info}"
)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information (High)

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

The best way to fix this issue is to avoid logging sensitive or untrusted information altogether. Specifically:

  • Scrub or redact sensitive fields from model_info before logging. Do not log the full dictionary unless every field is verified to be safe for logging.
  • For debug statements, log only the minimal information needed for troubleshooting—such as model name and key pricing info—while redacting or omitting user secrets, passwords, API keys, or metadata that might be sensitive.
  • Replace the offending debug log (line 158) with one that either omits the sensitive parts or prints only the safe subset of information, ideally using a whitelist of known safe fields.
  • Ensure the fix is localised: Only modify the snippet in vertex_ai_live_passthrough_logging_handler.py, and only the specific debug line(s), leaving the rest of the business logic untouched.
  • You may introduce helper methods to redact sensitive keys in the dict if needed, provided they’re standard and not reliant on external packages.

Suggested changeset 1
litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
--- a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
+++ b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
@@ -154,8 +154,11 @@
                 model=model, custom_llm_provider=custom_llm_provider
             )
 
+            # Avoid logging full model_info as it may contain sensitive info.
+            safe_fields = ["input_cost_per_token", "output_cost_per_token", "model"]
+            safe_model_info = {k: model_info.get(k) for k in safe_fields if k in model_info}
             verbose_proxy_logger.debug(
-                f"Vertex AI Live API model info for '{model}': {model_info}"
+                f"Vertex AI Live API model info for '{model}': {safe_model_info}"
             )
 
             # Check if pricing info is available
EOF

# Check if pricing info is available
if not model_info or not model_info.get("input_cost_per_token"):
    verbose_proxy_logger.error(
        f"No pricing info found for {model} in local model pricing database"
    )

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information (High)

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

To fix the problem, we must ensure that any log line that writes the value of model (which is ultimately derived from untrusted user input via API request metadata) does not emit it in cleartext unless it is guaranteed to be safe. The best approach here is to validate that the model string matches a well-known pattern for safe model names (e.g., only allowing alphanumeric characters, dashes, underscores, dots, and colons), and to redact the logged value otherwise.

The log line in question is:

verbose_proxy_logger.error(
    f"No pricing info found for {model} in local model pricing database"
)

We should add a sanitizing step here similar to what is already present for debug logging in the same file (see lines 376-378): use a regex to check if model matches allowed patterns, and otherwise log [REDACTED].

Specifically, in litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py:

  • In _calculate_live_api_cost, before the error logging on line 164, define a safe variable safe_model (using the same regex and logic as in line 377-378)
  • Replace {model} in the error log with {safe_model}

No additional dependencies are required; use the standard re library (as already imported elsewhere in the same file).


Suggested changeset 1
litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
--- a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
+++ b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
@@ -160,8 +160,11 @@
 
             # Check if pricing info is available
             if not model_info or not model_info.get("input_cost_per_token"):
+                import re
+                allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
+                safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
                 verbose_proxy_logger.error(
-                    f"No pricing info found for {model} in local model pricing database"
+                    f"No pricing info found for {safe_model} in local model pricing database"
                 )
                 return 0.0
 
EOF
Comment on lines +237 to +240
f"Vertex AI Live API cost calculation - Model: {model}, "
f"Prompt tokens: {prompt_token_count}, "
f"Candidate tokens: {candidates_token_count}, "
f"Total cost: ${total_cost:.6f}"

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information (High)

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

To address this, we must ensure that any user-provided data, especially fields like model that could be derived from API keys or user metadata, are not logged directly, or are logged only after ensuring they don't contain sensitive information.

The mechanism is:

  • Redact, filter, or validate any potentially sensitive user-supplied data before logging it.
  • Replace the problematic log statement so that the value logged for model is either strictly validated (against a whitelist/regex of safe values) or replaced with a constant, such as [REDACTED], if it does not match the safe pattern.
  • Implement this fix directly where model is logged, specifically at line 237.
  • If similar logging happens elsewhere (especially elsewhere in this file, or related files, involving model or other tainted values), apply the same validation/redaction logic.

In this case, the best approach is to:

  • Before logging, ensure model only contains non-sensitive, validated data (e.g., matches a pattern for known model names).
  • Use a regex pattern (e.g., allowing only alphanumeric, dashes, underscores, periods, and colons).
  • Replace the value with [REDACTED] if the validation fails.
  • This approach ensures logs are still informative when safe, but never expose sensitive data.

Suggested changeset 1
litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
--- a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
+++ b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
@@ -233,8 +233,12 @@
                     # Fallback to token-based pricing for tool use
                     total_cost += tool_use_prompt_token_count * input_cost_per_token
 
+            # Safely log the model name: only allow known safe formats, redact otherwise.
+            import re
+            allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
+            safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
             verbose_proxy_logger.debug(
-                f"Vertex AI Live API cost calculation - Model: {model}, "
+                f"Vertex AI Live API cost calculation - Model: {safe_model}, "
                 f"Prompt tokens: {prompt_token_count}, "
                 f"Candidate tokens: {candidates_token_count}, "
                 f"Total cost: ${total_cost:.6f}"
EOF

model = kwargs.get("model", "gemini-2.0-flash-live-preview-04-09")
custom_llm_provider = kwargs.get("custom_llm_provider", "vertex_ai")
verbose_proxy_logger.debug(
    f"Vertex AI Live API model: {model}, custom_llm_provider: {custom_llm_provider}"
)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information (High)

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

Sensitive information should never be logged, even if a variable typically contains a non-sensitive value. To fix the problem, update the logging on line 330 to use a sanitized or redacted value for each potentially sensitive field.

Specifically:

  • For model and custom_llm_provider, apply the same safety checks as used later on (the regex pattern to ensure they match only known-safe formats; otherwise, redact them).
  • Move the safety logic (the regex check and redaction) above any logging that refers to these values.
  • Update the log line on line 330 to use safe_model and safe_custom_llm_provider, not the original, possibly tainted values.

No new imports are required beyond import re (already shown in the snippet for the safety logging); just ensure that all log entries referring to potentially tainted/sensitive data use the sanitized version.

Only edit the shown region in litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py.


Suggested changeset 1
litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
--- a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
+++ b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
@@ -326,8 +326,17 @@
             # Extract model from request body or kwargs
             model = kwargs.get("model", "gemini-2.0-flash-live-preview-04-09")
             custom_llm_provider = kwargs.get("custom_llm_provider", "vertex_ai")
+            # Apply safety checks to avoid cleartext logging of sensitive data
+            import re
+            allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
+            safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
+            safe_custom_llm_provider = (
+                custom_llm_provider
+                if isinstance(custom_llm_provider, str) and allowed_pattern.match(custom_llm_provider)
+                else "[REDACTED]"
+            )
             verbose_proxy_logger.debug(
-                f"Vertex AI Live API model: {model}, custom_llm_provider: {custom_llm_provider}"
+                f"Vertex AI Live API model: {safe_model}, custom_llm_provider: {safe_custom_llm_provider}"
             )
 
             # Extract usage metadata from WebSocket messages
EOF
… sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Comment on lines +380 to +383
f"Vertex AI Live API passthrough cost tracking - "
f"Model: {safe_model}, Cost: ${response_cost:.6f}, "
f"Prompt tokens: {usage.prompt_tokens}, "
f"Completion tokens: {usage.completion_tokens}"

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information (High)

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

To fix this issue, we must ensure that no sensitive information from user API keys (or any other secrets) ends up logged in clear text. To do so, sanitize the logging further:

  • Only log model values if they are known safe (as system identifiers, not user-provided secrets).
  • Otherwise, always redact them, even if matching the regex, unless a whitelist of known safe model names is used.
  • Additionally, review the flow: kwargs can be polluted from higher up, so restrict what model can be logged.
  • Prefer conservative handling: Even if a string matches the allowed pattern, if it comes from a potentially sensitive metadata field, redact.

Therefore, modify the logging block (lines 375–384) in vertex_ai_live_passthrough_logging_handler.py:

  • Add a check to ensure that the model is only logged if it is from a well-known, safe set of model names, such as those found in a maintained set/list.
  • If not, log as [REDACTED].
  • Do not log any other fields from user-supplied metadata.
  • Do not log contents of kwargs, request_body, or similar fields.

No extra imports are needed beyond standard library usage. Ensure only system-controlled model names are ever logged.

Suggested changeset 1
litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
--- a/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
+++ b/litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
@@ -372,10 +372,12 @@
             kwargs["model"] = model
             kwargs["custom_llm_provider"] = custom_llm_provider
 
-            # Safely log the model name: only allow known safe formats, redact otherwise.
-            import re
-            allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
-            safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
+            # Only log model names from a known-safe whitelist; redact otherwise.
+            SAFE_MODEL_WHITELIST = {
+                "gemini-2.0-flash-live-preview-04-09",
+                # Add other approved model names here as needed
+            }
+            safe_model = model if isinstance(model, str) and model in SAFE_MODEL_WHITELIST else "[REDACTED]"
             verbose_proxy_logger.debug(
                 f"Vertex AI Live API passthrough cost tracking - "
                 f"Model: {safe_model}, Cost: ${response_cost:.6f}, "
EOF
@krrishdholakia
Contributor

@Sameerlite is this ready to merge?
