(Feat) Add Vertex AI Live API WebSocket Passthrough with Cost Tracking #14956
base: main
Conversation
```python
verbose_proxy_logger.debug(
    f"Vertex AI Live API model info for '{model}': {model_info}"
)
```
Check failure: Code scanning / CodeQL
Clear-text logging of sensitive information (High): sensitive data (password)
Copilot Autofix (AI, 1 day ago)
The best way to fix this issue is to prevent logging sensitive or untrusted information in its entirety. Specifically:

- Scrub or redact sensitive fields from `model_info` before logging. Do not log the full dictionary unless every field is verified to be safe for logging.
- For debug statements, log only the minimal information needed for troubleshooting, such as model name and key pricing info, while redacting or omitting user secrets, passwords, API keys, or metadata that might be sensitive.
- Replace the offending debug log (line 158) with one that either omits the sensitive parts or prints only the safe subset of information, ideally using a whitelist of known safe fields.
- Ensure the fix is localised: only modify the snippet in `vertex_ai_live_passthrough_logging_handler.py`, and only the specific debug line(s), leaving the rest of the business logic untouched.
- You may introduce helper methods to redact sensitive keys in the dict if needed, provided they're standard and not reliant on external packages (a sketch of one such helper follows the suggested diff below).
```diff
@@ -154,8 +154,11 @@
         model=model, custom_llm_provider=custom_llm_provider
     )
 
+    # Avoid logging full model_info as it may contain sensitive info.
+    safe_fields = ["input_cost_per_token", "output_cost_per_token", "model"]
+    safe_model_info = {k: model_info.get(k) for k in safe_fields if k in model_info}
     verbose_proxy_logger.debug(
-        f"Vertex AI Live API model info for '{model}': {model_info}"
+        f"Vertex AI Live API model info for '{model}': {safe_model_info}"
     )
 
     # Check if pricing info is available
```
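As the last bullet of the autofix suggests, the dict redaction could be centralized in a helper. A minimal sketch, assuming a whitelist of log-safe keys; the helper name `redact_model_info` and the field set are illustrative, not part of the PR:

```python
from typing import Any, Dict

# Hypothetical helper (not in the PR): keep only a whitelist of known-safe keys.
SAFE_MODEL_INFO_FIELDS = {"model", "input_cost_per_token", "output_cost_per_token"}


def redact_model_info(model_info: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of model_info containing only whitelisted, log-safe fields."""
    if not isinstance(model_info, dict):
        return {}
    return {k: v for k, v in model_info.items() if k in SAFE_MODEL_INFO_FIELDS}
```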
```python
# Check if pricing info is available
if not model_info or not model_info.get("input_cost_per_token"):
    verbose_proxy_logger.error(
        f"No pricing info found for {model} in local model pricing database"
    )
```
Check failure: Code scanning / CodeQL
Clear-text logging of sensitive information (High): sensitive data (password)
Copilot Autofix (AI, 1 day ago)
To fix the problem, we must ensure that any log line writing the value of `model` (which is ultimately derived from untrusted user input via API request metadata) does not log it in cleartext unless it is guaranteed to be safe. The best approach here is to validate that the `model` string matches a well-known pattern for safe model names (e.g., only allowing certain characters: alphanumeric, dash, underscore, dot, colon), and if not, redact the logged value.

The log line in question is:

```python
verbose_proxy_logger.error(
    f"No pricing info found for {model} in local model pricing database"
)
```

We should add a sanitizing step here similar to what is already present for debug logging in the same file (see lines 376-378): use a regex to check if `model` matches allowed patterns, and otherwise log `[REDACTED]`.

Specifically, in `litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py`:

- In `_calculate_live_api_cost`, before the error logging on line 164, define a safe variable `safe_model` (using the same regex and logic as in lines 377-378).
- Replace `{model}` in the error log with `{safe_model}`.

No additional dependencies are required; use the standard `re` library (as already imported elsewhere in the same file).
```diff
@@ -160,8 +160,11 @@
 
     # Check if pricing info is available
     if not model_info or not model_info.get("input_cost_per_token"):
+        import re
+        allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
+        safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
         verbose_proxy_logger.error(
-            f"No pricing info found for {model} in local model pricing database"
+            f"No pricing info found for {safe_model} in local model pricing database"
         )
         return 0.0
```
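Since essentially the same three-line regex-and-redact snippet is suggested at several sites in this file, it could be factored into one module-level helper. A sketch under that assumption (the name `sanitize_for_log` is hypothetical, not part of the PR or the autofix):

```python
import re

# Compile once at module level instead of inside each function.
_ALLOWED_MODEL_PATTERN = re.compile(r"^[A-Za-z0-9._\-:]+$")


def sanitize_for_log(value: object) -> str:
    """Return the value if it looks like a safe identifier; otherwise redact it."""
    if isinstance(value, str) and _ALLOWED_MODEL_PATTERN.match(value):
        return value
    return "[REDACTED]"
```

Each of the regex-based autofixes here would then reduce to `safe_model = sanitize_for_log(model)`.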
f"Vertex AI Live API cost calculation - Model: {model}, " | ||
f"Prompt tokens: {prompt_token_count}, " | ||
f"Candidate tokens: {candidates_token_count}, " | ||
f"Total cost: ${total_cost:.6f}" |
Check failure: Code scanning / CodeQL
Clear-text logging of sensitive information (High): sensitive data (password)
Copilot Autofix (AI, 1 day ago)
To address this, we must ensure that any user-provided data, especially fields like `model` that could be derived from API keys or user metadata, are not logged directly, or are logged only after ensuring they don't contain sensitive information.

The mechanism is:

- Redact, filter, or validate any potentially sensitive user-supplied data before logging it.
- Replace the problematic log statement so that the value logged for `model` is either strictly validated (against a whitelist/regex of safe values) or replaced with a constant, such as `[REDACTED]`, if it does not match the safe pattern.
- Implement this fix directly where `model` is logged, specifically at line 237.
- If similar logging happens elsewhere (especially elsewhere in this file, or related files, involving `model` or other tainted values), apply the same validation/redaction logic.

In this case, the best approach is to:

- Before logging, ensure `model` only contains non-sensitive, validated data (e.g., matches a pattern for known model names).
- Use a regex pattern (e.g., allowing only alphanumeric, dashes, underscores, periods, and colons).
- Replace the value with `[REDACTED]` if the validation fails.

This approach ensures logs are still informative when safe, but never expose sensitive data.
```diff
@@ -233,8 +233,12 @@
     # Fallback to token-based pricing for tool use
     total_cost += tool_use_prompt_token_count * input_cost_per_token
 
+    # Safely log the model name: only allow known safe formats, redact otherwise.
+    import re
+    allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
+    safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
     verbose_proxy_logger.debug(
-        f"Vertex AI Live API cost calculation - Model: {model}, "
+        f"Vertex AI Live API cost calculation - Model: {safe_model}, "
         f"Prompt tokens: {prompt_token_count}, "
         f"Candidate tokens: {candidates_token_count}, "
         f"Total cost: ${total_cost:.6f}"
```
```python
model = kwargs.get("model", "gemini-2.0-flash-live-preview-04-09")
custom_llm_provider = kwargs.get("custom_llm_provider", "vertex_ai")
verbose_proxy_logger.debug(
    f"Vertex AI Live API model: {model}, custom_llm_provider: {custom_llm_provider}"
)
```
Check failure: Code scanning / CodeQL
Clear-text logging of sensitive information (High): sensitive data (password)
Copilot Autofix (AI, 1 day ago)
Sensitive information should never be logged, even if a variable typically contains a non-sensitive value. To fix the problem, update the logging on line 330 to use a sanitized or redacted value for each potentially sensitive field.

Specifically:

- For `model` and `custom_llm_provider`, apply the same safety checks as used later on (the regex pattern that ensures they match only known-safe formats; otherwise, redact them).
- Move the safety logic (the regex check and redaction) above any logging that refers to these values.
- Update the log line on 330 to use `safe_model` and `safe_custom_llm_provider`, not the original, possibly tainted values.

No new imports are required beyond `import re` (already shown in the snippet for the safety logging); just ensure that all log entries referring to potentially tainted/sensitive data use the sanitized version. Only edit the shown region in `litellm/proxy/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py`.
```diff
@@ -326,8 +326,17 @@
     # Extract model from request body or kwargs
     model = kwargs.get("model", "gemini-2.0-flash-live-preview-04-09")
     custom_llm_provider = kwargs.get("custom_llm_provider", "vertex_ai")
+    # Apply safety checks to avoid cleartext logging of sensitive data
+    import re
+    allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
+    safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
+    safe_custom_llm_provider = (
+        custom_llm_provider
+        if isinstance(custom_llm_provider, str) and allowed_pattern.match(custom_llm_provider)
+        else "[REDACTED]"
+    )
     verbose_proxy_logger.debug(
-        f"Vertex AI Live API model: {model}, custom_llm_provider: {custom_llm_provider}"
+        f"Vertex AI Live API model: {safe_model}, custom_llm_provider: {safe_custom_llm_provider}"
     )
 
     # Extract usage metadata from WebSocket messages
```
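The diff's trailing context mentions extracting usage metadata from WebSocket messages. As a rough illustration of that step, a sketch that pulls token counts out of a server message; the `usageMetadata` / `promptTokenCount` / `candidatesTokenCount` field names are an assumption based on the Gemini Live API message shape, not code from this PR:

```python
import json
from typing import Optional, Tuple


def extract_usage_from_ws_message(raw: str) -> Optional[Tuple[int, int]]:
    """Return (prompt_tokens, candidate_tokens) if the message carries usage metadata."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return None
    usage = message.get("usageMetadata")
    if not isinstance(usage, dict):
        return None
    return (
        int(usage.get("promptTokenCount", 0)),
        int(usage.get("candidatesTokenCount", 0)),
    )
```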
...y/pass_through_endpoints/llm_provider_handlers/vertex_ai_live_passthrough_logging_handler.py
… sensitive information
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
f"Vertex AI Live API passthrough cost tracking - " | ||
f"Model: {safe_model}, Cost: ${response_cost:.6f}, " | ||
f"Prompt tokens: {usage.prompt_tokens}, " | ||
f"Completion tokens: {usage.completion_tokens}" |
Check failure: Code scanning / CodeQL
Clear-text logging of sensitive information (High): sensitive data (password)
Copilot Autofix (AI, 1 day ago)
To fix this issue, we must ensure that no sensitive information from user API keys (or any other secrets) ends up logged in clear text. To do so, sanitize the logging further:

- Only log model values if they are known safe (as system identifiers, not user-provided secrets).
- Otherwise, always redact them, even if matching the regex, unless a whitelist of known safe model names is used.
- Additionally, review the flow: `kwargs` can be polluted from higher up, so restrict what `model` values can be logged.
- Prefer conservative handling: even if a string matches the allowed pattern, if it comes from a potentially sensitive metadata field, redact.

Therefore, modify the logging block (lines 375-384) in `vertex_ai_live_passthrough_logging_handler.py`:

- Add a check to ensure that the model is only logged if it is from a well-known, safe set of model names, such as those found in a maintained set/list.
- If not, log as `[REDACTED]`.
- Do not log any other fields from user-supplied metadata.
- Do not log contents of kwargs, request_body, or similar fields.

No extra imports are needed beyond standard library usage. Ensure only system-controlled model names are ever logged.
```diff
@@ -372,10 +372,12 @@
     kwargs["model"] = model
     kwargs["custom_llm_provider"] = custom_llm_provider
 
-    # Safely log the model name: only allow known safe formats, redact otherwise.
-    import re
-    allowed_pattern = re.compile(r"^[A-Za-z0-9._\-:]+$")
-    safe_model = model if isinstance(model, str) and allowed_pattern.match(model) else "[REDACTED]"
+    # Only log model names from a known-safe whitelist; redact otherwise.
+    SAFE_MODEL_WHITELIST = {
+        "gemini-2.0-flash-live-preview-04-09",
+        # Add other approved model names here as needed
+    }
+    safe_model = model if isinstance(model, str) and model in SAFE_MODEL_WHITELIST else "[REDACTED]"
     verbose_proxy_logger.debug(
         f"Vertex AI Live API passthrough cost tracking - "
         f"Model: {safe_model}, Cost: ${response_cost:.6f}, "
```
merge main
@Sameerlite is this ready to merge?
Title
Add Vertex AI Live API WebSocket Passthrough with Cost Tracking

Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR:
- Added testing in the `tests/litellm/` directory (adding at least 1 test is a hard requirement - see details)
- Passes all unit tests via `make test-unit`
Type
🆕 New Feature
Changes
Overview
This PR adds comprehensive support for Vertex AI Live API WebSocket passthrough with advanced cost tracking and logging capabilities. The implementation includes a dedicated logging handler, WebSocket passthrough functionality, and comprehensive test coverage.
Key Features Added

1. Vertex AI Live Passthrough Logging Handler (`vertex_ai_live_passthrough_logging_handler.py`)
2. WebSocket Passthrough Integration (`llm_passthrough_endpoints.py`): adds the `/vertex_ai/live` WebSocket endpoint
3. Success Handler Integration (`success_handler.py`): adds the `is_vertex_ai_live_route()` method for route identification
4. WebSocket Passthrough Infrastructure (`pass_through_endpoints.py`)

This implementation provides a robust foundation for Vertex AI Live API WebSocket passthrough with comprehensive cost tracking, making it easy for users to integrate real-time AI capabilities while maintaining full visibility into usage and costs. A sketch of what connecting to the new endpoint might look like follows below.