Description
Which component is this feature for?
Bedrock Instrumentation
🔖 Feature description
Support tracking of prompt caching in Bedrock Converse.
Prompt caching telemetry for Bedrock Converse will be:
- Added to spans as attributes
- Emitted to the `prompt_caching` counter (or should this be a histogram?); a rough sketch of the emission follows this list
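For illustration only, a minimal sketch of what the counter emission could look like with the OpenTelemetry metrics API. The instrument name, attribute keys, and hardcoded values here are my assumptions, not what the instrumentation actually uses:

```python
from opentelemetry import metrics

meter = metrics.get_meter(__name__)

# Hypothetical instrument name; the real instrumentation defines its own.
prompt_caching_counter = meter.create_counter(
    "gen_ai.bedrock.prompt_caching",
    unit="token",
    description="Input tokens read from, or written to, the prompt cache",
)

# Values taken from the example response further down.
model = "us.anthropic.claude-3-7-sonnet-20250219-v1:0"
prompt_caching_counter.add(5547, attributes={"model": model, "type": "write"})
prompt_caching_counter.add(0, attributes={"model": model, "type": "read"})
```

Splitting reads and writes via an attribute would let dashboards price the two differently, which is the main cost-tracking motivation.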
🎤 Why is this feature needed?
Better cost and latency tracking.
Feature parity with #2788, which instruments `_handle_call` and `_handle_call_stream`, but not the Converse functions `_handle_converse` or `_handle_converse_stream` within `__init__.py`.
✌️ How do you aim to achieve this?
Update `prompt_caching.py` (it has `prompt_caching_handling` but doesn't have `prompt_caching_converse`):
- Create a function `prompt_caching_converse` that takes in: `response`, `vendor`, `model`, `metric_params`
- Take in `response` instead of `headers`, since I'm not seeing any of the headers specified in the existing instrumentation (`x-amzn-bedrock-cache-{read,write}-input-token-count`) in the responses from Bedrock. What I see in responses from Bedrock Converse is within the response body (?) in a `usage_metadata` field with value `{'input_tokens': 3, 'output_tokens': 492, 'total_tokens': 6042, 'input_token_details': {'cache_creation': 5547, 'cache_read': 0}}`. A sketch of the function follows the example response below.
Example response from `llm.invoke(input=messages)`:
content="# OpenTelemetry Logging - Summary\n\n## Introduction and Philosophy\n- OpenTelemetry's approach to logs differs from its approach to metrics and traces\n- Instead of creating entirely new logging systems, OpenTelemetry embraces existing logging solutions\n- Focus is on integrating logs with other observability signals (traces and metrics)\n\n## Key Problems Solved\n- Traditional logging solutions lack standardized integration with traces and metrics\n- No standardized way to include origin/source information in logs\n- Logs often lack context propagation in distributed systems\n- Different collection agents and protocols create fragmented observability data\n\n## OpenTelemetry's Solution\n- Defines a standard log data model for consistent representation\n- Enables correlation between logs, traces, and metrics\n- Supports existing log formats through mapping to the OpenTelemetry model\n- Provides a Logs API for emitting LogRecords\n- Offers SDK implementation for processing and exporting logs\n\n## Log Correlation Dimensions\n- **Time of execution**: Basic correlation by timestamp\n- **Execution context**: Including trace and span IDs in logs\n- **Origin of telemetry**: Including resource context in logs\n\n## Log Sources and Collection Approaches\n- **System Logs**: OS-generated logs that can be enriched with resource context\n- **Infrastructure Logs**: From components like Kubernetes, can be enriched with resource context\n- **Third-party Application Logs**: Various formats that can be parsed and enriched\n- **Legacy First-Party Applications**:\n - Via File/Stdout: Collected using file log receivers or agents\n - Direct to Collector: Modified to output via network protocols like OTLP\n- **New First-Party Applications**: Can fully implement OpenTelemetry's logging approach\n\n## OpenTelemetry Collector Features\n- Support for log data types and pipelines\n- Ability to read and tail log files\n- Log parsing capabilities for common formats\n- Network protocol support for receiving and sending logs\n- Enrichment processors for adding context\n\n## Auto-Instrumentation Capabilities\n- Can configure popular logging libraries to include trace context\n- Reads incoming trace context\n- Includes trace and span IDs in logged statements\n- Can optionally send logs directly via OTLP\n\n## Key Components\n- Log Data Model: Common understanding of what a LogRecord is\n- Logs API: For emitting LogRecords\n- SDK: Implementation enabling configuration of processing and exporting\n- Collector: For collecting, processing, and exporting logs" additional_kwargs={} response_metadata={'ResponseMetadata': {'RequestId': '98284356-db8f-446e-b078-81eec8a99fc5', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Fri, 22 Aug 2025 14:05:38 GMT', 'content-type': 'application/json', 'content-length': '2871', 'x-amzn-requestid': '98284356-db8f-446e-b078-81eec8a99fc5', 'cache-control': 'proxy-revalidate', 'connection': 'keep-alive'}, 'RetryAttempts': 0}, 'stopReason': 'end_turn', 'metrics': {'latencyMs': [15525]}, 'model_name': 'us.anthropic.claude-3-7-sonnet-20250219-v1:0'} id='run--62f6d50c-4d6a-4ad5-a8f0-bff5e8155cba-0' usage_metadata={'input_tokens': 3, 'output_tokens': 553, 'total_tokens': 6103, 'input_token_details': {'cache_creation': 0, 'cache_read': 5547}}
Update `__init__.py`:
- Import the new `prompt_caching_converse` function
- Call `prompt_caching_converse` within `_handle_converse` and `_handle_converse_stream` (a hypothetical call site is sketched below)
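A hypothetical call site: the import path follows where `prompt_caching.py` lives in the package, but the `_handle_converse` signature shown here is illustrative, not the actual one in `__init__.py`:

```python
from opentelemetry.instrumentation.bedrock.prompt_caching import (
    prompt_caching_converse,  # the new function proposed above
)

def _handle_converse(span, response, vendor, model, metric_params):
    # ... existing Converse span and attribute handling ...

    # New: record prompt-cache read/write token counts for Converse calls.
    prompt_caching_converse(response, vendor, model, metric_params)
```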
🔄️ Additional Information
I'm not a Python developer, but I'm willing to give this feature an attempt.
Related:
- 🚀 Feature: Callback Hooks for LLM Instrumentation Libraries #2813 "Bedrock converse api instrumentor does not support prompt caching tracking yet"
- 🚀 Feature: add prompt caching tokens to OpenAI instrumentation #2819
- 🚀 Feature: Support Prompt Caching #1838 -> Feature: Support Prompt Caching #1858 (Anthropic)
👀 Have you spent some time to check if this feature request has been raised before?
- I checked and didn't find a similar issue
Are you willing to submit PR?
Yes, I am willing to submit a PR!