Title: `usage.cache_*` always 0 and system prompt not cached with `moderate` strategy (Docker v1.0.1)

Description
First of all—great project, exactly what we need for Anthropic caching.
I’ve run into a bug where cost and caching details (`usage.cache_*`) are always reported as 0 in responses, and the system prompt is not being cached, even with the default `moderate` strategy.
You can see this in the attached screenshot (Postman). The response’s `usage` block shows:
"usage": {
"input_tokens": 24851,
"cache_creation_input_tokens": 0,
"cache_read_input_tokens": 0,
"output_tokens": 49
}
I would expect non-zero values on the first call (`cache_creation_input_tokens > 0`) and on subsequent calls (`cache_read_input_tokens > 0`), because the request contains a very large system prompt (thousands of tokens) and the proxy advertises caching for system prompts in `moderate` mode.

Additionally, the ROI/cache response metadata (headers) either doesn’t appear or shows zeroed-out values, suggesting no cache was actually applied.
Expected behavior
With `CACHE_STRATEGY=moderate` and a large system prompt:

- First request should produce non-zero `usage.cache_creation_input_tokens`.
- Subsequent identical requests should produce non-zero `usage.cache_read_input_tokens`.
- Autocache ROI/caching headers should show meaningful, non-zero values (e.g., cached token counts, ratio, breakpoints including `system`).
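For concreteness, this is roughly the shape I’d expect, going by Anthropic’s documented `usage` fields (token counts are illustrative). First request, with the system prompt written to the cache:

```json
"usage": {
  "input_tokens": 12,
  "cache_creation_input_tokens": 24839,
  "cache_read_input_tokens": 0,
  "output_tokens": 49
}
```

Subsequent identical request, with the system prompt read from the cache:

```json
"usage": {
  "input_tokens": 12,
  "cache_creation_input_tokens": 0,
  "cache_read_input_tokens": 24839,
  "output_tokens": 49
}
```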
Actual behavior
- `usage.cache_creation_input_tokens` and `usage.cache_read_input_tokens` are always `0`.
- System prompt appears not to be cached (no improvement on subsequent requests).
- ROI/caching headers are missing or effectively zeroed.

The attached screenshot shows the response with zeros for all cache-related fields.
Steps to reproduce
- Run Autocache via Docker (no API key in env; pass it per request):

  ```bash
  docker run -d -p 8090:8090 --name autocache ghcr.io/montevive/autocache:latest
  ```

  Health check:

  ```bash
  curl http://localhost:8090/health
  # {"status":"healthy","version":"1.0.1","strategy":"moderate"}
  ```
- Send a messages request with a large system prompt (thousands of tokens) and a tiny user message:

  ```bash
  curl http://localhost:8090/v1/messages \
    -H "Content-Type: application/json" \
    -H "anthropic-version: 2023-06-01" \
    -H "x-api-key: sk-ant-<redacted>" \
    -d '{
      "model": "claude-3-5-sonnet-20241022",
      "max_tokens": 64,
      "system": "<very long system prompt here — 5k+ tokens>",
      "messages": [
        {"role": "user", "content": "Ping"}
      ],
      "stream": false
    }'
  ```
- Repeat the exact same request 2–3 times.
- Observe that `usage.cache_creation_input_tokens` and `usage.cache_read_input_tokens` remain `0`, and caching/ROI headers don’t reflect any caching.
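For the header check, a plain `curl` header dump is the easiest way to see whether Autocache adds anything cache/ROI-related (I’m not assuming specific header names here, just filtering the dump):

```bash
# -D - writes response headers to stdout; -o /dev/null discards the body
curl -sS -D - -o /dev/null http://localhost:8090/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "x-api-key: sk-ant-<redacted>" \
  -d '{"model": "claude-3-5-sonnet-20241022", "max_tokens": 64, "system": "<same long system prompt>", "messages": [{"role": "user", "content": "Ping"}], "stream": false}' \
  | grep -i cache
```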
Environment
- Autocache image: `ghcr.io/montevive/autocache:latest` (reports `version: "1.0.1"`, `strategy: "moderate"`)
- Host OS: Debian 12
- Client: Postman and `curl`
- Endpoint used: `POST /v1/messages`
- Headers sent: `Content-Type: application/json`, `anthropic-version: 2023-06-01`, `x-api-key: <Anthropic key>`
- Networking: direct (no additional proxies in between)
Notes / hypotheses
- If Autocache injects `cache_control` into system prompts, maybe the injection doesn’t trigger when the top-level `system` string is used (as opposed to a `messages[0].content[..]` text block)? The docs say system prompts are supported, but this path might be skipping injection (see the JSON sketch below).
- If injection succeeds, perhaps the proxy is not correctly surfacing Anthropic’s usage fields from the downstream response (mapping/parsing issue).
- Behavior is the same for first and subsequent identical requests (so it doesn’t look like a one-off miss).
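To illustrate the first hypothesis: per Anthropic’s prompt-caching docs, a cached system prompt has to be in block form with a `cache_control` marker. What I send is the string form:

```json
"system": "<very long system prompt>"
```

and what the upstream request would need to look like after injection is something like this (this is the shape from Anthropic’s docs; I haven’t confirmed what Autocache actually emits):

```json
"system": [
  {
    "type": "text",
    "text": "<very long system prompt>",
    "cache_control": {"type": "ephemeral"}
  }
]
```

Sending the block form through the proxy manually (with `cache_control` already set) should separate the two hypotheses: non-zero cache fields would point at injection, while still-zero fields would point at the usage passthrough.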
Thanks a lot—happy to test a fix or a debug build.