Conversation

@caozhiyuan caozhiyuan commented Sep 24, 2025

This pull request introduces major improvements to token counting and model compatibility for Anthropic and OpenAI-style chat and tool calls. The most significant changes include a complete rewrite of the token counting logic to support multiple model tokenizers, improved handling of tool tokens, and new endpoints for Anthropic-compatible token counting. Several translation functions now properly account for cached tokens, ensuring more accurate usage reporting.

Token counting and model compatibility improvements:

  • Rewrote src/lib/tokenizer.ts to support multiple GPT encoding schemes, more accurate token counting for messages and tools, and dynamic model-based constants. The new implementation allows for flexible token calculation across Anthropic and OpenAI models, including tool call and parameter support.
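The core of such a rewrite is selecting the right encoding per model. As a minimal sketch (the function name `getEncodingForModel` and the regex are my assumptions, not the PR's actual API): newer OpenAI models (gpt-4o, gpt-4.1, o-series) use the `o200k_base` vocabulary, while older gpt-4/gpt-3.5 models use `cl100k_base`.

```typescript
// Hypothetical sketch of model-to-encoding selection; the PR's actual
// tokenizer.ts may structure this differently.
type EncodingName = "o200k_base" | "cl100k_base"

function getEncodingForModel(model: string): EncodingName {
  // gpt-4o / gpt-4.1 / o-series models use the newer o200k_base vocabulary;
  // older gpt-4 and gpt-3.5 models use cl100k_base.
  if (/^(gpt-4o|gpt-4\.1|o[134])/.test(model)) return "o200k_base"
  return "cl100k_base"
}

console.log(getEncodingForModel("gpt-4.1"))    // o200k_base
console.log(getEncodingForModel("gpt-4-0613")) // cl100k_base
```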

Anthropic endpoint and translation enhancements:

  • Added a new /v1/messages/count_tokens endpoint, with handler handleCountTokens in src/routes/messages/count-tokens-handler.ts, to accurately compute input tokens for Anthropic requests, including model-specific adjustments and tool bonuses. [1] [2] [3] [4]
  • Updated translation logic in non-stream-translation.ts and stream-translation.ts to subtract cached tokens from input token counts and include detailed cache token usage in the response, improving Anthropic usage reporting. [1] [2] [3]
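The cached-token adjustment described above can be sketched as follows (interface and function names here are illustrative assumptions, not the PR's exact code): OpenAI's `prompt_tokens` includes cache reads, while Anthropic reports them separately, so the translation subtracts them.

```typescript
// Illustrative sketch of translating OpenAI-style usage into
// Anthropic-style usage, subtracting cached tokens from input_tokens.
interface OpenAIUsage {
  prompt_tokens: number
  completion_tokens: number
  prompt_tokens_details?: { cached_tokens?: number }
}

interface AnthropicUsage {
  input_tokens: number
  output_tokens: number
  cache_read_input_tokens: number
}

function translateUsage(usage: OpenAIUsage): AnthropicUsage {
  const cached = usage.prompt_tokens_details?.cached_tokens ?? 0
  return {
    // Anthropic's input_tokens excludes cache reads, so subtract them here.
    input_tokens: usage.prompt_tokens - cached,
    output_tokens: usage.completion_tokens,
    cache_read_input_tokens: cached,
  }
}
```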

Model and payload type improvements:

  • Expanded the Model interface in src/services/copilot/get-models.ts to support new capabilities and ensure compatibility with the updated tokenizer.
  • Enhanced the ChatCompletionResponse type in src/services/copilot/create-chat-completions.ts to include prompt_tokens_details, allowing reporting of cached token usage.
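The extended usage shape could look like the sketch below (field names follow OpenAI's chat completions API; the exact interface in create-chat-completions.ts may differ):

```typescript
// Assumed shape of the extended usage object with cached-token details.
interface ChatCompletionUsage {
  prompt_tokens: number
  completion_tokens: number
  total_tokens: number
  prompt_tokens_details?: {
    cached_tokens?: number // tokens served from the prompt cache
  }
}

const usage: ChatCompletionUsage = {
  prompt_tokens: 10,
  completion_tokens: 2,
  total_tokens: 12,
  prompt_tokens_details: { cached_tokens: 8 },
}
```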

Configuration and setup updates:

  • Updated configuration in README.md and src/start.ts to include new environment variables for model selection and non-essential traffic disabling, supporting the expanded Anthropic model options. [1] [2]
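As a rough sketch, the resulting Claude Code settings could look like this (the exact variable names and values are in the PR's README.md changes; `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC` is my assumption for the non-essential-traffic toggle):

```json
"env": {
  "ANTHROPIC_MODEL": "gpt-4.1",
  "ANTHROPIC_DEFAULT_SONNET_MODEL": "gpt-4.1",
  "ANTHROPIC_DEFAULT_HAIKU_MODEL": "gpt-4.1",
  "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
}
```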

caozhiyuan and others added 4 commits September 24, 2025 16:03
- Add new command line option `--claude-code-env` to generate environment variables for Claude Code
- Update `runServer` function to use new `claudeCodeEnv` option
- Add `claude-code-env` to `start` command options in README.md
@caozhiyuan (Author)
When `message_start` is missing `usage.input_tokens`, WebFetch gets the error `API Error: Cannot read properties of undefined (reading 'input_tokens')`. So I force-pushed to revoke the latest commit.

@caozhiyuan caozhiyuan changed the title feature about anthropic count token and fix token usage problem feature about anthropic count token and fix token usage problem and update claude code settings Sep 30, 2025
```json
"ANTHROPIC_MODEL": "gpt-4.1",
"ANTHROPIC_SMALL_FAST_MODEL": "gpt-4.1",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "gpt-4.1",
"ANTHROPIC_SMALL_FAST_MODEL": "gpt-4.1",
```
This can probably be removed here, along with the corresponding entry in start.ts; Anthropic has deprecated it in favor of ANTHROPIC_DEFAULT_HAIKU_MODEL, which you have now also included.
I think this can also close #89 then, since the original question and the follow-up comment are addressed by this PR.

@caozhiyuan (Author)

`ANTHROPIC_SMALL_FAST_MODEL` is kept for compatibility with older versions.
