fix: stop evicting message-level cache breakpoints for Anthropic Messages API #4410

Merged
bhavyaus merged 2 commits into main from dev/bhavyau/fix-cache-control-eviction
Mar 14, 2026

Conversation

@bhavyaus
Contributor

Cache hit rates for claude-opus-4.6-1m dropped dramatically because addToolsAndSystemCacheControl was evicting message-level cache_control entries to make room for tools+system breakpoints. Message breakpoints are more valuable in long agent conversations: they implicitly cache the tools+system prefix (Anthropic cache hierarchy: tools → system → messages) and maintain the 20-block lookback chain. Now tools+system breakpoints are only added when spare slots exist (fewer than 4 breakpoints already set), and message breakpoints are never evicted.
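
To make the hierarchy concrete, here is a minimal sketch of a request that carries a single message-level breakpoint. The Messages API marks a breakpoint with `cache_control: { type: "ephemeral" }`; everything else here (the simplified types, the `countBreakpoints` helper, and the sample content) is hypothetical and not taken from this PR.

```typescript
// Hypothetical, simplified shapes; the real Messages API request has more fields.
type CacheControl = { type: "ephemeral" };
interface Cacheable { cache_control?: CacheControl }
interface ToolDef extends Cacheable { name: string; description: string }
interface TextBlock extends Cacheable { type: "text"; text: string }
interface ChatMessage { role: "user" | "assistant"; content: TextBlock[] }
interface MessagesRequest { tools: ToolDef[]; system: TextBlock[]; messages: ChatMessage[] }

// Count every cache_control marker in the request, walking it in prefix order:
// tools first, then system, then messages.
function countBreakpoints(req: MessagesRequest): number {
  const blocks: Cacheable[] = [...req.tools, ...req.system];
  for (const message of req.messages) {
    blocks.push(...message.content);
  }
  return blocks.filter((b) => b.cache_control !== undefined).length;
}

const example: MessagesRequest = {
  tools: [{ name: "read_file", description: "Read a file from the workspace" }],
  system: [{ type: "text", text: "You are a coding agent." }],
  messages: [
    { role: "user", content: [{ type: "text", text: "Open README.md" }] },
    { role: "assistant", content: [{ type: "text", text: "Reading the file now." }] },
    {
      role: "user",
      // A breakpoint here also covers everything before it in the prefix:
      // the tools, the system prompt, and the earlier messages.
      content: [{ type: "text", text: "Now summarize it.", cache_control: { type: "ephemeral" } }],
    },
  ],
};

console.log(countBreakpoints(example)); // 1
```

With only one of the four available slots in use, this request still caches the tools and system prompt, which is why a message breakpoint is the more valuable one to keep.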

Copilot AI review requested due to automatic review settings March 14, 2026 03:51
@bhavyaus bhavyaus enabled auto-merge March 14, 2026 03:52
…emCacheControl

Message-level cache_control entries are more valuable than tools/system
breakpoints for long conversations because they implicitly cache the
tools+system prefix (Anthropic cache hierarchy: tools → system → messages)
and maintain the 20-block lookback chain needed for cache hits in long
agent sessions.

Previously, the function would evict earliest message breakpoints to make
room for tools+system breakpoints, breaking cache coverage for
claude-opus-4.6-1m conversations with many tool calls.

Now tools+system breakpoints are only added when spare slots exist
(existing < 4), never evicting message breakpoints.
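
The non-evicting rule described above can be sketched roughly as follows. This is not the code from messagesApi.ts: the types, `countBreakpoints`, `addPrefixBreakpointsIfSpare`, and `makeRequest` are hypothetical names, and trying tools before system when only one slot is spare is an assumption, since the commit message does not state the order.

```typescript
// Hypothetical, simplified shapes; not the types used in messagesApi.ts.
type CacheControl = { type: "ephemeral" };
interface Cacheable { cache_control?: CacheControl }
interface ToolDef extends Cacheable { name: string }
interface TextBlock extends Cacheable { type: "text"; text: string }
interface ChatMessage { role: "user" | "assistant"; content: TextBlock[] }
interface MessagesRequest { tools: ToolDef[]; system: TextBlock[]; messages: ChatMessage[] }

const MAX_BREAKPOINTS = 4; // matches the "existing < 4" check in the commit message

function countBreakpoints(req: MessagesRequest): number {
  const blocks: Cacheable[] = [...req.tools, ...req.system];
  for (const message of req.messages) {
    blocks.push(...message.content);
  }
  return blocks.filter((b) => b.cache_control !== undefined).length;
}

// Add tools/system breakpoints only while spare slots remain.
// Existing breakpoints are never removed.
function addPrefixBreakpointsIfSpare(req: MessagesRequest): void {
  let spare = MAX_BREAKPOINTS - countBreakpoints(req);
  // Assumed priority: last tool first, then last system block.
  const candidates: (Cacheable | undefined)[] = [
    req.tools[req.tools.length - 1],
    req.system[req.system.length - 1],
  ];
  for (const target of candidates) {
    if (spare <= 0) {
      break;
    }
    if (target !== undefined && target.cache_control === undefined) {
      target.cache_control = { type: "ephemeral" };
      spare--;
    }
  }
}

// Build a request whose messages already carry the given number of breakpoints.
function makeRequest(messageBreakpoints: number): MessagesRequest {
  const messages: ChatMessage[] = [];
  for (let i = 0; i < messageBreakpoints; i++) {
    messages.push({
      role: i % 2 === 0 ? "user" : "assistant",
      content: [{ type: "text", text: `turn ${i}`, cache_control: { type: "ephemeral" } }],
    });
  }
  return {
    tools: [{ name: "read_file" }],
    system: [{ type: "text", text: "You are a coding agent." }],
    messages,
  };
}

const full = makeRequest(4);
addPrefixBreakpointsIfSpare(full);
console.log(countBreakpoints(full)); // 4: nothing added, nothing evicted

const roomy = makeRequest(2);
addPrefixBreakpointsIfSpare(roomy);
console.log(countBreakpoints(roomy)); // 4: tools and system filled the two spare slots
```

Because the function only ever fills spare slots, a long agent session that already uses all four slots on messages passes through unchanged.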
@bhavyaus bhavyaus force-pushed the dev/bhavyau/fix-cache-control-eviction branch from 1de96e2 to c8bc699 Compare March 14, 2026 03:53
@vs-code-engineering vs-code-engineering bot added this to the 1.112.0 milestone Mar 14, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adjusts Anthropic Messages API cache breakpoint handling to preserve message-level cache_control entries (which are more valuable for long agent conversations), preventing cache-rate regressions on long-context Claude models.

Changes:

  • Update addToolsAndSystemCacheControl to only add tool/system breakpoints when there are spare slots, without evicting message-level breakpoints.
  • Update and expand unit tests to cover “spare slot” vs “no spare slots” scenarios and ensure message breakpoints are preserved.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
src/platform/endpoint/node/messagesApi.ts | Changes cache-control slot allocation logic to avoid evicting message-level breakpoints and to only add tool/system breakpoints when capacity remains.
src/platform/endpoint/test/node/messagesApi.spec.ts | Updates/extends tests to validate the new non-evicting behavior and slot-based tool/system prioritization.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@bhavyaus bhavyaus added this pull request to the merge queue Mar 14, 2026
Merged via the queue into main with commit 8b7dacb Mar 14, 2026
19 checks passed
@bhavyaus bhavyaus deleted the dev/bhavyau/fix-cache-control-eviction branch March 14, 2026 05:27

3 participants