Cutting Claude Code Token Costs with Optical Context Compression
Reduce input token counts by converting verbose text context, schemas, and system prompts into compact PNG images. A local proxy intercepts requests from Claude Code and squeezes dense text down to a fraction of its original token cost.
Impact: High
Why it matters
As context windows grow, token costs for repetitive, dense developer data can climb quickly. Having a vision-capable model read text rendered as an image is a creative way to lower the input cost of that data.
TL;DR
- pxpipe compresses text context into compact PNGs to leverage the fixed token cost of images.
- The approach works best with dense data like code, JSON, and system prompts, reducing token count by up to 90%.
- It is inherently lossy; byte-exact data like cryptographic keys or specific IDs must remain as plain text.
- Requests can be routed to non-Fable models like Claude Sonnet 4.6 to keep them as plain text.
Squeezing Text Into Images
The pxpipe tool acts as a local proxy running on 127.0.0.1:47821. When you route your Claude Code requests through it by setting ANTHROPIC_BASE_URL, it intercepts the payload. Any token-dense content (extensive system prompts, tool documentation, or long file histories) is minified and rendered into a highly compact PNG image. The vision-equipped model (such as Fable 5 or GPT-5.6) then reads the rendered image instead of the raw text.
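As a minimal sketch of the routing step described above, the snippet below builds an environment in which ANTHROPIC_BASE_URL points at the proxy address from the text. The `http://` scheme, the helper name `proxied_env`, and the assumption that the `claude` CLI is on your PATH are illustrative and not taken from pxpipe's documentation.

```python
import os

# Address taken from the text; the http:// scheme is an assumption.
PXPIPE_URL = "http://127.0.0.1:47821"


def proxied_env(base=None):
    """Return a copy of the environment with ANTHROPIC_BASE_URL set to the proxy.

    `proxied_env` is a hypothetical helper, not part of pxpipe.
    """
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_BASE_URL"] = PXPIPE_URL
    return env


# Example launch (not executed here; assumes the `claude` CLI is on PATH):
#   import subprocess
#   subprocess.run(["claude"], env=proxied_env(), check=False)
```

Copying the environment rather than mutating `os.environ` keeps the proxy setting scoped to the one process you launch, so other tools in the same shell continue to talk to the API directly.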
Drastic Cost and Token Reductions
An image's token cost is fixed by its pixel dimensions, regardless of how much text is packed inside. In standard developer workloads, dense text averages about 1 character per token, but the image conversion packs in ~3.1 characters per image token. In benchmark sessions, this optimization reduced an incoming payload of 25,000 text tokens to just 2,700 image tokens, roughly 89% fewer. Over an entire session, this translates to a 59% to 70% lower API bill.
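The arithmetic behind these figures can be sketched in a few lines. The `reduction` helper reproduces the 25,000 to 2,700 benchmark from the text. The `estimate_image_tokens` helper uses the rough width × height / 750 estimate that Anthropic's vision documentation gives for Claude models; whether that divisor applies to the models pxpipe targets is an assumption, and both function names are hypothetical.

```python
def estimate_image_tokens(width_px, height_px, pixels_per_token=750):
    """Rough image token estimate from pixel dimensions alone.

    The 750 divisor is an assumed rule of thumb; note that the amount of
    text drawn inside the image does not appear anywhere in the formula.
    """
    return (width_px * height_px) // pixels_per_token


def reduction(text_tokens, image_tokens):
    """Fraction of input tokens saved by sending an image instead of text."""
    return 1 - image_tokens / text_tokens


print(estimate_image_tokens(1000, 1000))     # 1333
print(f"{reduction(25_000, 2_700):.1%}")     # 89.2%
```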
The Lossy Catch
Because this approach is inherently lossy, it must be treated as a "gist tier" rather than exact storage. In rigorous needle-in-a-haystack testing, exact 12-character hexadecimal strings were recalled in 13 of 15 trials on Fable 5 and 0 of 15 on Opus. Any byte-exact content, such as cryptographic keys, database IDs, or inputs to precise numerical calculations, must remain as plain text. Developers can route these tasks to a subagent on a non-Fable model (such as claude-sonnet-4-6), which bypasses the proxy's image conversion and passes the content through as text.
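One way to picture the "keep byte-exact data as text" rule is a pre-filter that flags chunks containing long hexadecimal runs or PEM key headers. This is a hypothetical heuristic for illustration only; it is not how pxpipe decides what to convert, and the names `must_stay_text` and `route` are invented here.

```python
import re

# Runs of 12+ hex characters (the needle length used in the test above)
# and PEM headers are treated as byte-exact. Both patterns are assumptions.
HEX_RUN = re.compile(r"\b[0-9a-fA-F]{12,}\b")
PEM_HEADER = re.compile(r"-----BEGIN [A-Z ]+-----")


def must_stay_text(chunk):
    """Return True if the chunk appears to contain byte-exact data."""
    return bool(HEX_RUN.search(chunk) or PEM_HEADER.search(chunk))


def route(chunks):
    """Split chunks into (keep_as_text, safe_to_render_as_image)."""
    keep_text, to_image = [], []
    for chunk in chunks:
        (keep_text if must_stay_text(chunk) else to_image).append(chunk)
    return keep_text, to_image
```

A filter like this errs on the side of false positives (any long hex-looking word is kept as text), which costs a few tokens but avoids silently corrupting an identifier.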
When to use
- When running massive, repetitive text tasks through Claude Code.
- For system prompts and tool documentation that rarely change but consume many tokens.