Your AI Agent Is Wasting Tokens

Every session, your AI tools load entire CLAUDE.md files, memory indexes, and referenced documentation into the context window. Most of it is irrelevant. That is context bloat.

The Context Bloat Problem

Local .md files grow unbounded as your project evolves. Every AI session loads everything — project instructions, memory indexes, referenced files — regardless of what the current task actually needs.

The result: thousands of irrelevant tokens consuming your context window, displacing the actual conversation, and increasing costs with every API call.

Local Files

Load everything, every time. No filtering. Context grows as project grows.

Memra API

Recall only what is relevant. Semantic search returns the top results for your current task.

The Numbers

Measured on a real production project with 40+ memory files and growing documentation.

  • 94% token reduction
  • 1,050 tokens per recall (Memra)
  • 18,200 tokens per session (local files)
  • 17x more efficient

Metric                     Local .md Files             Memra API
Per-session context load   18,200 tokens               1,050 tokens
Scales with project size   Grows unbounded             Fixed (top_k)
Relevance filtering        None (everything loaded)    Semantic search
Cross-project reuse        Copy files manually         Shared via API

Methodology

Data collected from a real production project using Memra during its own development. Token estimates use ~4 characters per token (cl100k_base encoding).
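The ~4 characters per token rule of thumb can be sketched as a quick estimator. This is a rough heuristic only; a real tokenizer (e.g. tiktoken's cl100k_base) gives exact counts:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters/token heuristic
    for cl100k_base-style encodings."""
    return len(text) // 4

# A ~6,200-token CLAUDE.md corresponds to roughly 24,800 characters.
print(estimate_tokens("x" * 24_800))  # → 6200
```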

Approach A: Local Files

  • CLAUDE.md project instructions: ~6,200 tokens
  • MEMORY.md index file: ~2,000 tokens
  • 40 referenced memory files: ~10,000 tokens
  • Total per session: ~18,200 tokens

Approach B: Memra API

  • Search query: ~50 tokens
  • Top 5 recall results: ~1,000 tokens
  • Total per recall: ~1,050 tokens
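The headline numbers fall straight out of the two breakdowns above:

```python
# Per-session token costs from the methodology above.
local_files = 6_200 + 2_000 + 10_000   # CLAUDE.md + MEMORY.md + 40 memory files
memra_recall = 50 + 1_000              # search query + top-5 results

reduction = 1 - memra_recall / local_files
efficiency = local_files / memra_recall

print(f"Local files:  {local_files:,} tokens")   # 18,200
print(f"Memra recall: {memra_recall:,} tokens")  # 1,050
print(f"Reduction:    {reduction:.0%}")          # 94%
print(f"Efficiency:   {efficiency:.0f}x")        # 17x
```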

How It Works

Instead of dumping entire files into every session, Memra lets your agent recall only what matters.

1. Store

Your agent stores memories through the API. Each memory gets an embedding for semantic search.

2. Search

Semantic search finds the most relevant memories for the current task. No keyword matching — meaning-based retrieval.

3. Recall

Only the top results enter your context window. Fixed token cost regardless of how much your project grows.
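The three steps above can be sketched with a toy in-memory store. The bag-of-words cosine similarity below is only an illustrative stand-in for the service's learned embeddings, and the names (`store`, `recall`, `top_k`) are hypothetical, not Memra's actual API:

```python
from collections import Counter
from math import sqrt

store: list[str] = []  # toy memory store

def embed(text: str) -> Counter:
    # Stand-in "embedding": a bag-of-words vector. A real service
    # would use a learned embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recall(query: str, top_k: int = 5) -> list[str]:
    # Only the top_k most relevant memories are returned, so the
    # token cost is fixed no matter how large the store grows.
    ranked = sorted(store, key=lambda m: cosine(embed(query), embed(m)), reverse=True)
    return ranked[:top_k]

# 1. Store memories as the project evolves.
store.append("auth service issues JWT tokens with 15 minute expiry")
store.append("frontend uses React with vite build tooling")
store.append("database migrations run via alembic on deploy")

# 2–3. Search semantically, recall only the best match.
print(recall("how does auth work", top_k=1))
```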

Stop Wasting Tokens. Start Recalling.

Give your AI agent a memory that scales. Free tier available — no credit card required.