Memra TypeScript SDK — Quickstart

Get Memra running in a Node, Bun, Deno, Cloudflare Worker, or Vercel Edge project in about 10 minutes. This guide assumes you already have a Memra API key (sign up at usememra.com).


1. Install

npm install @memra/sdk

Node 18+ (or Bun / Deno / Workers). Zero dependencies — uses the platform fetch.

2. Configure auth (one recommended pattern)

Put your key in an environment variable. Build one client at module load and export it.

# .env.local (or your platform's secret store)
MEMRA_API_KEY=memra_live_your_key_here

// lib/memra.ts
import { MemraClient } from '@memra/sdk';

if (!process.env.MEMRA_API_KEY) {
  throw new Error('MEMRA_API_KEY is not set');
}

export const memra = new MemraClient({ apiKey: process.env.MEMRA_API_KEY });

Import memra from this module anywhere in your app. One client per process; the underlying fetch pools connections for you.

On Cloudflare Workers / Vercel Edge, read the key from the environment bindings instead (env.MEMRA_API_KEY) and build the client inside your handler.


3. Hello memory

Store a fact, recall it by meaning, and see how much context it will cost you.

import { memra } from './lib/memra.js';

async function main() {
  // Store a memory
  const memory = await memra.memories.add({
    content: 'Alice prefers dark mode and drinks her coffee black.',
    tenantId: 'user_alice',
    projectId: 'my-app',
    type: 'preference',
    importance: 7,
  });
  console.log(`stored ${memory.id}`);

  // Recall by meaning (reranking is on by default)
  const result = await memra.memories.recall({
    query: 'How does Alice like her coffee?',
    tenantId: 'user_alice',
    projectId: 'my-app',
    rerank: true,
  });

  for (const m of result.data) {
    console.log(`[${m.score.toFixed(3)}] ${m.content}`);
  }

  console.log(`\nestimated_tokens for this recall: ${result.estimated_tokens}`);
}

main().catch(console.error);

What to notice:

  • tenantId scopes the memory to one user. Always pass it.
  • rerank: true is the default; passing it explicitly is just a reminder.
  • result.estimated_tokens (kept snake_case on purpose — it mirrors the wire field) is the total tokens you're about to inject into your LLM prompt if you feed all result.data into it. Read it every call; your context budget depends on it.
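As a sketch of how you might act on that number: the helper below trims recalled memory contents to a fixed token budget before they reach your prompt. Nothing here is SDK API — `trimToBudget`, `estimateTokens`, and the ~4-characters-per-token heuristic are all assumptions; only `result.estimated_tokens` itself comes from Memra.

```typescript
// Hypothetical helper — not part of @memra/sdk. Keeps the top-ranked
// memory strings whose rough token cost fits under a budget.
// Assumes ~4 characters per token, a common heuristic for English text.
const CHARS_PER_TOKEN = 4;

export function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

export function trimToBudget(contents: string[], budgetTokens: number): string[] {
  const kept: string[] = [];
  let spent = 0;
  for (const content of contents) {
    const cost = estimateTokens(content);
    // Results arrive ranked best-first, so stop at the first overflow.
    if (spent + cost > budgetTokens) break;
    kept.push(content);
    spent += cost;
  }
  return kept;
}
```

With a recall result in hand, something like `trimToBudget(result.data.map(m => m.content), 500)` keeps the best-scoring memories that fit; check `result.estimated_tokens` first to see whether trimming is needed at all.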

4. A realistic example: episodic memory for a chatbot

Store a few conversation turns, then recall the ones relevant to a new question. This is where reranking earns its keep — the top dense-similarity hit is often not the answer you want.

import { memra } from './lib/memra.js';

const TENANT = 'user_alice';
const PROJECT = 'support-bot';

async function main() {
  const turns = [
    'Alice reported her export job failed on 2026-04-10 at 14:02 UTC.',
    'Root cause: she exceeded the 10k-row free-tier limit.',
    'Alice upgraded to the Pro plan the next day and the export succeeded.',
    'Alice asked whether older exports were preserved — yes, retained 90 days.',
  ];

  for (const content of turns) {
    await memra.memories.add({
      content,
      tenantId: TENANT,
      projectId: PROJECT,
      type: 'event',
    });
  }

  // Later: Alice comes back and asks something fuzzy.
  const result = await memra.memories.recall({
    query: 'Why did my last export break and did I fix it?',
    tenantId: TENANT,
    projectId: PROJECT,
    limit: 5,
  });

  console.log(`top hit: ${result.data[0]?.content}`);
  console.log(`returned ${result.meta.returned} of ${result.meta.totalCandidates} candidates`);
  console.log(`tokens to inject: ${result.estimated_tokens}`);
}

main().catch(console.error);

Why reranking matters here: pure vector similarity often surfaces the "older exports preserved?" turn because it has the richest lexical overlap with "export". The reranker reads the actual question — why did it break and did I fix it — and promotes the root-cause + resolution turns to the top.

For bulk ingestion, swap the loop for memra.memories.batch([...]) — up to 100 items in one round-trip.
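If you are assembling that batch yourself, a small chunking helper keeps each call under the 100-item cap. `chunk` below is ordinary array slicing, not an SDK function:

```typescript
// Plain utility — not part of @memra/sdk. Splits items into slices of at
// most `size` elements so each slice fits one memra.memories.batch(...) call.
export function chunk<T>(items: T[], size = 100): T[][] {
  if (size <= 0) throw new RangeError('size must be positive');
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```

Then `for (const slice of chunk(records)) await memra.memories.batch(slice);` — awaiting each slice in sequence also keeps you friendly to rate limits.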


5. Handling errors

Use instanceof on the typed error classes. Everything descends from MemraError.

import {
  MemraAuthError,
  MemraQuotaError,
  MemraNotFoundError,
} from '@memra/sdk';
import { memra } from './lib/memra.js';

async function safeRecall() {
  try {
    return await memra.memories.recall({ query: '...', tenantId: '...', projectId: '...' });
  } catch (err) {
    if (err instanceof MemraAuthError) {
      throw new Error('Check MEMRA_API_KEY — it is missing or revoked.');
    }
    if (err instanceof MemraQuotaError) {
      // Rate-limited or plan quota. Back off; don't retry hot.
      return null;
    }
    if (err instanceof MemraNotFoundError) {
      // Usually a wrong projectId.
      return null;
    }
    throw err; // unknown error — let it surface
  }
}

The SDK does not retry on your behalf. For high-throughput workloads, wrap recall/add with your own backoff (p-retry, async-retry, etc.) and respect the Retry-After header on 429s.
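If you would rather not pull in a dependency, here is a minimal backoff wrapper as a sketch. Everything in it is an assumption about your app, not SDK API: it retries any thrown error carrying `status: 429` or a 5xx status, and `retryAfterMs` is a hypothetical field you would populate yourself from the Retry-After response header.

```typescript
// Hypothetical retry helper — not part of @memra/sdk. Retries an async call
// with exponential backoff, preferring a retryAfterMs hint when present
// (which you would derive from the Retry-After header on a 429 response).
type Retryable = { status?: number; retryAfterMs?: number };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

export async function withBackoff<T>(
  fn: () => Promise<T>,
  { retries = 3, baseMs = 250 } = {},
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const e = err as Retryable;
      const retriable = e.status === 429 || (e.status ?? 0) >= 500;
      if (!retriable || attempt >= retries) throw err;
      // Honor the server's hint if we have one; otherwise base * 2^attempt.
      await sleep(e.retryAfterMs ?? baseMs * 2 ** attempt);
    }
  }
}
```

Usage: `await withBackoff(() => memra.memories.recall({ ... }))`. If you adopt p-retry or async-retry instead, wire the same "retry only on 429/5xx" test into their retry predicate.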


Next steps

  • Multiple users? One tenantId per user. Separate apps? Create a project per app once with memra.projects.create(...) so their memories never mix.
  • Types for responses? import type { Memory, RecallResult, RecallParams } from '@memra/sdk';.
  • Correcting an existing memory? Call memra.memories.supersede(id, { content: '...' }) instead of adding a new one — the old memory stops surfacing in search and the audit trail is preserved. memra.memories.chain(id) returns the full oldest→newest history.
  • Compliance? memra.privacy.exportData() and memra.privacy.createErasureRequest(id).
  • Edge runtime? Same code works on Workers/Edge — just build the client inside your handler from env bindings.

Full reference: usememra.com/docs/sdks/typescript.