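Memory Router · Plug-and-play memory proxy for any LLM

Add persistent memory to any LLM by changing one URL

Memory Router is a transparent proxy that sits between your application and the model. Point your existing SDK at MemoryLake and every conversation automatically gains long-term memory and an optimized context window. Choose either mode: bring your own provider key (BYOK), or use the models MemoryLake hosts for you, which needs only a MemoryLake key.

Works with the OpenAI, Anthropic, and Google SDKs you already use.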

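The problem

Stateless APIs make you rebuild memory on every call

Every LLM call is stateless. To fake continuity, you have to resend the entire history on every turn, which is slow, expensive, and eventually overflows the context window. Wiring up your own vector store and retrieval pipeline does solve it, but that is weeks of plumbing you have to build and maintain yourself.

Without a memory layer

Every call resends the full conversation history, so token cost climbs as the conversation grows.
Long sessions hit the context window limit and start truncating partway through a task.
Memory lives inside a single app: switch models or sessions and the context is gone.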
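
To make the cost of replaying history concrete, here is a rough sketch of the arithmetic. It assumes every turn adds the same number of tokens and that a call on turn N sends about N turns' worth of input; the figures (400 tokens per turn, a 16,000-token window) are illustrative, not measurements.

```typescript
// Tokens sent on each call when the whole history is replayed.
// Assumption: every turn adds a fixed `tokensPerTurn`, so the call on
// turn N carries roughly N * tokensPerTurn input tokens.
function replayTokensPerCall(turns: number, tokensPerTurn: number): number[] {
  const perCall: number[] = [];
  for (let turn = 1; turn <= turns; turn++) {
    perCall.push(turn * tokensPerTurn);
  }
  return perCall;
}

// First turn whose replayed history no longer fits the context window,
// or null if every turn fits.
function firstTruncatedTurn(perCall: number[], contextWindow: number): number | null {
  for (let i = 0; i < perCall.length; i++) {
    if (perCall[i] > contextWindow) return i + 1;
  }
  return null;
}

const replayed = replayTokensPerCall(50, 400);
const totalInputTokens = replayed.reduce((sum, tokens) => sum + tokens, 0);
const truncatedAt = firstTruncatedTurn(replayed, 16000);
```

Under these assumptions the per-call size grows linearly while the total billed input grows quadratically: 50 turns add up to 510,000 input tokens, and the history stops fitting at turn 41.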

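Building it yourself

Deploy a vector store, an embedding pipeline, chunking logic, and retrieval logic.
Write extraction, deduplication, and relevance ranking, then keep tuning them.
Maintain all of it for every provider and every model you support.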
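
For a sense of what the retrieval piece alone involves, here is a toy in-memory version of similarity search. It is a sketch for illustration, not how MemoryLake works internally: the `MemoryRecord` type and `topK` helper are made up for this example, and hand-written 3-dimensional vectors stand in for real embeddings.

```typescript
// One stored memory: its text plus an embedding vector.
interface MemoryRecord {
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored memories most similar to the query embedding.
function topK(records: MemoryRecord[], query: number[], k: number): MemoryRecord[] {
  return [...records]
    .sort((x, y) => cosineSimilarity(y.embedding, query) - cosineSimilarity(x.embedding, query))
    .slice(0, k);
}

const records: MemoryRecord[] = [
  { text: "User prefers TypeScript", embedding: [1, 0, 0] },
  { text: "User lives in Berlin", embedding: [0, 1, 0] },
  { text: "User is building a chatbot", embedding: [0.8, 0, 0.6] },
];
const nearest = topK(records, [0.9, 0, 0.1], 2);
```

A production pipeline would still need an embedding model, chunking, deduplication, persistence, and ranking tuned per use case on top of this.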

Memory Router collapses all of this into a single base URL change. The proxy itself is the memory layer.

How it works

A transparent proxy in four steps

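[Diagram: your app (OpenAI / Anthropic / Google SDK) sends a request to Memory Router, the transparent proxy. The Router trims redundant history, injects relevant memories, and sends the augmented request to the model: your provider in BYOK mode, or a MemoryLake-hosted model. The memory store is read and written asynchronously.]

If MemoryLake is unavailable, the request passes straight through to your provider, with zero downtime.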

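Intercept

Your app sends its request to Memory Router instead of the provider: same payload, same SDK, same response shape.

Optimize context

The Router trims redundant history, retrieves existing memories, and injects only the relevant context into the prompt.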
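
One way to picture this step is the sketch below. It is an assumption about the general shape of such a rewrite, not the Router's actual algorithm: `optimizeContext`, the `keepLast` cutoff, and the "Relevant memories" system message are all hypothetical.

```typescript
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Keep the original system messages and the most recent `keepLast` dialogue
// messages; replace everything older with one system message that lists the
// recalled memories.
function optimizeContext(
  messages: ChatMessage[],
  memories: string[],
  keepLast: number,
): ChatMessage[] {
  const systemMessages = messages.filter((m) => m.role === "system");
  const dialogue = messages.filter((m) => m.role !== "system");
  const recent = dialogue.slice(Math.max(0, dialogue.length - keepLast));

  const memoryMessage: ChatMessage[] =
    memories.length > 0
      ? [{ role: "system", content: "Relevant memories:\n- " + memories.join("\n- ") }]
      : [];

  return [...systemMessages, ...memoryMessage, ...recent];
}

const conversation: ChatMessage[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "My name is Ada." },
  { role: "assistant", content: "Nice to meet you, Ada." },
  { role: "user", content: "I work on compilers." },
  { role: "assistant", content: "Noted." },
  { role: "user", content: "What should I read next?" },
];
const optimized = optimizeContext(conversation, ["Name: Ada", "Works on compilers"], 1);
```

Here six messages shrink to three: the system prompt, a short memory summary, and the latest user message.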

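Forward

The augmented request goes to the model: your own provider (BYOK) or a MemoryLake-hosted model. It carries fewer input tokens than a verbatim replay would.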

Remember

New memories are extracted and stored asynchronously in the background, so the response is never held up.
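
The pattern behind "never held up" can be sketched as respond first, store afterwards. The hooks below (`callModel`, `storeMemories`, `onStoreError`) are hypothetical stand-ins, and this is one common way to write the pattern rather than a description of MemoryLake's internals.

```typescript
// Hypothetical hooks: `callModel` forwards the request, `storeMemories`
// extracts and persists memories from the finished exchange.
type CallModel = (prompt: string) => Promise<string>;
type StoreMemories = (prompt: string, reply: string) => Promise<void>;

async function respondThenRemember(
  prompt: string,
  callModel: CallModel,
  storeMemories: StoreMemories,
  onStoreError: (error: unknown) => void = () => {},
): Promise<string> {
  const reply = await callModel(prompt);
  // Deliberately not awaited: storage runs in the background, and a failure
  // is reported to `onStoreError` instead of reaching the caller.
  void storeMemories(prompt, reply).catch(onStoreError);
  return reply;
}
```

Because the storage promise is not awaited, a slow or failing memory write changes neither the latency nor the outcome of the reply.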

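Two ways to connect

BYOK or MemoryLake-hosted: your call

Unlike proxies that force you to bring your own key, Memory Router supports both. Either way, the change is confined to client setup: the base URL, plus one header in BYOK mode. Everything else (prompts, streaming, tool calls) stays the same.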

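BYOK

Bring your own key

Use your own provider account. Your provider key is encrypted in transit, forwarded to the provider on each call, and never stored on our servers.

Keep using your existing OpenAI / Anthropic / Google account.
Your key, your billing, your rate limits.
The key is passed through encrypted only: never written to disk, never logged.

Keys required: your provider key + a MemoryLake key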

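No provider key needed

MemoryLake-hosted

No provider account needed. MemoryLake runs the major models for you, so a single MemoryLake API key is enough to get started.

One key covers everything: no other accounts to sign up for.
Major models are built in and ready to call.
The simplest way to ship with memory.

Keys required: a MemoryLake key only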

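Key safety by design: in BYOK mode, your provider key is encrypted in transit and passed straight through to the provider on every call. MemoryLake never stores, logs, or reuses it.

BYOK: your provider key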

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://router.memorylake.ai/v1/openai",
  apiKey: process.env.OPENAI_API_KEY,        // your provider key
  defaultHeaders: {
    // encrypted in transit · passthrough only · never stored
    "x-memorylake-api-key": process.env.MEMORYLAKE_API_KEY,
  },
});

MemoryLake-hosted: one key

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://router.memorylake.ai/v1",
  apiKey: process.env.MEMORYLAKE_API_KEY,    // one key — that's it
});

// Pick any built-in model, e.g. "claude-opus-4-8" or "gpt-5".
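
The two snippets above differ only in the base URL, which key goes in `apiKey`, and whether the `x-memorylake-api-key` header is present. If you switch between modes per environment, a small helper along these lines keeps that difference in one place. The `RouterMode` type and `routerClientOptions` function are illustrative names, not part of any SDK; the URLs and header name are the ones shown above.

```typescript
// Client options for the two modes. Names here are illustrative only.
interface RouterClientOptions {
  baseURL: string;
  apiKey: string;
  defaultHeaders?: Record<string, string>;
}

type RouterMode =
  | { mode: "byok"; providerKey: string; memoryLakeKey: string }
  | { mode: "hosted"; memoryLakeKey: string };

function routerClientOptions(config: RouterMode): RouterClientOptions {
  if (config.mode === "byok") {
    return {
      baseURL: "https://router.memorylake.ai/v1/openai",
      apiKey: config.providerKey, // your provider key, passed through
      defaultHeaders: { "x-memorylake-api-key": config.memoryLakeKey },
    };
  }
  return {
    baseURL: "https://router.memorylake.ai/v1",
    apiKey: config.memoryLakeKey, // the only key in hosted mode
  };
}
```

The returned object has the same fields the snippets pass to `new OpenAI({ ... })`, so it can be handed to the constructor directly.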

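What you get

Memory infrastructure you don't have to build

One-line integration

Change the base URL and you're done. Your SDK and code stay untouched.

BYOK or hosted

Bring your own provider key (encrypted, never stored) or use MemoryLake-hosted models with a single key.

Automatic context optimization

Removes redundant history and injects only relevant memories, lowering the token count of every call.

Shared memory pool

The Router and the MemoryLake API read and write the same memories: a single source of truth.

Graceful degradation

If MemoryLake is unavailable, requests pass straight through to your provider. Zero downtime.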
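
Fail-open behavior of this kind is usually a small wrapper: attempt the memory step, and if it throws, forward the original request unchanged. The sketch below shows that general pattern with hypothetical types and hooks (`ModelRequest`, `enrich`, `forward`); it is not MemoryLake's implementation.

```typescript
// Hypothetical request shape and hooks, for illustration only.
interface ModelRequest {
  messages: string[];
}

type Enrich = (request: ModelRequest) => Promise<ModelRequest>;
type Forward = (request: ModelRequest) => Promise<string>;

// Try to enrich the request with memory; if that step fails for any reason,
// forward the original request untouched so the caller still gets a reply.
async function forwardFailOpen(
  request: ModelRequest,
  enrich: Enrich,
  forward: Forward,
): Promise<{ reply: string; enriched: boolean }> {
  let outgoing = request;
  let enriched = false;
  try {
    outgoing = await enrich(request);
    enriched = true;
  } catch {
    // Memory layer unavailable: fall back to plain passthrough.
  }
  const reply = await forward(outgoing);
  return { reply, enriched };
}
```

Only the memory step is wrapped in the `try`; an error from the provider itself still propagates to the caller, exactly as it would on a direct call.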

Full observability

Response headers report the session ID, whether the context was modified, token counts, and the memories created or recalled.

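Compatibility

Works with the providers you already use

In BYOK mode you keep your provider account and keys; in hosted mode MemoryLake runs the models for you. Both modes share the same memory layer.

Provider	Status
OpenAI / GPT	Fully supported
Anthropic / Claude	Fully supported
Google Gemini	Fully supported
Groq, DeepSeek, OpenRouter	Fully supported
Any OpenAI-compatible endpoint	Supported
OpenAI Assistants API	Not yet supported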

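Transparency

Every response tells you what happened

Memory Router returns diagnostic response headers so you can see exactly how each request was handled. It is not a black box.

Session ID

The conversation thread the request was assigned to, so you can group and inspect turns.

Context modified

Whether this call had memories injected or history trimmed.

Token count

How many tokens were actually sent after optimization, compared with a verbatim replay.

Memories touched

How many memory fragments were recalled and how many were newly created.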
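
Reading these values in application code might look like the sketch below. The page lists what is reported but not the exact header names, so every name in `DIAGNOSTIC_HEADERS` is a placeholder to be replaced with the real ones from the Router docs; the value formats ("true"/"false", decimal integers) are likewise assumptions.

```typescript
// Placeholder header names, NOT the real ones: swap in the names from the
// Router documentation before using this.
const DIAGNOSTIC_HEADERS = {
  sessionId: "x-example-session-id",
  contextModified: "x-example-context-modified",
  tokensSent: "x-example-tokens-sent",
  memoriesRecalled: "x-example-memories-recalled",
  memoriesCreated: "x-example-memories-created",
};

interface RouterDiagnostics {
  sessionId: string | null;
  contextModified: boolean | null;
  tokensSent: number | null;
  memoriesRecalled: number | null;
  memoriesCreated: number | null;
}

// Parse a numeric header; missing or malformed values become null.
function parseCount(value: string | undefined): number | null {
  if (value === undefined) return null;
  const parsed = parseInt(value, 10);
  return isNaN(parsed) ? null : parsed;
}

// `headers` is a plain object keyed by lower-cased header name.
function readDiagnostics(headers: Record<string, string | undefined>): RouterDiagnostics {
  const modified = headers[DIAGNOSTIC_HEADERS.contextModified];
  return {
    sessionId: headers[DIAGNOSTIC_HEADERS.sessionId] ?? null,
    contextModified: modified === undefined ? null : modified === "true",
    tokensSent: parseCount(headers[DIAGNOSTIC_HEADERS.tokensSent]),
    memoriesRecalled: parseCount(headers[DIAGNOSTIC_HEADERS.memoriesRecalled]),
    memoriesCreated: parseCount(headers[DIAGNOSTIC_HEADERS.memoriesCreated]),
  };
}
```

Returning `null` for anything missing keeps the parser usable on responses that bypassed the Router, where none of these headers would be present.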

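Get started

Live in three steps

1. Get a MemoryLake key
Sign up for MemoryLake and create an API key. The Free plan includes memory storage, so you can start right away.
2. Choose a mode and swap the base URL
Point your SDK at the Router endpoint. BYOK: keep your provider key and add the MemoryLake key as a header. Hosted: use the MemoryLake key alone.
3. Call as usual
Send requests the way you do today. Memories are recalled and stored automatically; read the response headers to confirm.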

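The difference

Direct API vs. Memory Router

[Chart: tokens sent per call as the conversation grows. Direct calls send the entire history every turn; Memory Router sends only the relevant memories, for roughly 90% fewer tokens.]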
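
How large the saving is depends on how long the conversation has run. The sketch below shows the calculation with made-up numbers: 400 tokens per turn of history, and a routed call that sends the new message plus a fixed 1,600-token memory budget. These are assumptions chosen for illustration, not measurements of Memory Router.

```typescript
// Percentage of input tokens saved on one call, relative to replaying
// the full history.
function tokenSavingsPercent(replayTokens: number, routedTokens: number): number {
  if (replayTokens <= 0) return 0;
  return ((replayTokens - routedTokens) * 100) / replayTokens;
}

// Illustrative numbers only.
const TOKENS_PER_TURN = 400;
const ROUTED_TOKENS = 400 + 1600; // new message + injected memories

const savingsAtTurn5 = tokenSavingsPercent(5 * TOKENS_PER_TURN, ROUTED_TOKENS);
const savingsAtTurn50 = tokenSavingsPercent(50 * TOKENS_PER_TURN, ROUTED_TOKENS);
```

With these inputs there is no saving at turn 5 (2,000 tokens either way) and a 90% saving at turn 50 (2,000 tokens instead of 20,000). Short conversations gain little; the benefit shows up as history accumulates.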

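	Direct provider call	With Memory Router
Long-term memory	You build and host it	Built in, automatic
Context window	Resend everything, then truncate	Optimized: only what matters is sent
Keys and accounts	Provider account required	BYOK or a single MemoryLake key
Code changes	New SDK + retrieval pipeline	One base URL change
Across sessions and models	Memory siloed per app	Shared memory pool
Memory-layer outage	You handle the fallback yourself	Graceful passthrough
Visibility	None by default	Diagnostic response headers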

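FAQ

Do I need to change my code?

Only the base URL and one header. Your prompts, streaming, tool calls, and response handling stay identical, because Memory Router speaks the same API as your provider.

Which providers are supported?

OpenAI, Anthropic, Google Gemini, Groq, DeepSeek, OpenRouter, and any OpenAI-compatible endpoint. The OpenAI Assistants API is not yet supported.

What is the difference between BYOK and MemoryLake-hosted?

BYOK (Bring Your Own Key) means you supply your own provider key plus a MemoryLake key; billing and rate limits stay with your provider account. MemoryLake-hosted needs only a MemoryLake key: we run the major models for you, so you skip provider sign-up entirely.

In BYOK mode, is my provider key safe?

Yes. Your provider key is encrypted in transit and forwarded to the provider on each call. MemoryLake never stores, logs, or reuses it; it is passthrough only.

What happens if MemoryLake is down?

In BYOK mode Memory Router fails open: the request passes straight through to your provider, so your application keeps working with zero downtime.

How does it reduce tokens?

Instead of replaying the entire history each turn, the Router removes redundant context and injects only the relevant memories, so each call uses fewer tokens as conversations grow.

Is memory shared with the MemoryLake API?

Yes. The Router and the MemoryLake API operate on the same memory pool, so whatever one side stores, the other can recall.

Is there a free plan?

Yes. Memory Router is available on the Free tier so you can integrate and test before scaling up.

Give every LLM a memory by changing one URL.

Stop resending context and stop rebuilding retrieval. Point your SDK at Memory Router and ship memory today.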
