When I asked ChatGPT what it remembered about me, it listed 33 facts, from my name and career goals to my current fitness routine. But how does it actually store and retrieve this information? And why does it feel so seamless?
After extensive experimentation, I discovered that ChatGPT’s memory system is far simpler than I expected. No vector databases. No RAG over conversation history. Instead, it uses four distinct layers: session metadata that adapts to your environment, explicit facts stored long-term, lightweight summaries of recent chats, and a sliding window of your current conversation.
This blog breaks down exactly how each layer works and why this approach might be superior to traditional retrieval systems. Everything here comes from reverse engineering ChatGPT’s behavior through conversation. OpenAI did not publish these implementation details.
ChatGPT’s Context Structure
Before understanding memory, it’s important to understand the entire context ChatGPT receives for every message. The structure looks like this:
[0] System Instructions
[1] Developer Instructions
[2] Session Metadata (ephemeral)
[3] User Memory (long-term facts)
[4] Recent Conversations Summary (past chats, titles + snippets)
[5] Current Session Messages (this chat)
[6] Your latest message
The first two components define high-level behavior and safety rules. They aren’t the focus of this blog. The interesting pieces begin with Session Metadata.
Session Metadata
These details are injected once at the beginning of a session. They are not stored permanently and don’t become part of long-term memory. This block includes:
- Device type (desktop/mobile)
- Browser + user agent
- Rough location/timezone
- Subscription level
- Usage patterns and activity frequency
- Recent model usage distribution
- Screen size, dark mode status, JS enabled, etc.
An example of session metadata is:
Session Metadata:
- User subscription: ChatGPT Go
- Device: Desktop browser
- Browser user-agent: Chrome on macOS (Intel)
- Approximate location: India (may be VPN)
- Local time: ~16:00
- Account age: ~157 weeks
- Recent activity:
- Active 1 day in the last 1
- Active 5 days in the last 7
- Active 18 days in the last 30
- Conversation patterns:
- Average conversation depth: ~14.8 messages
- Average user message length: ~4057 characters
- Model usage distribution:
* 5% gpt-5.1
* 49% gpt-5
* 17% gpt-4o
* 6% gpt-5-a-t-mini
* etc.
- Device environment:
- JS enabled
- Dark mode enabled
- Screen size: 900×1440
- Page viewport: 812×1440
- Device pixel ratio: 2.0
- Session duration so far: ~1100 seconds
This information helps the model tailor responses to your environment, but none of it persists after the session ends.
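As a rough illustration of how an ephemeral block like this might be modeled, here is a minimal sketch based on the observed behavior. The class and field names are my own assumptions, not anything OpenAI has documented.

```python
from dataclasses import dataclass


@dataclass
class SessionMetadata:
    """Ephemeral per-session context: built once at session start, never persisted."""
    device: str
    browser: str
    approximate_location: str
    subscription: str
    dark_mode: bool
    screen_width: int
    screen_height: int

    def to_prompt_block(self) -> str:
        # Rendered into the prompt a single time, when the session begins.
        return "\n".join([
            "Session Metadata:",
            f"- Device: {self.device}",
            f"- Browser: {self.browser}",
            f"- Approximate location: {self.approximate_location}",
            f"- User subscription: {self.subscription}",
            f"- Dark mode enabled: {self.dark_mode}",
            f"- Screen size: {self.screen_width}x{self.screen_height}",
        ])
```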
User Memory
ChatGPT has a dedicated tool for storing and deleting stable, long-term facts about the user. These are the pieces that accumulate over weeks and months to form a persistent “profile.”
In my case, the model had 33 stored facts — things like:
- My name, age
- Career goals
- Background and past roles
- Current projects
- Areas I am studying
- Fitness routine
- Personal preferences
- Long-term interests
These are not guessed; they are explicitly stored only when:
- The user says “remember this” or “store this in memory”, or
- The model detects a fact that fits OpenAI’s criteria (like your name, job title, or stated preferences) and the user implicitly agrees through conversation
These memories are injected into every future prompt as a separate block.
If you want to add or remove anything, you can simply say:
- “Store this in memory…”
- “Delete this from memory…”
Example:
- User's name is Manthan Gupta.
- Previously worked at Merkle Science and Qoohoo (YC W23).
- Prefers learning through a mix of videos, papers, and hands-on work.
- Built TigerDB, CricLang, Load Balancer, FitMe.
- Studying modern IR systems (LDA, BM25, hybrid, dense embeddings, FAISS, RRF, LLM reranking).
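Mechanically, this layer behaves like a small persistent list of facts with store and delete operations, rendered into a prompt block on every request. Here is a hedged sketch of what such a tool could look like; the method names and storage format are my assumptions, not OpenAI's actual implementation.

```python
class UserMemory:
    """Long-term facts about the user, persisted across sessions."""

    def __init__(self) -> None:
        self.facts: list[str] = []

    def store(self, fact: str) -> None:
        # Called when the user says "remember this" or the model
        # detects a stable fact worth keeping (name, job, preferences).
        if fact not in self.facts:
            self.facts.append(fact)

    def delete(self, fragment: str) -> None:
        # Called when the user says "delete this from memory".
        self.facts = [f for f in self.facts if fragment.lower() not in f.lower()]

    def to_prompt_block(self) -> str:
        # Injected as its own block into every future prompt.
        return "User Memory:\n" + "\n".join(f"- {fact}" for fact in self.facts)


memory = UserMemory()
memory.store("User's name is Manthan Gupta.")
memory.store("Prefers learning through a mix of videos, papers, and hands-on work.")
print(memory.to_prompt_block())
```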
Recent Conversations Summary
This part surprised me the most, because I expected ChatGPT to use some kind of RAG across past conversations. Instead, it uses a lightweight digest.
ChatGPT keeps a list of recent conversation summaries in this format:
1. <Timestamp>: <Chat Title>
|||| user message snippet ||||
|||| user message snippet ||||
Observations:
- It only summarizes my messages, not the assistant’s.
- There were around 15 summaries available.
- They act as a loose map of my recent interests, not detailed context.
This block gives ChatGPT a sense of continuity across chats without pulling in full transcripts.
Traditional RAG systems would require:
- Embedding every past message
- Running similarity searches on each query
- Pulling in full message contexts
- Higher latency and token costs
ChatGPT’s approach is simpler: pre-compute lightweight summaries and inject them directly. This trades detailed context for speed and efficiency.
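To make that concrete, here is a sketch of how a digest in the format above could be pre-computed ahead of time and injected directly, with no embeddings or similarity search. It is a reconstruction from observed behavior; the snippet length and selection logic are guesses.

```python
from datetime import datetime


def build_digest(conversations: list[dict], max_chats: int = 15, max_snippets: int = 2) -> str:
    """Pre-compute a lightweight digest of recent chats: titles plus user-message snippets."""
    lines = []
    for i, convo in enumerate(conversations[:max_chats], start=1):
        timestamp = convo["started_at"].strftime("%Y-%m-%d %H:%M")
        lines.append(f"{i}. {timestamp}: {convo['title']}")
        # Only the user's messages are summarized, not the assistant's.
        user_messages = [m for m in convo["messages"] if m["role"] == "user"]
        for message in user_messages[:max_snippets]:
            lines.append(f"   |||| {message['content'][:120]} ||||")
    return "\n".join(lines)


digest = build_digest([{
    "started_at": datetime(2025, 11, 20, 14, 30),
    "title": "Modern IR systems",
    "messages": [{"role": "user", "content": "Explain BM25 vs dense embeddings"}],
}])
print(digest)
```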
Current Session Messages
This is the normal sliding window of the present conversation. It contains the full history (not summarized) of all messages exchanged in this session.
I wasn’t able to get the exact token limit out of ChatGPT, but it did confirm:
- The cap is based on token count, not number of messages
- Once the limit is reached, older messages in the current session roll off (but memory facts and conversation summaries remain)
- Everything in this block is passed verbatim to the model, maintaining full conversational context
This is what allows the assistant to reason coherently within a session.
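A token-budget sliding window like this is simple to sketch. Since ChatGPT wouldn’t confirm the exact limit, the budget and the token-counting heuristic below are placeholders rather than real values.

```python
def trim_to_budget(messages: list[dict], max_tokens: int = 8000) -> list[dict]:
    """Keep the most recent messages that fit within the token budget.

    Older messages roll off first; memory facts and conversation
    summaries live outside this window and are never trimmed here.
    """
    def count_tokens(text: str) -> int:
        # Placeholder heuristic: roughly 4 characters per token.
        return max(1, len(text) // 4)

    kept: list[dict] = []
    used = 0
    for message in reversed(messages):          # walk newest to oldest
        cost = count_tokens(message["content"])
        if used + cost > max_tokens:
            break                               # everything older rolls off
        kept.append(message)
        used += cost
    return list(reversed(kept))                 # restore chronological order
```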
How It All Works Together
When you send a message to ChatGPT, here’s what happens:
- Session starts: Session metadata is injected once, giving ChatGPT context about your device, subscription, and usage patterns.
- Every message: Your stored memory facts (33 in my case) are always included, ensuring responses align with your preferences and background.
- Cross-chat awareness: The recent conversations summary provides a lightweight map of your interests without pulling in full transcripts.
- Current context: The sliding window of current session messages maintains coherence within the conversation.
- Token budget: As the session grows, older messages roll off, but your memory facts and conversation summaries remain, preserving continuity.
This layered approach means ChatGPT can feel personal and context-aware without the computational cost of searching through thousands of past messages.
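Putting the pieces together, a single request could be assembled roughly like the sketch below, mirroring the [0] to [6] structure from earlier. This is an illustration of the observed behavior, not the real pipeline: the placeholder constants and parameter names are mine, and the windowed messages would come from a trimming step like the one sketched above.

```python
SYSTEM_INSTRUCTIONS = "<high-level behavior and safety rules>"
DEVELOPER_INSTRUCTIONS = "<product-specific instructions>"


def build_context(session_metadata_block: str,
                  memory_block: str,
                  recent_digest: str,
                  windowed_messages: list[dict],
                  latest_message: str) -> list[dict]:
    """Assemble the layered context for a single request."""
    context = [
        {"role": "system", "content": SYSTEM_INSTRUCTIONS},        # [0] behavior and safety rules
        {"role": "system", "content": DEVELOPER_INSTRUCTIONS},     # [1] product instructions
        {"role": "system", "content": session_metadata_block},     # [2] built once at session start
        {"role": "system", "content": memory_block},               # [3] long-term facts, every message
        {"role": "system", "content": recent_digest},              # [4] lightweight cross-chat map
    ]
    context += windowed_messages                                   # [5] sliding window of this chat
    context.append({"role": "user", "content": latest_message})    # [6] your latest message
    return context
```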
Conclusion
ChatGPT’s memory system is a multi-layered architecture that balances personalization, performance, and token efficiency. By combining ephemeral session metadata, explicit long-term facts, lightweight conversation summaries, and a sliding window of current messages, ChatGPT achieves something remarkable: it feels personal and context-aware without the computational overhead of traditional RAG systems.
The key insight here is that not everything needs to be “memory” in the traditional sense. Session metadata adapts to your environment in real-time. Explicit facts persist across sessions. Conversation summaries provide continuity without detail. And the current session maintains coherence. Together, these dynamic components—each updated as the session progresses and your preferences evolve—create the illusion of a system that truly knows you.
For users, this means ChatGPT can feel increasingly personal over time without requiring explicit knowledge base management. For developers, it’s a lesson in pragmatic engineering: sometimes simpler, more curated approaches outperform complex retrieval systems, especially when you control the entire pipeline.
The trade-off is clear: ChatGPT sacrifices detailed historical context for speed and efficiency. But for most conversations, that’s exactly the right balance. The system remembers what matters (your preferences, goals, and recent interests) while staying fast and responsive.
This blog is based on experimentation and reverse engineering through conversation, not official documentation—so take it with a grain of salt. If you found this interesting, I’d love to hear your thoughts. Share it on Twitter, LinkedIn, or Peerlist, or reach out at guptaamanthan01[at]gmail[dot]com.