Glossary term
Memory and Retrieval
Temporary context used within a session or task.
Every LLM conversation window is an instance of short-term memory: GPT-4 Turbo's 128k-token context window can hold a session's conversation and tool outputs up to that limit, and everything in it is discarded when the session ends.
A customer-service bot at Zendesk holds short-term memory of the current ticket thread (prior messages, retrieved KB articles, and tool outputs) so that it does not repeat questions already answered in the same chat.
GitHub Copilot Chat maintains short-term memory of the files open in the editor and the last 20 messages, giving the model local code context without requiring the developer to re-paste snippets each turn.
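The behavior these examples share (keep recent context within a fixed budget, drop the oldest material first, forget everything when the session ends) can be sketched in a few lines of Python. This is a minimal illustration, not how any of the products above is implemented: the `ShortTermMemory` class, its method names, and the whitespace-based `count_tokens` helper are all hypothetical, and a real system would count tokens with the model's own tokenizer.

```python
from collections import deque
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str  # e.g. "user", "assistant", "tool"
    text: str


def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    # A real system would use the model's tokenizer instead.
    return len(text.split())


class ShortTermMemory:
    """Hypothetical session-scoped buffer bounded by a token budget."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self._messages = deque()
        self._used = 0

    def add(self, role: str, text: str) -> None:
        self._messages.append(Message(role, text))
        self._used += count_tokens(text)
        # Evict the oldest messages until the buffer fits the budget again.
        # The newest message is always kept, even if it alone exceeds the budget.
        while self._used > self.max_tokens and len(self._messages) > 1:
            oldest = self._messages.popleft()
            self._used -= count_tokens(oldest.text)

    def context(self) -> List[Message]:
        # Snapshot of what would be sent to the model on the next turn.
        return list(self._messages)

    def end_session(self) -> None:
        # Short-term memory does not outlive the session.
        self._messages.clear()
        self._used = 0
```

Evicting whole messages from the front is the simplest policy. Summarizing old turns or pinning a system prompt are common alternatives, but they are outside what this sketch covers.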