Glossary term
Memory and Retrieval
Ingestion is the process of importing external data, such as documents, PDFs, or knowledge base articles, into an AI system so that the content becomes searchable, retrievable, and usable in conversations or workflows.
Unstructured (from Unstructured.io) is a popular open-source ingestion library that handles PDF, HTML, Word, and PowerPoint files.
LlamaParse and Reducto provide LLM-based document ingestion with table and chart understanding.
Databricks Delta Live Tables and Snowflake Snowpark are used to ingest data into lakehouse architectures for AI.
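As a rough illustration of the definition, the sketch below ingests plain-text documents by splitting them into chunks and adding them to a small keyword index that can then be searched. It uses only the Python standard library; the names `Chunk`, `chunk_text`, and `TinyIndex` are hypothetical and do not come from any of the tools named in this entry, and production systems typically add file parsing and embedding-based retrieval on top of this basic shape.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # hypothetical document identifier, e.g. a file name
    text: str    # the slice of the document stored for retrieval


def chunk_text(source, text, max_words=50):
    """Split raw text into consecutive windows of at most max_words words."""
    words = text.split()
    return [
        Chunk(source, " ".join(words[i:i + max_words]))
        for i in range(0, len(words), max_words)
    ]


class TinyIndex:
    """Toy inverted index mapping each token to the chunks that contain it."""

    def __init__(self):
        self.chunks = []
        self.postings = defaultdict(set)

    def ingest(self, source, text, max_words=50):
        """Chunk a document and register every chunk's tokens in the index."""
        for chunk in chunk_text(source, text, max_words):
            chunk_id = len(self.chunks)
            self.chunks.append(chunk)
            for token in set(re.findall(r"[a-z0-9]+", chunk.text.lower())):
                self.postings[token].add(chunk_id)

    def search(self, query):
        """Return chunks ranked by how many query tokens they contain."""
        scores = defaultdict(int)
        for token in re.findall(r"[a-z0-9]+", query.lower()):
            for chunk_id in self.postings.get(token, ()):
                scores[chunk_id] += 1
        ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
        return [self.chunks[cid] for cid in ranked]
```

Calling `ingest` once per document and then `search` with a user question returns the matching chunks, which an application could pass to a model as context for a conversation or workflow.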