RAG in n8n#

What is RAG?#

Retrieval-Augmented Generation (RAG) is a technique that improves AI responses by combining language models with external data sources. Instead of relying solely on the model's internal training data, RAG systems retrieve relevant documents to ground responses in up-to-date, domain-specific, or proprietary knowledge. RAG workflows typically rely on vector stores to manage and search this external data efficiently.

What is a vector store?#

A vector store is a specialized database designed to store and search high-dimensional vectors: numerical representations of text, images, or other data. When you upload a document, the vector store splits it into chunks and converts each chunk into a vector using an embedding model.

You can query these vectors with similarity searches, which return results based on semantic meaning rather than keyword matches. This makes vector stores a powerful foundation for RAG and other AI systems that need to retrieve and reason over large sets of knowledge.
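To make the idea concrete, here is a minimal sketch of a similarity search in plain Python. The toy three-dimensional "embeddings" and the brute-force scan are assumptions for illustration only; real vector stores index vectors with hundreds of dimensions and search them far more efficiently.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_search(query_vector, store, top_k=2):
    """Return the top_k chunks whose embeddings are closest to the query.

    `store` is a plain list of (chunk_text, embedding) pairs -- a stand-in
    for what a real vector store indexes far more efficiently.
    """
    scored = [(cosine_similarity(query_vector, emb), text) for text, emb in store]
    scored.sort(reverse=True)
    return [text for _, text in scored[:top_k]]

# Toy 3-dimensional "embeddings"; real models produce hundreds of dimensions.
store = [
    ("Invoices are due within 30 days.", [0.9, 0.1, 0.0]),
    ("Our office cat is named Miso.",    [0.0, 0.2, 0.9]),
    ("Payment terms: net 30.",           [0.8, 0.3, 0.1]),
]
results = similarity_search([0.85, 0.2, 0.05], store)
```

Note that both invoice-related chunks outrank the unrelated one even though the query shares no exact keywords with them: that is the semantic matching described above.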

How to use RAG in n8n#

Start with a RAG template

👉 Try out RAG in n8n with the RAG Starter Template. The template includes two ready-made workflows: one for uploading files and one for querying them.

Inserting data into your vector store#

Before your agent can access custom knowledge, you need to upload that data to a vector store:

  1. Add the nodes needed to fetch your source data.

  2. Insert a Vector Store node (for example, the Simple Vector Store) and choose the Insert Documents operation.

  3. Select an embedding model, which converts your text into vector embeddings. Consult the FAQ for more information on choosing the right embedding model.

  4. Add a Default Data Loader node, which splits your content into chunks. You can use the default settings or define your own chunking strategy:

  • Character Text Splitter: splits by character length.

  • Recursive Character Text Splitter: recursively splits by Markdown, HTML, code blocks, or simple characters (recommended for most use cases).

  • Token Text Splitter: splits by token count.

  5. (Optional) Add metadata to each chunk to enrich the context and allow better filtering later.
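The splitting step above can be sketched in a few lines. This is a simplified, illustrative character splitter with overlap (the function name and parameters are assumptions); the Default Data Loader and the splitters listed above do this for you, with smarter boundary handling.

```python
def split_text(text, chunk_size=100, chunk_overlap=20):
    """Split text into fixed-size character chunks with overlap.

    A bare-bones version of a Character Text Splitter: each chunk starts
    (chunk_size - chunk_overlap) characters after the previous one, so
    adjacent chunks share chunk_overlap characters of context.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghij" * 30, chunk_size=100, chunk_overlap=20)
# 300 characters with step 80 -> chunks start at 0, 80, 160, 240.
```

The overlap is what keeps a sentence that straddles a chunk boundary retrievable from either side; see the FAQ below for guidance on sizing it.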

Querying your data#

You can query the data in two main ways: using an agent or directly through a node.

Using agents#

  1. Add an agent to your workflow.

  2. Add the vector store as a tool and give it a description to help the agent understand when to use it:

  • Set the limit to define how many chunks to return.

  • Enable Include Metadata to provide extra context for each chunk.

  3. Add the same embedding model you used when inserting the data.

Pro tip

To save tokens on an expensive model, you can first use the Vector Store Question Answer Tool to retrieve relevant data, and only then pass the result to the agent. To see this in action, check out this template.
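A sketch of why this saves tokens: the expensive model only ever receives the few retrieved chunks, not the whole knowledge base. The function name and prompt wording below are illustrative assumptions, not n8n internals.

```python
def build_grounded_prompt(question, retrieved_chunks):
    """Assemble a prompt from chunks retrieved in an earlier, cheaper step.

    Because retrieval already narrowed the context down to a handful of
    relevant chunks, the expensive model's prompt stays small regardless
    of how large the underlying knowledge base is.
    """
    context = "\n---\n".join(retrieved_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt(
    "When are invoices due?",
    ["Invoices are due within 30 days.", "Payment terms: net 30."],
)
```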

Using the node directly#

  1. Add your vector store node to the canvas and choose the Get Many operation.

  2. Enter a query or prompt:

  • Set a limit for how many chunks to return.

  • Enable Include Metadata if needed.
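Conceptually, Get Many embeds your query and returns the closest chunks, optionally with their metadata. Here is a minimal stand-in for that behavior; the names and data layout are assumptions for illustration.

```python
def get_many(indexed_chunks, query_embedding, limit=4, include_metadata=True):
    """Minimal stand-in for a vector store's 'Get Many' operation.

    `indexed_chunks` holds (embedding, chunk_text, metadata) triples.
    Scoring uses a plain dot product for brevity (assuming normalized
    embeddings); a real store uses an approximate nearest-neighbor index.
    """
    scored = sorted(
        indexed_chunks,
        key=lambda item: sum(q * v for q, v in zip(query_embedding, item[0])),
        reverse=True,
    )
    top = scored[:limit]
    if include_metadata:
        return [{"text": chunk, "metadata": meta} for _, chunk, meta in top]
    return [{"text": chunk} for _, chunk, _ in top]

indexed_chunks = [
    ([1.0, 0.0], "Invoices are due within 30 days.", {"source": "billing.md"}),
    ([0.0, 1.0], "Our office cat is named Miso.", {"source": "fun.md"}),
]
results = get_many(indexed_chunks, [0.9, 0.1], limit=1)
```

The `limit` and `include_metadata` parameters mirror the two options described in the steps above.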

FAQs#

How do I choose the right embedding model?#

The right embedding model differs from case to case.

In general, smaller models (for example, text-embedding-ada-002) are faster and cheaper, making them ideal for short, general-purpose documents or lightweight RAG workflows. Larger models (for example, text-embedding-3-large) offer better semantic understanding; these are best for long documents, complex topics, or when accuracy is critical.

What is the best text splitting strategy for my use case?#

This again depends a lot on your data:

  • Small chunks (for example, 200 to 500 tokens) are good for fine-grained retrieval.

  • Large chunks carry more context but can become diluted or noisy.

Choosing the right overlap size is important so the AI can understand the context of each chunk. That's also why splitting by Markdown or code blocks can often produce better chunks.

Another good approach is to add more context to each chunk (for example, about the document it came from). If you want to read more about this, check out this great article from Anthropic.
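As a sketch of that idea, you could prepend document-level context to each chunk before embedding it. The function and field names here are illustrative assumptions, not part of n8n:

```python
def contextualize_chunk(chunk, doc_title, section):
    """Prepend document-level context to a chunk before embedding it.

    The extra header helps the embedding (and later the model) know where
    the chunk came from -- a lightweight take on the contextual-retrieval
    idea discussed in Anthropic's article.
    """
    return f"Document: {doc_title}\nSection: {section}\n\n{chunk}"

enriched = contextualize_chunk(
    "Invoices are due within 30 days.",
    doc_title="Billing policy",
    section="Payment terms",
)
```

In n8n, the same effect can be achieved by adding metadata or a prefix to each chunk during the insert step described earlier.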