What are vector databases?

Vector databases store information as numbers:

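A vector database is a type of database that stores data as high-dimensional vectors, which are mathematical representations of features or attributes. (source)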

This enables fast and accurate similarity searches. With a vector database, instead of using conventional database queries, you can search for relevant data based on semantic and contextual meaning.

A simplified example

A vector database could store the sentence "n8n is a source-available automation tool that you can self-host", but instead of storing it as text, the vector database stores an array of dimensions (numbers between 0 and 1) that represent its features. This doesn't mean turning each letter in the sentence into a number. Instead, the vector describes the sentence as a whole.

Suppose that in a vector store 0.1 represents "automation tool", 0.2 represents "source available", and 0.3 represents "can be self-hosted". You could end up with the following vectors:

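Sentence	Vector (array of dimensions)
n8n is a source-available automation tool that you can self-host	[0.1, 0.2, 0.3]
Zapier is an automation tool	[0.1]
Make is an automation tool	[0.1]
Confluence is a wiki tool that you can self-host	[0.3]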

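This example is very simplified

In practice, vectors are far more complex. A vector can have anywhere from tens to thousands of dimensions. The dimensions don't map one-to-one to individual features, so you can't translate a single dimension directly into a single concept. This example gives an approximate mental model, not a true technical understanding.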
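To make the mental model concrete, here is a minimal sketch of a similarity search over the toy vectors from the table above. It is illustrative only. It assumes each array of feature codes is expanded into a fixed-length vector with one 0/1 slot per feature, so that cosine similarity can compare them. The names `FEATURES`, `STORE`, `to_fixed_length`, and `search` are hypothetical and aren't part of n8n or any vector database.

```python
import math

# Feature codes from the simplified table (illustrative only):
# 0.1 = automation tool, 0.2 = source available, 0.3 = can be self-hosted
FEATURES = [0.1, 0.2, 0.3]

# Toy "vector store": sentence -> array of feature codes, as in the table.
STORE = {
    "n8n is a source-available automation tool that you can self-host": [0.1, 0.2, 0.3],
    "Zapier is an automation tool": [0.1],
    "Make is an automation tool": [0.1],
    "Confluence is a wiki tool that you can self-host": [0.3],
}


def to_fixed_length(codes):
    """Assumption: expand feature codes into one 0/1 slot per known feature."""
    return [1.0 if feature in codes else 0.0 for feature in FEATURES]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_codes, top_k=2):
    """Return the top_k (score, sentence) pairs most similar to the query."""
    query = to_fixed_length(query_codes)
    scored = [
        (cosine_similarity(query, to_fixed_length(codes)), sentence)
        for sentence, codes in STORE.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Look for things that "can be self-hosted" (feature code 0.3).
for score, sentence in search([0.3]):
    print(f"{score:.2f}  {sentence}")
```

The query matches the Confluence sentence most closely, followed by the n8n sentence, even though the query shares no exact text with either. Real embeddings behave similarly, but with dense vectors whose dimensions have no individual meaning.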

Demonstrating the power of similarity search

Qdrant provides vector search demos to help users understand the power of vector databases. The food discovery demo shows how a vector store can help match pictures based on visual similarities.

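This demo uses data from a delivery service. Users can like or dislike the photo of a dish, and the app recommends more similar meals based on how they look. It's also possible to view only results from restaurants within the delivery radius. (source)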

For full technical details, refer to the Qdrant demo-food-discovery GitHub repository.
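As a rough illustration of the like/dislike idea, the sketch below ranks dishes by how close their vectors are to liked dishes and how far they are from disliked ones. This is not Qdrant's implementation. The dish names, the three-dimensional vectors in `DISHES`, and the scoring rule in `recommend` are all hypothetical, chosen only to show the principle.

```python
import math

# Hypothetical image embeddings. Real ones come from an image model
# and have hundreds of dimensions.
DISHES = {
    "margherita pizza": [0.9, 0.1, 0.0],
    "pepperoni pizza": [0.8, 0.2, 0.1],
    "green salad": [0.1, 0.9, 0.2],
    "chocolate cake": [0.0, 0.2, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mean_similarity(vector, names):
    """Average similarity of `vector` to the named dishes (0.0 if there are none)."""
    if not names:
        return 0.0
    return sum(cosine(vector, DISHES[name]) for name in names) / len(names)


def recommend(liked, disliked, top_k=1):
    """Assumed scoring rule: similarity to likes minus similarity to dislikes."""
    candidates = [name for name in DISHES if name not in liked and name not in disliked]

    def score(name):
        vector = DISHES[name]
        return mean_similarity(vector, liked) - mean_similarity(vector, disliked)

    return sorted(candidates, key=score, reverse=True)[:top_k]


print(recommend(liked=["margherita pizza"], disliked=["chocolate cake"]))
```

With these made-up vectors, liking the margherita pizza and disliking the chocolate cake ranks the pepperoni pizza ahead of the green salad.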

Embeddings, retrievers, text splitters, and document loaders

Vector databases require other tools to function:

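Document loaders and text splitters: document loaders pull in documents and data, and prepare them for embedding. Document loaders can use text splitters to break documents into chunks.
Embeddings: these are the tools that turn the data (text, images, and so on) into vectors, and back into the raw data. Note that n8n only supports text embeddings.
Retrievers: retrievers fetch documents from vector databases. You need to pair them with an embedding to translate the vectors back into data.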
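The way these pieces fit together can be sketched in a few lines of plain Python. This is a toy pipeline under stated assumptions, not how n8n or any particular vector database implements it. `split_text`, `embed`, and `ToyVectorStore` are hypothetical names, the "embedding" is a simple word count rather than a real model, and the store keeps each original chunk next to its vector so the retriever can return readable text.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=12):
    """Toy text splitter: fixed-size chunks of `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy embedding: lowercase word counts. A real model returns a dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self):
        self.rows = []  # list of (vector, original chunk) pairs

    def add_documents(self, documents):
        """Document-loader step: split each document, embed each chunk, store both."""
        for document in documents:
            for chunk in split_text(document):
                self.rows.append((embed(chunk), chunk))

    def retrieve(self, query, top_k=1):
        """Retriever step: embed the query, return the closest stored chunks."""
        query_vector = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(query_vector, row[0]), reverse=True)
        return [chunk for _, chunk in ranked[:top_k]]


store = ToyVectorStore()
store.add_documents([
    "n8n is a source-available automation tool that you can self-host",
    "Confluence is a wiki tool that you can self-host",
])
print(store.retrieve("wiki tool"))
```

Note that the same `embed` function is used when storing chunks and when querying. That is why a retriever has to be paired with the embedding that produced the stored vectors.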