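Feature Overview
Vector search is an advanced search engine feature that lets users run similarity searches in a high-dimensional vector space instead of relying only on traditional keyword matching.
Vector search works by converting text, images, or other data into vectors (multi-dimensional numeric representations) and then finding similar items by the distance between those vectors. Compared with keyword-based retrieval, it is better suited to complex data and to scenarios such as natural language processing, recommender systems, and computer vision.
In a search engine, vector search is implemented by integrating an efficient vector index structure, such as one built for approximate nearest neighbor (ANN) search. This lets the engine keep query latency low on large datasets while maintaining good accuracy. With vector search, users can quickly find the results most similar to a query vector in massive datasets, which improves the search experience and makes applications smarter.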
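To make the idea of "similar means close in vector space" concrete, here is a minimal sketch of exact nearest-neighbor search by linear scan. It is the baseline that an approximate nearest neighbor index tries to match at lower cost, not how the search engine is implemented. The item ids, the sample vectors, and the function names `euclidean` and `nearest` are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, items, k):
    """Return the ids of the k items closest to the query, nearest first."""
    ranked = sorted(items, key=lambda item_id: euclidean(query, items[item_id]))
    return ranked[:k]


# Hypothetical 3-dimensional embeddings keyed by item id.
items = {
    "a": [0.5, 0.8, 0.3],
    "b": [0.2, 0.4, 0.7],
    "c": [0.9, 0.1, 0.6],
}

top2 = nearest([0.5, 0.8, 0.3], items, k=2)
print(top2)
```

A linear scan compares the query against every stored vector, so its cost grows with the size of the dataset. That growth is the reason large deployments rely on an index structure instead.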
Usage Example
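The following example shows vector search in Elasticsearch. The requests use the `knn_vector` field type and the `knn` query, which come from the k-NN plugin rather than stock Elasticsearch, so they assume that plugin is available on the cluster.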
Create the index:
PUT my-knn-index-1
{
  "settings": {
    "index": {
      "knn": true,
      "knn.algo_param.ef_search": 100
    }
  },
  "mappings": {
    "properties": {
      "category": {
        "type": "keyword"
      },
      "brand": {
        "type": "keyword"
      },
      "style": {
        "type": "keyword"
      },
      "my_vector": {
        "type": "knn_vector",
        "dimension": 3
      }
    }
  }
}
Insert documents:
PUT my-knn-index-1/_doc/1
{
  "category": "electronics",
  "brand": "brandA",
  "style": "modern",
  "my_vector": [0.5, 0.8, 0.3]
}
PUT my-knn-index-1/_doc/2
{
  "category": "furniture",
  "brand": "brandB",
  "style": "vintage",
  "my_vector": [0.2, 0.4, 0.7]
}
PUT my-knn-index-1/_doc/3
{
  "category": "clothing",
  "brand": "brandC",
  "style": "casual",
  "my_vector": [0.9, 0.1, 0.6]
}
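When there are many documents, sending one request per document becomes slow. The sketch below shows one way to pack the same three documents into a newline-delimited body for the `_bulk` endpoint, with a check that each vector matches the dimension declared in the mapping. The helper name `build_bulk_body` and its `dimension` parameter are assumptions made for illustration. The block only builds the request body and does not send it.

```python
import json


def build_bulk_body(index, docs, dimension):
    """Build an NDJSON body for POST _bulk from a dict of id -> source."""
    lines = []
    for doc_id, source in docs.items():
        # The mapping fixes the vector length, so reject mismatches early.
        if len(source["my_vector"]) != dimension:
            raise ValueError(f"document {doc_id} has the wrong vector length")
        # Each document takes two lines: an action line, then the source.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


docs = {
    "1": {"category": "electronics", "brand": "brandA",
          "style": "modern", "my_vector": [0.5, 0.8, 0.3]},
    "2": {"category": "furniture", "brand": "brandB",
          "style": "vintage", "my_vector": [0.2, 0.4, 0.7]},
    "3": {"category": "clothing", "brand": "brandC",
          "style": "casual", "my_vector": [0.9, 0.1, 0.6]},
}

body = build_bulk_body("my-knn-index-1", docs, dimension=3)
print(body)
```

The resulting string can be sent as the body of `POST _bulk` with the `Content-Type: application/x-ndjson` header.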
Run a k-NN query:
POST my-knn-index-1/_search
{
  "size": 10,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [0.5, 0.8, 0.3],
        "k": 2
      }
    }
  }
}
Response:
{
  "took": 654,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "my-knn-index-1",
        "_type": "_doc",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "category": "electronics",
          "brand": "brandA",
          "style": "modern",
          "my_vector": [
            0.5,
            0.8,
            0.3
          ]
        }
      },
      {
        "_index": "my-knn-index-1",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.7092199,
        "_source": {
          "category": "furniture",
          "brand": "brandB",
          "style": "vintage",
          "my_vector": [
            0.2,
            0.4,
            0.7
          ]
        }
      },
      {
        "_index": "my-knn-index-1",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.57471263,
        "_source": {
          "category": "clothing",
          "brand": "brandC",
          "style": "casual",
          "my_vector": [
            0.9,
            0.1,
            0.6
          ]
        }
      }
    ]
  }
}
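The `_score` values in this response are consistent with the formula 1 / (1 + d²), where d is the Euclidean distance between the query vector and the stored vector. The sketch below recomputes the scores under that assumption so they can be compared with the response. The function name `l2_score` is hypothetical, and the formula is inferred from the numbers shown here rather than stated by the source.

```python
def l2_score(query, vector):
    """Score assumed to be 1 / (1 + squared Euclidean distance)."""
    squared = sum((q - v) ** 2 for q, v in zip(query, vector))
    return 1.0 / (1.0 + squared)


query = [0.5, 0.8, 0.3]

# Vectors of the three indexed documents, keyed by _id.
stored = {
    "1": [0.5, 0.8, 0.3],
    "2": [0.2, 0.4, 0.7],
    "3": [0.9, 0.1, 0.6],
}

scores = {doc_id: l2_score(query, vec) for doc_id, vec in stored.items()}

# Print documents from highest to lowest score, as the search response does.
for doc_id, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(doc_id, score)
```

Document 1 is identical to the query vector, so its distance is zero and its score is the maximum. The response also lists three hits although `k` is 2. A likely explanation is that the plugin applies `k` per index segment while `size` caps the total number of hits returned, so a small index spread over several segments can return more than `k` results.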