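Conversational search with RAG
Conversational search lets you ask questions in natural language and refine the answers by asking follow-up questions. The conversation thus becomes a dialog between you and a large language model (LLM). For this to work, the model must remember the context of the entire conversation rather than answering each question in isolation.
Conversational search is implemented using the following components:
Conversation history: Allows an LLM to remember the context of the current conversation and understand follow-up questions.
Retrieval-augmented generation (RAG): Allows an LLM to supplement its static knowledge base with proprietary or current information.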
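Conversation history
Conversation history consists of a simple CRUD-like API comprising two resources: memories and messages. All messages for the current conversation are stored in a single conversation memory. A message represents one question-and-answer pair: a human-input question and an AI answer. Messages cannot exist on their own; they must be added to a memory.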
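RAG
RAG retrieves data from the index and from the conversation history and sends all of that information to the LLM as context. The LLM then supplements its static knowledge base with the dynamically retrieved data. In UDB-SX, RAG is implemented through a search pipeline containing a retrieval-augmented generation processor. The processor intercepts UDB-SX query results, retrieves the previous messages in the conversation from the conversation memory, and sends a prompt to the LLM. After the processor receives a response from the LLM, it saves the response in the conversation memory and returns both the original UDB-SX query results and the LLM response.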
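The flow described above can be sketched in a few lines of Python. This is an illustrative model only, not the processor's actual implementation: the function name rag_response, the in-memory list standing in for a conversation memory, the prompt layout, and the call_llm callable are all hypothetical.

```python
from typing import Callable, Dict, List


def rag_response(
    hits: List[Dict],
    memory: List[Dict],
    question: str,
    call_llm: Callable[[str], str],
    context_field: str = "text",
    message_size: int = 10,
) -> Dict:
    """Toy model of the RAG flow: hits + prior messages -> prompt -> answer."""
    # 1. Take the context field from each intercepted search hit.
    contexts = [hit["_source"][context_field] for hit in hits]
    # 2. Retrieve the most recent messages from the conversation memory.
    history = memory[-message_size:] if message_size > 0 else []
    # 3. Build one prompt from the history, the retrieved context, and the question.
    lines = [f"Q: {m['input']}\nA: {m['response']}" for m in history]
    lines += [f"Context: {c}" for c in contexts]
    lines.append(f"Question: {question}")
    answer = call_llm("\n".join(lines))
    # 4. Save the new question-and-answer pair in the memory.
    memory.append({"input": question, "response": answer})
    # 5. Return the original hits together with the LLM answer.
    return {"hits": hits, "answer": answer}
```

Because the memory is updated on every call, a second call with a follow-up question sees the first question and answer in its prompt, which is what lets the model resolve references such as "it".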
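As of UDB-SX 2.11, the RAG technique has been tested only with OpenAI models and the Anthropic Claude model on Amazon Bedrock.
When the Security plugin is enabled, all memories exist in private security mode. Only the user who created a memory can interact with that memory. No user can see another user's memories.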
Prerequisites
To begin using conversational search, enable the conversation memory and RAG pipeline features:
PUT /_cluster/settings
{
"persistent": {
"plugins.ml_commons.memory_feature_enabled": true,
"plugins.ml_commons.rag_pipeline_feature_enabled": true
}
}
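If you send this request from a script rather than a console, the request can be assembled with the Python standard library. The sketch below only builds the request object and does not send it. The helper name build_enable_request and the base URL passed to it are assumptions, and authentication is omitted.

```python
import json
import urllib.request


def build_enable_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that enables both features."""
    body = {
        "persistent": {
            "plugins.ml_commons.memory_feature_enabled": True,
            "plugins.ml_commons.rag_pipeline_feature_enabled": True,
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the returned object to urllib.request.urlopen would send it; whether that succeeds depends on your cluster address, TLS setup, and credentials.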
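Configuring conversational search
There are two ways to configure conversational search: an automated workflow or manual setup.
Automated workflow
UDB-SX provides a workflow template that automatically creates a connector for the LLM, registers and deploys the LLM, and configures a search pipeline. You must provide the API key for the configured LLM when you create the workflow. Review the default configuration of the conversational search workflow template (contact pre-sales staff for configuration details) to determine whether you need to update any parameters. For example, if the model endpoint differs from the default (https://api.cohere.ai/v1/chat), specify the model endpoint in the create_connector.actions.url parameter. To create the default conversational search workflow, send the following request: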
POST /_plugins/_flow_framework/workflow?use_case=conversational_search_with_llm_deploy&provision=true
{
"create_connector.credential.key": "<YOUR_API_KEY>"
}
UDB-SX responds with the ID of the created workflow:
{
"workflow_id" : "U_nMXJUBq_4FYQzMOS4B"
}
To check the workflow status, send the following request:
GET /_plugins/_flow_framework/workflow/U_nMXJUBq_4FYQzMOS4B/_status
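Once the workflow completes, the state changes to COMPLETED. The workflow creates the following components:
Model connector: Connects to the specified model.
Registered and deployed model: The model is ready for use.
Search pipeline: Configured to handle conversational queries.
You can now continue with steps 4, 5, and 6: ingest RAG data into an index, create a conversation memory, and use the pipeline for RAG.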
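Provisioning is not instantaneous, so a script usually needs to poll the status endpoint until the state reaches COMPLETED. A minimal polling sketch follows. The function name wait_for_state is hypothetical, and fetch_status is a caller-supplied function that performs the GET request and returns the parsed JSON, so the sketch makes no assumption about your HTTP client.

```python
import time
from typing import Callable, Dict


def wait_for_state(
    fetch_status: Callable[[], Dict],
    target: str = "COMPLETED",
    attempts: int = 10,
    delay_seconds: float = 1.0,
) -> Dict:
    """Poll fetch_status until its "state" field equals target."""
    for attempt in range(attempts):
        status = fetch_status()
        if status.get("state") == target:
            return status
        # Wait before the next poll, but not after the final attempt.
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    raise TimeoutError(f"state did not reach {target} after {attempts} attempts")
```

The same helper can be reused for any response that reports progress in a state field, such as the model registration task in the manual setup.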
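Manual setup
To configure conversational search manually, follow these steps:
1. Create a connector to a model.
2. Register and deploy the model.
3. Create a search pipeline.
4. Ingest RAG data into an index.
5. Create a conversation memory.
6. Use the pipeline for RAG.
Step 1: Create a connector to a model
RAG requires an LLM in order to function. To connect to an LLM, create a connector. The following request creates a connector for the OpenAI GPT 3.5 model: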
POST /_plugins/_ml/connectors/_create
{
"name": "OpenAI Chat Connector",
"description": "The connector to public OpenAI model service for GPT 3.5",
"version": 2,
"protocol": "http",
"parameters": {
"endpoint": "api.openai.com",
"model": "gpt-3.5-turbo",
"temperature": 0
},
"credential": {
"openAI_key": "<YOUR_OPENAI_KEY>"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://${parameters.endpoint}/v1/chat/completions",
"headers": {
"Authorization": "Bearer ${credential.openAI_key}"
},
"request_body": """{ "model": "${parameters.model}", "messages": ${parameters.messages}, "temperature": ${parameters.temperature} }"""
}
]
}
UDB-SX responds with a connector ID:
{
"connector_id": "u3DEbI0BfUsSoeNTti-1"
}
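The connector definition uses ${parameters.*} and ${credential.*} placeholders in the url, headers, and request_body fields. The following sketch illustrates the idea of placeholder substitution only; it is not how UDB-SX resolves connector templates internally, and the helper name render_template is hypothetical.

```python
import re
from typing import Dict

# Matches ${parameters.name} and ${credential.name} placeholders.
PLACEHOLDER = re.compile(r"\$\{(parameters|credential)\.(\w+)\}")


def render_template(
    template: str, parameters: Dict[str, object], credential: Dict[str, str]
) -> str:
    """Replace each placeholder with the matching value from the two maps."""
    sources = {"parameters": parameters, "credential": credential}

    def substitute(match: "re.Match") -> str:
        section, key = match.group(1), match.group(2)
        # A missing key raises KeyError, which surfaces template typos early.
        return str(sources[section][key])

    return PLACEHOLDER.sub(substitute, template)
```

With the values from the connector above, the url template https://${parameters.endpoint}/v1/chat/completions resolves to https://api.openai.com/v1/chat/completions.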
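For example requests that connect to other services and models, see the connector blueprints.
Step 2: Register and deploy the model
Register the LLM for which you created a connector in the previous step. To register the model with UDB-SX, provide the connector_id returned in the previous step: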
POST /_plugins/_ml/models/_register
{
"name": "openAI-gpt-3.5-turbo",
"function_name": "remote",
"description": "test model",
"connector_id": "u3DEbI0BfUsSoeNTti-1"
}
UDB-SX returns a task ID for the registration task and a model ID for the registered model:
{
"task_id": "gXDIbI0BfUsSoeNT_jAb",
"status": "CREATED",
"model_id": "gnDIbI0BfUsSoeNT_jAw"
}
To verify that the registration is complete, call the Tasks API:
GET /_plugins/_ml/tasks/gXDIbI0BfUsSoeNT_jAb
When the registration is complete, the state in the response is COMPLETED:
{
"model_id": "gnDIbI0BfUsSoeNT_jAw",
"task_type": "REGISTER_MODEL",
"function_name": "REMOTE",
"state": "COMPLETED",
"worker_node": [
"kYv-Z5-mQ4uCUy_cRC6LXA"
],
"create_time": 1706927128091,
"last_update_time": 1706927128125,
"is_async": false
}
To deploy the model, provide the model_id to the Deploy API:
POST /_plugins/_ml/models/gnDIbI0BfUsSoeNT_jAw/_deploy
UDB-SX confirms that the model is deployed:
{
"task_id": "cnDObI0BfUsSoeNTDzGd",
"task_type": "DEPLOY_MODEL",
"status": "COMPLETED"
}
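Step 2 chains three calls: the register response supplies a task_id, the task status supplies the model_id once it is COMPLETED, and the model_id goes into the deploy path. A small sketch of that chaining, using only the endpoint paths shown above (the helper names task_path and deploy_path are hypothetical):

```python
from typing import Dict


def task_path(register_response: Dict) -> str:
    """Path of the Tasks API call that reports registration progress."""
    return f"/_plugins/_ml/tasks/{register_response['task_id']}"


def deploy_path(task_status: Dict) -> str:
    """Path of the Deploy API call, valid only after registration completes."""
    if task_status.get("state") != "COMPLETED":
        raise ValueError(f"registration not complete: {task_status.get('state')}")
    return f"/_plugins/_ml/models/{task_status['model_id']}/_deploy"
```

Refusing to build the deploy path before the task completes keeps a script from deploying a model that has not finished registering.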
Step 3: Create a search pipeline
Next, create a search pipeline with a retrieval_augmented_generation processor:
PUT /_search/pipeline/rag_pipeline
{
"response_processors": [
{
"retrieval_augmented_generation": {
"tag": "openai_pipeline_demo",
"description": "Demo pipeline Using OpenAI Connector",
"model_id": "gnDIbI0BfUsSoeNT_jAw",
"context_field_list": ["text"],
"system_prompt": "You are a helpful assistant",
"user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
}
}
]
}
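When the model ID comes from an earlier step in a script, it is convenient to generate the pipeline body instead of editing it by hand. The sketch below reproduces the structure of the request above. The function name build_rag_pipeline is hypothetical, and treating context_field_list as mandatory is an assumption made by this sketch.

```python
from typing import Dict, List


def build_rag_pipeline(
    model_id: str,
    context_fields: List[str],
    system_prompt: str = "You are a helpful assistant",
    user_instructions: str = "",
) -> Dict:
    """Build the body for PUT /_search/pipeline/<name>."""
    if not context_fields:
        # Assumption: the processor needs at least one context field.
        raise ValueError("context_fields must name at least one document field")
    processor = {
        "model_id": model_id,
        "context_field_list": list(context_fields),
        "system_prompt": system_prompt,
    }
    if user_instructions:
        processor["user_instructions"] = user_instructions
    return {"response_processors": [{"retrieval_augmented_generation": processor}]}
```

The tag and description fields from the example request are omitted here for brevity; they can be added to the processor dictionary in the same way.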
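For information about the processor fields, see Retrieval-augmented generation processor.
Step 4: Ingest RAG data into an index
RAG augments the LLM's knowledge with supplementary data.
First, create an index in which to store this data and set the default search pipeline to the pipeline created in the previous step: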
PUT /my_rag_test_data
{
"settings": {
"index.search.default_pipeline" : "rag_pipeline"
},
"mappings": {
"properties": {
"text": {
"type": "text"
}
}
}
}
Next, ingest the supplementary data into the index:
POST _bulk
{"index": {"_index": "my_rag_test_data", "_id": "1"}}
{"text": "Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."}
{"index": {"_index": "my_rag_test_data", "_id": "2"}}
{"text": "Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."}
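The bulk request body is newline-delimited JSON: one action line followed by one document line, with a trailing newline at the end. A minimal sketch of generating that body from a list of passages follows. The function name build_bulk_body and the sequential ID scheme are assumptions chosen to mirror the example above.

```python
import json
from typing import List


def build_bulk_body(index: str, documents: List[str]) -> str:
    """Build an NDJSON bulk body with sequential string IDs starting at 1."""
    lines = []
    for doc_id, text in enumerate(documents, start=1):
        # Action line: which index and ID the next line is written to.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        # Document line: json.dumps escapes embedded newlines, so each
        # document stays on a single physical line as NDJSON requires.
        lines.append(json.dumps({"text": text}))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```

Serializing with json.dumps rather than string concatenation avoids broken requests when a passage contains quotes or line breaks.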
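RAG pipeline
RAG is a technique that retrieves documents from an index, passes them through a seq2seq model such as an LLM, and then supplements the static LLM information with the dynamically retrieved data.
As of UDB-SX 25.0.0.0, the RAG technique has been tested only with OpenAI models, the Anthropic Claude model on Amazon Bedrock, and Cohere Command models.
Configuring a Cohere Command model to enable RAG requires a post-processing function that transforms the model output. For more information, see the Cohere RAG tutorial (available from pre-sales staff on request).
Step 5: Create a conversation memory
Create a conversation memory to store all messages from the conversation. To make the memory easy to identify, provide a name for it in the optional name field, as shown in the following example. Because the name parameter cannot be updated, this is your only opportunity to name the conversation.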
POST /_plugins/_ml/memory/
{
"name": "Conversation about NYC population"
}
UDB-SX responds with a memory ID for the newly created memory:
{
"memory_id": "znCqcI0BfUsSoeNTntd7"
}
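You'll use this memory_id to add messages to the memory.
Step 6: Use the pipeline for RAG
To use the RAG pipeline, send a query to UDB-SX and provide additional parameters in the ext.generative_qa_parameters object.
The generative_qa_parameters object supports the following parameters: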
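| Parameter | Required | Description |
|---|---|---|
| llm_question | Yes | The question that the LLM must answer. |
| llm_model | No | Overrides the original model set in the connection when you want to use a different model (for example, GPT 4 instead of GPT 3.5). This option is required if a default model was not set during pipeline creation. |
| memory_id | No | If you provide a memory_id, the pipeline retrieves the 10 most recent messages in the specified memory and adds them to the LLM prompt. If you don't specify a memory_id, no prior context is added to the LLM prompt. |
| context_size | No | The number of search results sent to the LLM. This is typically needed to stay within the token size limit, which varies by model. Alternatively, you can use the size parameter in the Search API to control the number of search results sent to the LLM. |
| message_size | No | The number of messages sent to the LLM. Like the number of search results, this affects the total number of tokens received by the LLM. When not set, the pipeline uses the default message size of 10. |
| timeout | No | The number of seconds that the pipeline waits for the remote model using a connector to respond. Default is 30. |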
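If your LLM has a token limit, set the size field in your UDB-SX query to limit the number of documents used in the search response. Otherwise, the RAG pipeline sends every document in the search results to the LLM.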
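The parameters above can be assembled programmatically so that optional values are sent only when set. The sketch below builds a search body with a match query and a generative_qa_parameters object. The function name build_rag_query and its keyword arguments are hypothetical; the JSON keys are the ones listed in the table.

```python
from typing import Dict, Optional


def build_rag_query(
    question: str,
    memory_id: Optional[str] = None,
    llm_model: Optional[str] = None,
    context_size: Optional[int] = None,
    message_size: Optional[int] = None,
    timeout: Optional[int] = None,
    field: str = "text",
    size: Optional[int] = None,
) -> Dict:
    """Build a search body that carries the RAG parameters in ext."""
    if not question:
        raise ValueError("llm_question is required")
    params: Dict[str, object] = {"llm_question": question}
    optional = {
        "memory_id": memory_id,
        "llm_model": llm_model,
        "context_size": context_size,
        "message_size": message_size,
        "timeout": timeout,
    }
    # Send only the optional parameters the caller actually set.
    params.update({k: v for k, v in optional.items() if v is not None})
    body: Dict[str, object] = {
        "query": {"match": {field: question}},
        "ext": {"generative_qa_parameters": params},
    }
    if size is not None:
        # Caps the number of documents returned and therefore sent to the LLM.
        body["size"] = size
    return body
```

This sketch reuses the question text for the match query, which is a simplification; in practice the retrieval query and the LLM question can differ.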
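An LLM cannot answer a question about recent events on its own because it was trained on data from earlier years. However, if you add current information as context, the LLM is able to generate a response. For example, you can ask the LLM about the population of the New York City metro area in 2023. To do so, construct a query that includes a UDB-SX match query and an LLM query. Provide the memory_id so that the message is stored in the appropriate memory object: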
GET /my_rag_test_data/_search
{
"query": {
"match": {
"text": "What's the population of NYC metro area in 2023"
}
},
"ext": {
"generative_qa_parameters": {
"llm_model": "gpt-3.5-turbo",
"llm_question": "What's the population of NYC metro area in 2023",
"memory_id": "znCqcI0BfUsSoeNTntd7",
"context_size": 5,
"message_size": 5,
"timeout": 15
}
}
}
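Because the context includes a document about the population of New York City, the LLM is able to answer the question correctly (though it adds the word "projected" because it was trained on data from previous years). The response contains the matching documents from the supplementary RAG data and the LLM response: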
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 5.781642,
"hits": [
{
"_index": "my_rag_test_data",
"_id": "2",
"_score": 5.781642,
"_source": {
"text": """Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."""
}
},
{
"_index": "my_rag_test_data",
"_id": "1",
"_score": 0.9782871,
"_source": {
"text": "Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."
}
}
]
},
"ext": {
"retrieval_augmented_generation": {
"answer": "The population of the New York City metro area in 2023 is projected to be 18,937,000.",
"message_id": "x3CecI0BfUsSoeNT9tV9"
}
}
}
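In a client application, the LLM answer is read from the ext.retrieval_augmented_generation object of the search response rather than from the hits. A small sketch of that extraction follows; the function name extract_answer is hypothetical.

```python
from typing import Dict, Optional, Tuple


def extract_answer(response: Dict) -> Tuple[str, Optional[str]]:
    """Return (answer, message_id) from a RAG search response."""
    rag = response.get("ext", {}).get("retrieval_augmented_generation")
    if rag is None:
        # No RAG section, for example when the search ran without the pipeline.
        raise KeyError("response has no ext.retrieval_augmented_generation section")
    return rag["answer"], rag.get("message_id")
```

The message_id is returned alongside the answer so that a caller can later look up the stored message in the conversation memory.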
Now ask the LLM a follow-up question as part of the same conversation. Again, provide the memory_id in the request:
GET /my_rag_test_data/_search
{
"query": {
"match": {
"text": "What was it in 2022"
}
},
"ext": {
"generative_qa_parameters": {
"llm_model": "gpt-3.5-turbo",
"llm_question": "What was it in 2022",
"memory_id": "znCqcI0BfUsSoeNTntd7",
"context_size": 5,
"message_size": 5,
"timeout": 15
}
}
}
The LLM correctly identifies the subject of the conversation and returns a relevant response:
{
...
"ext": {
"retrieval_augmented_generation": {
"answer": "The population of the New York City metro area in 2022 was 18,867,000.",
"message_id": "p3CvcI0BfUsSoeNTj9iH"
}
}
}
To verify that both messages were added to the memory, provide the memory_id to the Get Messages API:
GET /_plugins/_ml/memory/znCqcI0BfUsSoeNTntd7/messages
The response contains both messages:
{
"messages": [
{
"memory_id": "znCqcI0BfUsSoeNTntd7",
"message_id": "x3CecI0BfUsSoeNT9tV9",
"create_time": "2024-02-03T20:33:50.754708446Z",
"input": "What's the population of NYC metro area in 2023",
"prompt_template": """[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Generate a concise and informative answer in less than 100 words for the given question"}]""",
"response": "The population of the New York City metro area in 2023 is projected to be 18,937,000.",
"origin": "retrieval_augmented_generation",
"additional_info": {
"metadata": """["Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019.","Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."]"""
}
},
{
"memory_id": "znCqcI0BfUsSoeNTntd7",
"message_id": "p3CvcI0BfUsSoeNTj9iH",
"create_time": "2024-02-03T20:36:10.24453505Z",
"input": "What was it in 2022",
"prompt_template": """[{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Generate a concise and informative answer in less than 100 words for the given question"}]""",
"response": "The population of the New York City metro area in 2022 was 18,867,000.",
"origin": "retrieval_augmented_generation",
"additional_info": {
"metadata": """["Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019.","Abraham Lincoln was born on February 12, 1809, the second child of Thomas Lincoln and Nancy Hanks Lincoln, in a log cabin on Sinking Spring Farm near Hodgenville, Kentucky.[2] He was a descendant of Samuel Lincoln, an Englishman who migrated from Hingham, Norfolk, to its namesake, Hingham, Massachusetts, in 1638. The family then migrated west, passing through New Jersey, Pennsylvania, and Virginia.[3] Lincoln was also a descendant of the Harrison family of Virginia; his paternal grandfather and namesake, Captain Abraham Lincoln and wife Bathsheba (née Herring) moved the family from Virginia to Jefferson County, Kentucky.[b] The captain was killed in an Indian raid in 1786.[5] His children, including eight-year-old Thomas, Abraham's father, witnessed the attack.[6][c] Thomas then worked at odd jobs in Kentucky and Tennessee before the family settled in Hardin County, Kentucky, in the early 1800s."]"""
}
}
]
}
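Each stored message holds one question-and-answer pair in its input and response fields, so the conversation can be rebuilt from the messages response. A minimal sketch follows. The function name format_transcript and the "Human:" and "AI:" labels are hypothetical, and messages are kept in the order the API returned them.

```python
from typing import Dict


def format_transcript(messages_response: Dict) -> str:
    """Render the messages of one memory as a plain-text transcript."""
    lines = []
    for message in messages_response.get("messages", []):
        lines.append(f"Human: {message['input']}")
        lines.append(f"AI: {message['response']}")
    return "\n".join(lines)
```

For the memory above, this yields four lines: the 2023 question and its answer, followed by the 2022 follow-up and its answer.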
Next steps
Explore our tutorials to learn how to build AI search applications.