
Building a Q&A application with Knowledge Bases for Amazon Bedrock - Retrieve API and LangChain

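Context

With a knowledge base, we can securely connect foundation models (FMs) in Amazon Bedrock to our company data for Retrieval Augmented Generation (RAG). Access to additional data helps the model generate more relevant, context-specific, and accurate responses without continuously retraining the FM. All information retrieved from a knowledge base comes with source attribution, which improves transparency and minimizes hallucinations. For more information on creating a knowledge base using the console, see this post.

In this notebook, we take a closer look at building a Q&A application using the Retrieve API provided by Knowledge Bases for Amazon Bedrock together with LangChain. We will query the knowledge base to get the desired number of document chunks based on similarity search, integrate it with a LangChain retriever, and use the Anthropic Claude Instant model to answer questions.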

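Pattern

We can implement the solution using the Retrieval Augmented Generation (RAG) pattern. RAG retrieves data from outside the language model (non-parametric) and augments the prompt by adding the relevant retrieved data. Here, we are effectively performing RAG over the knowledge base created in the previous notebook or through the console.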
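The retrieve, augment, generate sequence described above can be sketched offline as follows. This is only an illustration of the pattern: the passages are invented, word overlap stands in for the embedding-based similarity search, and `generate` is a hypothetical placeholder rather than a call to a Bedrock model.

```python
import re

# Toy in-memory "knowledge base"; the passages are invented for illustration.
PASSAGES = [
    "Chunk A: cloud revenue is discussed in the annual letter.",
    "Chunk B: the letter describes investments in logistics.",
    "Chunk C: customer obsession is a recurring theme.",
]


def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, passages, k=2):
    """Rank passages by word overlap with the query (stand-in for vector search)."""
    q = tokenize(query)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]


def augment(query, contexts):
    """Build a prompt that places the retrieved text ahead of the question."""
    joined = "\n\n".join(contexts)
    return f"<context>\n{joined}\n</context>\n\n<question>\n{query}\n</question>"


def generate(prompt):
    """Hypothetical placeholder for a model call; it only reports the prompt size."""
    return f"(a model would answer here, given a {len(prompt)}-character prompt)"


query = "What does the letter say about cloud revenue?"
top = retrieve(query, PASSAGES)
prompt = augment(query, top)
print(generate(prompt))
```

In the rest of the notebook, the knowledge base plays the role of `retrieve` and the Claude model plays the role of `generate`.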

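Prerequisites

Before questions can be answered, the documents must be processed and stored in the knowledge base.

Load the documents into the knowledge base by connecting our S3 bucket (data source).
Ingestion - the knowledge base splits them into smaller chunks (based on the selected strategy), generates embeddings, and stores them in the associated vector store. The notebook 0_create_ingest_documents_test_kb.ipynb takes care of this for us.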
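To give a feel for the chunking step, here is a minimal sketch of fixed-size splitting with overlap. It is an assumption-laden simplification: it counts characters rather than tokens, the `chunk_text` helper and its default sizes are hypothetical, and it is not the knowledge base's actual implementation.

```python
def chunk_text(text, chunk_size=300, overlap=60):
    """Split text into windows of `chunk_size` characters that overlap by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 10  # 100 characters of filler text
chunks = chunk_text(sample, chunk_size=40, overlap=10)
for i, c in enumerate(chunks):
    print(i, len(c))
```

The overlap means the end of one chunk is repeated at the start of the next, so a sentence that straddles a boundary still appears whole in at least one chunk.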

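Notebook walkthrough

In this notebook, we use the Retrieve API provided by Knowledge Bases for Amazon Bedrock. It converts the user query into embeddings, searches the knowledge base, and returns the relevant results, giving us more control to build custom workflows on top of the semantic search results. The output of the Retrieve API includes the retrieved text chunks, the location type and URI of the source data, and the relevance scores of the retrievals.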
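As a rough sketch of what a direct Retrieve API call and its output look like, the helper below calls `retrieve` on a `bedrock-agent-runtime` style client and flattens each result into text, location type, URI, and score. The helper name `retrieve_chunks`, the fake client, and the sample result are hypothetical and exist only so the sketch runs without AWS access; the URI lookup assumes an S3 data source.

```python
def retrieve_chunks(client, knowledge_base_id, query, number_of_results=4):
    """Call the Retrieve API and flatten each result into a small dict."""
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": number_of_results}
        },
    )
    rows = []
    for result in response.get("retrievalResults", []):
        location = result.get("location", {})
        rows.append({
            "text": result["content"]["text"],
            "location_type": location.get("type"),
            # Assumes an S3 data source; other source types use other keys.
            "uri": location.get("s3Location", {}).get("uri"),
            "score": result.get("score"),
        })
    return rows


class FakeAgentRuntimeClient:
    """Offline stand-in that returns one hand-written result (not real data)."""

    def retrieve(self, **kwargs):
        self.last_request = kwargs
        return {
            "retrievalResults": [
                {
                    "content": {"text": "Example chunk text."},
                    "location": {
                        "type": "S3",
                        "s3Location": {"uri": "s3://example-bucket/example.pdf"},
                    },
                    "score": 0.5,
                }
            ]
        }


fake_client = FakeAgentRuntimeClient()
rows = retrieve_chunks(fake_client, "EXAMPLEKBID", "example question", number_of_results=1)
print(rows)
```

With real credentials, the same helper could be given `boto3.client("bedrock-agent-runtime")` and the actual knowledge base ID instead of the fake client.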

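We will then use the RetrievalQA chain provided by LangChain and add the retriever backed by the Retrieve API to that chain. The chain automatically augments the original prompt with the retrieved text chunks and passes it to the anthropic.claude-instant-v1 model.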

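Ask question

We will use the following workflow for this notebook.

Use case:

Dataset

In this example, we use several years of Amazon's letters to shareholders as the text corpus for Q&A. This data has already been ingested into the knowledge base for Amazon Bedrock. We need the knowledge base ID to run this example.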

Python 3.10

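⚠ For this lab, we need to run the notebook on a Python 3.10 runtime. ⚠

Setup

To run this notebook, we need to install the dependencies, including LangChain, and update boto3 and botocore to access the newly released query API provided by Knowledge Bases for Amazon Bedrock.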

%pip install --upgrade pip
%pip install boto3==1.33.2 --force-reinstall --quiet
%pip install botocore==1.33.2 --force-reinstall --quiet
%pip install "langchain>=0.1.11"
%pip install pypdf==4.1.0
%pip install langchain-community faiss-cpu==1.8.0 tiktoken==0.6.0 sqlalchemy==2.0.28

Restart the kernel so that it picks up the dependencies installed above.

# restart kernel
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

%store -r kb_id

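Follow the steps below to set up the necessary packages.

Import the libraries needed to create the bedrock-runtime client, which invokes the foundation model, and the bedrock-agent-runtime client, which calls the Retrieve API provided by Knowledge Bases for Amazon Bedrock.
Import LangChain to:
1. Initialize anthropic.claude-instant-v1 as our large language model to perform query completion using the RAG pattern.
2. Initialize a LangChain retriever integrated with the knowledge base.
3. Later in this notebook, wrap the LLM and the retriever with the RetrievalQA chain to build our Q&A application.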

import boto3
import pprint
from botocore.client import Config
from langchain_community.llms import Bedrock
from langchain_community.retrievers import AmazonKnowledgeBasesRetriever

pp = pprint.PrettyPrinter(indent=2)
bedrock_config = Config(connect_timeout=120, read_timeout=120, retries={'max_attempts': 0})
bedrock_client = boto3.client('bedrock-runtime')
bedrock_agent_client = boto3.client("bedrock-agent-runtime",
                              config=bedrock_config
                              )

model_kwargs_claude = {
    "temperature": 0,
    "top_k": 10,
    "max_tokens_to_sample": 3000
}

llm = Bedrock(model_id="anthropic.claude-instant-v1",
              model_kwargs=model_kwargs_claude,
              client = bedrock_client,)

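Retrieve API: Process flow

Create an AmazonKnowledgeBasesRetriever object from LangChain. It calls the Retrieve API provided by Knowledge Bases for Amazon Bedrock, which converts the user query into embeddings, searches the knowledge base, and returns the relevant results, giving us more control to build custom workflows on top of the semantic search results. The output of the Retrieve API includes the retrieved text chunks, the location type and URI of the source data, and the relevance scores of the retrievals.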

retriever = AmazonKnowledgeBasesRetriever(
        knowledge_base_id=kb_id,
        retrieval_config={"vectorSearchConfiguration": {"numberOfResults": 4}},
        # endpoint_url=endpoint_url,
        # region_name="us-east-1",
        # credentials_profile_name="<profile_name>",
    )
docs = retriever.get_relevant_documents(
        query="By what percentage did AWS revenue grow year-over-year in 2022?"
    )
pp.pprint(docs)

score: Each returned text chunk has an associated score that describes how relevant it is to the query, that is, how closely it matches the query.
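One way to use the score is to drop weak matches before building the prompt. The sketch below assumes the retriever exposes the score under `metadata["score"]` on each returned document; the `Doc` class is a stand-in for LangChain's `Document` so the example runs offline, and the scores and the 0.4 threshold are invented values.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document so the sketch runs offline."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def filter_by_score(docs, min_score):
    """Keep documents scoring at least `min_score`, highest score first."""
    kept = [d for d in docs if d.metadata.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda d: d.metadata.get("score", 0.0), reverse=True)


# Invented sample documents and scores, for illustration only.
sample_docs = [
    Doc("chunk one", {"score": 0.42}),
    Doc("chunk two", {"score": 0.71}),
    Doc("chunk three", {"score": 0.18}),
]
strong = filter_by_score(sample_docs, min_score=0.4)
for d in strong:
    print(d.metadata["score"], d.page_content)
```

A suitable threshold depends on the embedding model and the corpus, so it is usually chosen by inspecting scores for a handful of known queries.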

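Model-specific prompt to personalize responses

Here, we use the prompt below to have the model act as a financial advisor AI system that answers questions with factual and statistical information where possible. We provide the Retrieve API response above to the prompt as {context} for the model to reference, along with the user query as {question}.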

from langchain.prompts import PromptTemplate

PROMPT_TEMPLATE = """
    Human: You are a financial advisor AI system, and provides answers to questions by using fact based and statistical information when possible. 
    Use the following pieces of information to provide a concise answer to the question enclosed in <question> tags. 
    If you don't know the answer, just say that you don't know, don't try to make up an answer.
    <context>
    {context}
    </context>

    <question>
    {question}
    </question>

    The response should be specific and use statistics or numbers when possible.

    Assistant:"""
claude_prompt = PromptTemplate(template=PROMPT_TEMPLATE, 
                                input_variables=["context","question"])

# fetch context from the response
def get_contexts(docs):
    contexts = []
    for retrievedResult in docs: 
        contexts.append(retrievedResult.page_content)
    return contexts

contexts = get_contexts(docs)
pp.pprint(contexts)
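Because `contexts` is a Python list, how it is placed into the `{context}` slot affects what the model sees. The sketch below uses plain `str.format` as a stand-in for the prompt template, with a shortened hypothetical template and invented chunk text, to compare passing the list directly with joining it first.

```python
# Shortened, hypothetical template; the real prompt in this notebook is longer.
TEMPLATE = "<context>\n{context}\n</context>\n<question>\n{question}\n</question>"

chunks = ["First retrieved chunk.", "Second retrieved chunk."]

# Passing the list directly inserts its Python repr, brackets and quotes included.
as_list = TEMPLATE.format(context=chunks, question="Example question?")

# Joining first yields plain text with a blank line between chunks.
as_text = TEMPLATE.format(context="\n\n".join(chunks), question="Example question?")

print(as_list)
print(as_text)
```

Joining the chunks keeps list punctuation out of the prompt and makes the boundary between chunks explicit.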

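Run the user prompt through the LLM and get the response

Here, we format our prompt with the context returned by the Retrieve API and the user query, and obtain the final response that we will use when evaluating the generated answer with LlamaIndex.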

query = "By what percentage did AWS revenue grow year-over-year in 2022?"
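# Join the retrieved chunks into one string so the prompt contains plain text
# rather than the repr of a Python list.
prompt = claude_prompt.format(context="\n\n".join(contexts),
                              question=query)

response = llm.invoke(prompt)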
pp.pprint(response)

Integrate the retriever and the LLM defined above with the `RetrievalQA` chain to build the Q&A application.

from langchain.chains import RetrievalQA
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    chain_type_kwargs={"prompt": claude_prompt}
)

answer = qa.invoke({"query": query})
pp.pprint(answer)
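With `return_source_documents=True`, the chain output is a dictionary that carries the answer under `result` and the retrieved documents under `source_documents`. The sketch below shows one possible way to render that as an answer followed by its sources. The `format_answer` helper, the `Doc` stand-in, and the sample values are hypothetical, and the S3 URI lookup assumes the retriever stores the Retrieve API location under `metadata["location"]`.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document so the sketch runs offline."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_answer(answer):
    """Render the chain output as the answer text followed by numbered sources."""
    lines = [answer["result"].strip(), "", "Sources:"]
    for i, doc in enumerate(answer.get("source_documents", []), start=1):
        # Fall back to "unknown" when a document carries no S3 location.
        uri = doc.metadata.get("location", {}).get("s3Location", {}).get("uri", "unknown")
        lines.append(f"[{i}] {uri}")
    return "\n".join(lines)


# Hand-written sample output, not a real model response.
sample_answer = {
    "query": "Example question?",
    "result": " Example answer text. ",
    "source_documents": [
        Doc("chunk", {"location": {"type": "S3",
                                   "s3Location": {"uri": "s3://example-bucket/a.pdf"}}}),
        Doc("chunk", {}),
    ],
}
rendered = format_answer(sample_answer)
print(rendered)
```

Showing the source URIs next to the answer keeps the source attribution that the knowledge base provides visible to the end user.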

Next steps

If you are interested in evaluating the RAG application, try the notebook "customized-rag-retrieve-api-titan-lite-evaluation", where we use the Amazon Titan Lite model to generate responses and Anthropic Claude V2 to evaluate them.