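[Notes] Agentic RAG: From Retrieval to AI Agents

sky · July 15, 2025, 13:07

These notes describe how to evolve a basic RAG (retrieval-augmented generation) system into an Agentic RAG system that can make its own decisions. Working code examples and a step-by-step implementation show how to build a course assistant as an AI agent.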

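▌What Is Agentic RAG?

A traditional RAG system follows a fixed "search → prompt → generate" pipeline. Agentic RAG adds a decision-making step, so that the system can:

Decide on its own whether a search is needed
Run multi-round search strategies
Use multiple tools
Maintain the conversation context
Adjust its behavior dynamically

▌Basic RAG Recap

The three core components of RAG

According to the course material, a basic RAG system has three main components: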

def rag(query):
    search_results = search(query)                # 1. search
    prompt = build_prompt(query, search_results)  # 2. build the prompt
    answer = llm(prompt)                          # 3. generate with the LLM
    return answer

1. Implementing the search component

minsearch provides the index and the search function:

from minsearch import AppendableIndex

# Build the index; `documents` is the list of FAQ records loaded beforehand
index = AppendableIndex(
    text_fields=["question", "text", "section"],
    keyword_fields=["course"]
)

index.fit(documents)

def search(query):
    boost = {'question': 3.0, 'section': 0.5}
    
    results = index.search(
        query=query,
        filter_dict={'course': 'data-engineering-zoomcamp'},
        boost_dict=boost,
        num_results=5,
        output_ids=True
    )
    
    return results

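Key design points:

text_fields: fields that are searched as full text
keyword_fields: fields used for exact-match filtering
boost: per-field weights; here a match in question counts 3 times as much as a match in text, and a match in section counts half as much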
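To make the effect of boosting concrete, here is a toy scorer. It is not minsearch's actual relevance computation; it only illustrates how per-field weights shift a combined score. The names `term_overlap` and `boosted_score`, and the word-overlap similarity itself, are assumptions made for this illustration.

```python
def term_overlap(query: str, text: str) -> float:
    """Fraction of query terms that also appear in the text (toy similarity)."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def boosted_score(query: str, doc: dict, boost: dict) -> float:
    """Sum the per-field similarities, each multiplied by its boost (default 1.0)."""
    return sum(
        boost.get(field, 1.0) * term_overlap(query, str(value))
        for field, value in doc.items()
    )


doc = {
    "question": "Can I join the course late?",
    "text": "Yes, you can still submit homework.",
    "section": "General",
}

# Both query terms occur only in the question field.
plain = boosted_score("join course", doc, boost={})
weighted = boosted_score("join course", doc, boost={"question": 3.0, "section": 0.5})
```

With these inputs the unboosted score is 1.0 and the boosted score is 3.0, because the whole match comes from the question field.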

2. The prompt-building component

prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

<QUESTION>
{question}
</QUESTION>

<CONTEXT>
{context}
</CONTEXT>
""".strip()

def build_prompt(query, search_results):
    context = ""
    for doc in search_results:
        context += f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"
    
    prompt = prompt_template.format(question=query, context=context).strip()
    return prompt

3. The LLM component

from openai import OpenAI
client = OpenAI()

def llm(prompt):
    response = client.chat.completions.create(
        model='gpt-4o-mini',
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

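Limitations of basic RAG

Rigid pipeline: the strategy cannot adapt to the type of question
Single search: there is no way to gather information over several rounds
No decision-making: the system cannot judge whether a search is needed at all
Limited tooling: search is the only available tool

▌From RAG to Agentic RAG

Step 1: Add a decision mechanism

Start by changing the prompt template so that the LLM can decide what to do next: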

prompt_template = """
You're a course teaching assistant.

You're given a QUESTION from a course student that you need to answer with your own knowledge and the provided CONTEXT.
At the beginning the context is EMPTY.

<QUESTION>
{question}
</QUESTION>

<CONTEXT> 
{context}
</CONTEXT>

If CONTEXT is EMPTY, you can use our FAQ database.
In this case, use the following output template:

{{
"action": "SEARCH",
"reasoning": "<add your reasoning here>"
}}

If you can answer the QUESTION using CONTEXT, use this template:

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "CONTEXT"
}}

If the context doesn't contain the answer, use your own knowledge to answer the question

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "OWN_KNOWLEDGE"
}}
""".strip()

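Implementing simple agent logic

import json

def agentic_rag_v1(question):
    # First pass: the context is empty, so the model decides whether to search
    context = "EMPTY"
    prompt = prompt_template.format(question=question, context=context)
    answer_json = llm(prompt)
    answer = json.loads(answer_json)

    if answer['action'] == 'SEARCH':
        print('need to perform search...')
        search_results = search(question)
        # build_context formats the search results as the CONTEXT string
        context = build_context(search_results)

        # Second pass: ask again, this time with the retrieved context
        prompt = prompt_template.format(question=question, context=context)
        answer_json = llm(prompt)
        answer = json.loads(answer_json)

    return answer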
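agentic_rag_v1 calls build_context, which these notes do not define. A minimal sketch follows, assuming it formats documents the same way as the loop inside build_prompt; the body is an assumption, not necessarily the course's version.

```python
def build_context(search_results: list[dict]) -> str:
    """Format FAQ search results as the CONTEXT part of the prompt."""
    parts = []
    for doc in search_results:
        parts.append(
            f"section: {doc['section']}\n"
            f"question: {doc['question']}\n"
            f"answer: {doc['text']}"
        )
    # Documents are separated by a blank line; no results gives an empty string
    return "\n\n".join(parts)


sample_results = [
    {"section": "General", "question": "Q1?", "text": "A1."},
    {"section": "Module 1", "question": "Q2?", "text": "A2."},
]
context = build_context(sample_results)
```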

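Key improvements:

The LLM now decides whether a search is needed
Answers are labeled by source (context vs. the model's own knowledge)
Output uses a structured JSON format

▌Core Agentic RAG Implementation

Characteristics of an agent system

According to the course material, an agent system has the following capabilities:

Decision-making: analyze the request and decide which action to take
Tool use: use multiple tools to complete the task
State maintenance: keep the context and the conversation history
Goal orientation: work toward a specific goal
Learning and adaptation: learn from interactions

Agent flow design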

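# A typical agent flow:
# 1. Receive the user request
# 2. Analyze the request and the available tools
# 3. Decide on the next action
# 4. Execute the action
# 5. Evaluate the result
# 6. Finish the task or continue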
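The steps above can be sketched as a small loop. This is only an illustration under stated assumptions: `run_agent`, `scripted_policy`, and the decision format (`action`, `args`) are hypothetical, and the scripted policy stands in for an LLM so that the loop runs without any API call.

```python
from typing import Callable


def run_agent(
    request: str,
    decide: Callable[[str, list], dict],
    tools: dict[str, Callable[..., object]],
    max_steps: int = 5,
) -> dict:
    """Decide, act, observe; repeat until the policy answers or steps run out."""
    history: list[dict] = []  # state: every action taken and what it returned

    for _ in range(max_steps):
        decision = decide(request, history)       # steps 2-3: analyze and decide
        if decision["action"] == "ANSWER":        # step 6: task complete
            return decision
        tool = tools[decision["action"]]          # step 4: execute the action
        observation = tool(**decision.get("args", {}))
        # step 5: record the result so the next decision can evaluate it
        history.append({"decision": decision, "observation": observation})

    return {"action": "ANSWER", "answer": None, "reason": "step limit reached"}


def scripted_policy(request: str, history: list) -> dict:
    """Stand-in for the LLM: search once, then answer from what was found."""
    if not history:
        return {"action": "SEARCH", "args": {"query": request}}
    return {"action": "ANSWER", "answer": history[-1]["observation"]}


result = run_agent(
    "Can I join late?",
    decide=scripted_policy,
    tools={"SEARCH": lambda query: f"FAQ hit for: {query}"},
)
```

Keeping the policy as a plain callable makes it easy to swap the scripted version for a real LLM call later.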

▌Agentic Search Strategy

Multi-round search

A more advanced version supports repeated searches and more complex decision logic:

prompt_template = """
You're a course teaching assistant.

The CONTEXT is built with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents.
PREVIOUS_ACTIONS contains the actions you already performed.

You can perform the following actions:
- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic.

Don't use search queries used at the previous iterations.
Don't repeat previously performed actions.
Don't perform more than {max_iterations} iterations. The current iteration number is {iteration_number}.

Output templates:

If you want to perform search, use this template:
{{
"action": "SEARCH",
"reasoning": "<add your reasoning here>",
"keywords": ["search query 1", "search query 2", ...]
}}

If you can answer the QUESTION using CONTEXT, use this template:
{{
"action": "ANSWER_CONTEXT",
"answer": "<your answer>",
"source": "CONTEXT"
}}

If the context doesn't contain the answer, use your own knowledge:
{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "OWN_KNOWLEDGE"
}}

<QUESTION>{question}</QUESTION>
<SEARCH_QUERIES>{search_queries}</SEARCH_QUERIES>
<CONTEXT>{context}</CONTEXT>
<PREVIOUS_ACTIONS>{previous_actions}</PREVIOUS_ACTIONS>
""".strip()

Multi-round search implementation

import json

def agentic_search(question):
    search_queries = []
    search_results = []
    previous_actions = []
    iteration = 0

    while True:
        context = build_context(search_results)
        prompt = prompt_template.format(
            question=question,
            context=context,
            search_queries="\n".join(search_queries),
            previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
            max_iterations=3,
            iteration_number=iteration
        )

        answer_json = llm(prompt)
        answer = json.loads(answer_json)
        previous_actions.append(answer)

        action = answer['action']
        if action != 'SEARCH':
            break

        # Run every search query the model asked for
        keywords = answer['keywords']
        search_queries = list(set(search_queries) | set(keywords))

        for k in keywords:
            res = search(k)
            search_results.extend(res)

        search_results = dedup(search_results)  # drop duplicate documents

        iteration += 1
        # Hard stop in case the model ignores the limit stated in the prompt;
        # the last action is then returned as-is
        if iteration >= 4:
            break

    return answer

Deduplication

def dedup(seq):
    seen = set()
    result = []
    for el in seq:
        _id = el['_id']
        if _id in seen:
            continue
        seen.add(_id)
        result.append(el)
    return result

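Advantages of multi-round search:

Deeper exploration: follow-up searches can build on the initial results
Multiple angles: different keyword combinations are tried
Less repetition: used queries are recorded and shown to the model, which is told not to reuse them
Iteration limit: prevents an endless loop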
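In agentic_search as written, every keyword the model returns is searched, even one that was already used in an earlier round, and the set union discards the order in which queries were issued. One possible refinement is sketched below; `select_new_queries` is a hypothetical helper, and treating queries as equal after trimming and lowercasing is an assumption.

```python
def select_new_queries(seen: list[str], proposed: list[str]) -> list[str]:
    """Return the proposed queries that were not used before, in order."""
    used = {q.strip().lower() for q in seen}
    fresh = []
    for query in proposed:
        key = query.strip().lower()
        if key and key not in used:
            used.add(key)
            fresh.append(query.strip())
    return fresh


search_queries = ["join course late"]
keywords = ["Join course late", "homework deadline", "homework deadline ", "certificate"]

new_queries = select_new_queries(search_queries, keywords)
search_queries.extend(new_queries)  # the history stays ordered, unlike a set union
```

Inside the loop, only `new_queries` would then be passed to search.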

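▌Function Calling

Tool description format

OpenAI's function calling expects each tool to be described in a standard format, with the parameters given as a JSON Schema: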

search_tool = {
    "type": "function",
    "name": "search",
    "description": "Search the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query text to look up in the course FAQ."
            }
        },
        "required": ["query"],
        "additionalProperties": False
    }
}

Function calling implementation

import json

# Send the request; chat_messages holds the conversation so far
response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=[search_tool]
)

# Handle a function call
def do_call(tool_call_response):
    function_name = tool_call_response.name
    arguments = json.loads(tool_call_response.arguments)

    # Look up the Python function that has the same name as the tool
    f = globals()[function_name]
    result = f(**arguments)

    return {
        "type": "function_call_output",
        "call_id": tool_call_response.call_id,
        "output": json.dumps(result, indent=2),
    }

Handling multi-turn conversations

while True:  # main question-answer loop
    question = input()
    if question == 'stop':
        break

    message = {"role": "user", "content": question}
    chat_messages.append(message)

    while True:  # request-response loop
        response = client.responses.create(
            model='gpt-4o-mini',
            input=chat_messages,
            tools=[search_tool]
        )

        has_messages = False

        for entry in response.output:
            chat_messages.append(entry)

            if entry.type == 'function_call':
                # Run the tool and feed its output back to the model
                result = do_call(entry)
                chat_messages.append(result)
            elif entry.type == 'message':
                print(entry.content[0].text)
                has_messages = True

        # Stop once the model has produced a text reply for the user
        if has_messages:
            break

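Advantages of function calling:

Structured output: tool calls arrive in a well-defined format
Typed parameters: the schema declares the type of every argument
Multiple tools: several tools can be registered at once
History tracking: calls and their outputs are appended to chat_messages, so the model sees the full exchange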
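The declared schema can also be checked on the application side before a call is dispatched. The sketch below is a deliberately small check, not a full JSON Schema validator; `check_arguments` and the type map are hypothetical, and the tool description is repeated only to keep the example self-contained.

```python
import json

# JSON Schema type names mapped to Python types (only the ones needed here)
_JSON_TYPES = {"string": str, "boolean": bool}

search_tool = {
    "type": "function",
    "name": "search",
    "description": "Search the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
        "additionalProperties": False,
    },
}


def check_arguments(tool: dict, raw_arguments: str) -> dict:
    """Parse a tool call's JSON arguments and check them against the schema."""
    schema = tool["parameters"]
    args = json.loads(raw_arguments)

    missing = [name for name in schema.get("required", []) if name not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")

    for name, value in args.items():
        spec = schema["properties"].get(name)
        if spec is None:
            if schema.get("additionalProperties") is False:
                raise ValueError(f"unexpected argument: {name}")
            continue
        expected = _JSON_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            raise ValueError(f"{name} should be of type {spec['type']}")

    return args


valid_args = check_arguments(search_tool, '{"query": "docker"}')
```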

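▌Multi-Tool Integration and System Architecture

Tool management

The course provides a small class that registers tools and dispatches calls to them: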

import json

class Tools:
    def __init__(self):
        self.tools = {}
        self.functions = {}

    def add_tool(self, function, description):
        self.tools[function.__name__] = description
        self.functions[function.__name__] = function
    
    def get_tools(self):
        return list(self.tools.values())

    def function_call(self, tool_call_response):
        function_name = tool_call_response.name
        arguments = json.loads(tool_call_response.arguments)

        f = self.functions[function_name]
        result = f(**arguments)

        return {
            "type": "function_call_output",
            "call_id": tool_call_response.call_id,
            "output": json.dumps(result, indent=2),
        }
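Tools.function_call raises if the model names a tool that was never registered or if the function itself fails, which would end the chat loop. One option is to return the error as the tool output so the model can react to it. The sketch below assumes that approach; `safe_function_call` is a hypothetical name, and `SimpleNamespace` objects stand in for the API's function-call entries.

```python
import json
from types import SimpleNamespace


def safe_function_call(functions: dict, tool_call) -> dict:
    """Run a tool call; on failure, return the error text as the tool output."""
    function = functions.get(tool_call.name)
    if function is None:
        output = {"error": f"unknown tool: {tool_call.name}"}
    else:
        try:
            arguments = json.loads(tool_call.arguments)
            output = function(**arguments)
        except json.JSONDecodeError as exc:
            output = {"error": f"bad arguments: {exc}"}
        except Exception as exc:  # wrong parameters or a failure inside the tool
            output = {"error": f"{type(exc).__name__}: {exc}"}

    return {
        "type": "function_call_output",
        "call_id": tool_call.call_id,
        "output": json.dumps(output),
    }


functions = {"search": lambda query: [{"text": f"result for {query}"}]}

ok = safe_function_call(
    functions,
    SimpleNamespace(name="search", arguments='{"query": "docker"}', call_id="call_1"),
)
unknown = safe_function_call(
    functions,
    SimpleNamespace(name="delete_all", arguments="{}", call_id="call_2"),
)
```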

Multi-tool example: adding FAQ entries

Besides search, the system can also add FAQ entries at run time:

def add_entry(question, answer):
    doc = {
        'question': question,
        'text': answer,
        'section': 'user added',
        'course': 'data-engineering-zoomcamp'
    }
    index.append(doc)

add_entry_description = {
    "type": "function",
    "name": "add_entry",
    "description": "Add an entry to the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {
                "type": "string",
                "description": "The question to be added to the FAQ database",
            },
            "answer": {
                "type": "string",
                "description": "The answer to the question",
            }
        },
        "required": ["question", "answer"],
        "additionalProperties": False
    }
}

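Chat interface design

import markdown
from IPython.display import HTML, display

# shorten() is a small helper that truncates long strings for display
class ChatInterface: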
    def input(self):
        question = input("You:")
        return question
    
    def display(self, message):
        print(message)

    def display_function_call(self, entry, result):
        call_html = f"""
            <details>
            <summary>Function call: <tt>{entry.name}({shorten(entry.arguments)})</tt></summary>
            <div>
                <b>Call</b>
                <pre>{entry}</pre>
            </div>
            <div>
                <b>Output</b>
                <pre>{result['output']}</pre>
            </div>
            </details>
        """
        display(HTML(call_html))

    def display_response(self, entry):
        response_html = markdown.markdown(entry.content[0].text)
        html = f"""
            <div>
                <div><b>Assistant:</b></div>
                <div>{response_html}</div>
            </div>
        """
        display(HTML(html))
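display_function_call relies on a shorten helper that these notes do not define. A minimal sketch, assuming it simply truncates long argument strings so the summary line stays readable; the default of 50 characters is an arbitrary choice.

```python
def shorten(text: str, max_length: int = 50) -> str:
    """Truncate text to max_length characters, ending with '...' if it was cut."""
    if len(text) <= max_length:
        return text
    return text[: max_length - 3] + "..."


short = shorten('{"query": "docker"}')
long = shorten("x" * 80)
```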

▌Complete Course Assistant Example

Putting the main components together

class ChatAssistant:
    def __init__(self, tools, developer_prompt, chat_interface, client):
        self.tools = tools
        self.developer_prompt = developer_prompt
        self.chat_interface = chat_interface
        self.client = client
    
    def gpt(self, chat_messages):
        return self.client.responses.create(
            model='gpt-4o-mini',
            input=chat_messages,
            tools=self.tools.get_tools(),
        )

    def run(self):
        chat_messages = [
            {"role": "developer", "content": self.developer_prompt},
        ]

        while True:
            question = self.chat_interface.input()
            if question.strip().lower() == 'stop':
                self.chat_interface.display("Chat ended.")
                break

            message = {"role": "user", "content": question}
            chat_messages.append(message)

            while True:  # inner request loop
                response = self.gpt(chat_messages)

                has_messages = False

                for entry in response.output:
                    chat_messages.append(entry)

                    if entry.type == "function_call":
                        result = self.tools.function_call(entry)
                        chat_messages.append(result)
                        self.chat_interface.display_function_call(entry, result)

                    elif entry.type == "message":
                        self.chat_interface.display_response(entry)
                        has_messages = True

                if has_messages:
                    break

Using the system

# Register the tools
tools = Tools()
tools.add_tool(search, search_tool)
tools.add_tool(add_entry, add_entry_description)

# Define the developer (system) prompt
developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.

Use FAQ if your own knowledge is not sufficient to answer the question.

At the end of each response, ask the user a follow up question based on your answer.
""".strip()

# Create the assistant
chat_interface = ChatInterface()
chat = ChatAssistant(
    tools=tools,
    developer_prompt=developer_prompt,
    chat_interface=chat_interface,
    client=client
)

# Start the chat
chat.run()

▌Framework Integration and Best Practices

PydanticAI integration

The course also shows how the PydanticAI framework simplifies development:

from pydantic_ai import Agent, RunContext

chat_agent = Agent(
    'openai:gpt-4o-mini',
    system_prompt=developer_prompt
)

@chat_agent.tool
def search_tool(ctx: RunContext, query: str) -> list[dict]:
    """
    Search the FAQ for relevant entries matching the query.
    """
    print(f"search('{query}')")
    return search(query)

@chat_agent.tool
def add_entry_tool(ctx: RunContext, question: str, answer: str) -> None:
    """
    Add a new question-answer entry to FAQ.
    """
    return add_entry(question, answer)

# Usage: top-level await works in a notebook;
# in a plain script, use chat_agent.run_sync(user_prompt) instead
user_prompt = "I just discovered the course. Can I join now?"
agent_run = await chat_agent.run(user_prompt)
print(agent_run.output)

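Advantages of PydanticAI:

Automatic tool descriptions: generated from each function's signature and docstring
Type safety: builds on Python type hints
Simpler API: less boilerplate code
Async support: asynchronous execution is supported natively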
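To see what an automatic tool description means in principle, the standard-library sketch below derives an OpenAI-style description from a function's signature and docstring. It is not PydanticAI's implementation; `describe_tool` and the type map are hypothetical and cover only a few simple types.

```python
import inspect

# Python annotations mapped to JSON Schema type names (simple types only)
_PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(function) -> dict:
    """Build a function-calling tool description from a signature and docstring."""
    signature = inspect.signature(function)
    properties = {}
    required = []

    for name, param in signature.parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch
        properties[name] = {"type": _PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)

    return {
        "type": "function",
        "name": function.__name__,
        "description": inspect.getdoc(function) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
            "additionalProperties": False,
        },
    }


def add_entry(question: str, answer: str) -> None:
    """Add a new question-answer entry to FAQ."""


description = describe_tool(add_entry)
```

The result has the same shape as the hand-written add_entry_description, minus the per-parameter descriptions.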

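Best practices

Modular design

Keep tools, interface, and core logic separate
Encapsulate related functionality in classes
Expose a clear API

Error handling

Limit the number of iterations
Handle JSON parsing errors
Show friendly error messages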
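The prompt-based versions above pass the model's reply straight to json.loads, which raises if the reply is not valid JSON. A tolerant parser might look like the sketch below; `parse_action` and the "UNPARSED" source label are hypothetical, and treating unparseable text as a plain answer is only one possible fallback.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def parse_action(raw: str) -> dict:
    """Parse the model's JSON decision; fall back to a plain answer on failure."""
    text = raw.strip()

    # Models sometimes wrap JSON in a Markdown code fence; strip it if present
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:].strip()

    fallback = {"action": "ANSWER", "answer": raw.strip(), "source": "UNPARSED"}

    try:
        action = json.loads(text)
    except json.JSONDecodeError:
        return fallback

    # A valid decision must be a JSON object with an "action" key
    if not isinstance(action, dict) or "action" not in action:
        return fallback
    return action


fenced_reply = FENCE + 'json\n{"action": "SEARCH", "reasoning": "need FAQ"}\n' + FENCE
parsed = parse_action(fenced_reply)
unparsed = parse_action("Sorry, I cannot help with that.")
```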

Performance

Deduplicate search results
Limit the number of search results
Cache frequent queries
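Query caching can be sketched with functools.lru_cache. The `search` function here is a stand-in that counts its calls so the effect is visible; `cached_search` and `invalidate_search_cache` are hypothetical names, and normalizing queries by trimming and lowercasing is an assumption. The cache has to be cleared whenever the index changes, for example after add_entry.

```python
from functools import lru_cache

call_count = 0


def search(query: str) -> list[dict]:
    """Stand-in for the real index search; counts how often it actually runs."""
    global call_count
    call_count += 1
    return [{"_id": 1, "text": f"result for {query}"}]


@lru_cache(maxsize=256)
def _cached(query: str) -> tuple:
    # lru_cache hands back the same object on every hit, so store a tuple;
    # the dicts inside are still shared and should be treated as read-only
    return tuple(search(query))


def cached_search(query: str) -> list[dict]:
    """Search with memoization on the normalized query string."""
    return list(_cached(query.strip().lower()))


def invalidate_search_cache() -> None:
    """Call after the index changes, for example after add_entry."""
    _cached.cache_clear()


cached_search("Docker setup")
cached_search("docker setup ")  # same normalized key: served from the cache
invalidate_search_cache()
cached_search("docker setup")   # cache was cleared, so the search runs again
```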

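User experience

Render rich HTML output
Show the tool calls as they happen
Support Markdown formatting

▌Summary and Key Takeaways

Evolution path

Basic RAG → RAG with decisions → multi-round search → multi-tool integration → framework integration

Core improvements:

From a fixed pipeline to autonomous decisions
From a single query to multi-round exploration
From one tool to a set of tools
From hand-written plumbing to framework support

Key technical components

Prompt engineering: structured prompts that ask for a decision
State management: conversation history and search records
Tool system: a standard way to register and call tools
Interface design: a user-friendly way to interact

Practical applications

Customer support: an FAQ system that searches on its own and grows over time
Education: a personalized course tutoring system
Knowledge management: a company knowledge base that is updated dynamically
Research: integrating and analyzing information from multiple sources

Future directions

The material above suggests several directions for Agentic RAG:

Smarter decision logic: more complex reasoning and planning
A richer tool ecosystem: more external APIs and services
Better user experience: more natural conversations and more accurate answers
Stronger learning: improving the system continuously from interactions

With the analysis and working examples in these notes, readers can follow the whole path from basic RAG to Agentic RAG and build an agent system of their own. The same architecture improves the user experience and is a useful base for later AI applications.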