Section 11. ChatGPT API

Put simply, the gpt-3.5-turbo model is conversational. The new ChatCompletion API introduces a new message format (the other parameters can still be set as before).

With this API you can build a ChatGPT-like chatbot: question and answer, with memory of the earlier conversation (multi-turn dialogue, which in my view is its advantage over Google search; interested readers can see this post: Google search vs. ChatGPT).

The ChatGPT models are optimized for dialogue; gpt-3.5-turbo performs at a level comparable to Instruct Davinci.

The ChatGPT API cannot be fine-tuned (though this may change in the future), and it is called differently from the earlier models. Let's look at the example from the official OpenAI API documentation.

Example code

# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai

openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

The messages in the example have four parts:

role      | content
system    | You are a helpful assistant.
user      | Who won the World Series in 2020?
assistant | The Los Angeles Dodgers won the World Series in 2020.
user      | Where was it played?

First, set the system message so the ChatCompletion API knows what persona to answer in. This is the ChatGPT prompt-writing technique you have seen in many articles: "You are now a ...".

The second and third parts are the user's question and ChatGPT's answer, respectively.

Note the fourth part, the user's new question: Where was it played?

When you call the ChatCompletion API with these four messages chained together, OpenAI knows that the Where here refers back to the previous question.

Reminder: as long as you stay under the token limit, messages can keep growing; only the first entry is system, and from there user and assistant alternate.
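The chaining pattern can be sketched as a small helper (append_turn is my own name for illustration, not part of the OpenAI SDK):

```python
def append_turn(messages, role, content):
    """Append one turn; after the system message, only user/assistant turns follow."""
    assert role in ("user", "assistant")
    messages.append({"role": role, "content": content})

# Rebuild the four-message example from above:
history = [{"role": "system", "content": "You are a helpful assistant."}]
append_turn(history, "user", "Who won the world series in 2020?")
append_turn(history, "assistant", "The Los Angeles Dodgers won the World Series in 2020.")
append_turn(history, "user", "Where was it played?")
```

The whole `history` list is what you pass as `messages` on each call; the model sees the full context every time.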

Additional notes

  1. You need to upgrade to version 0.27.0 or later; if you haven't, run the command below:
pip install openai --upgrade
  2. Official announcement: Learn more about ChatGPT

Pricing

Price comparison with the earlier models

Data as of April 22, 2023

Model         | Training          | Usage
gpt-3.5-turbo | —                 | $0.0020/1K tokens
Davinci       | $0.0300/1K tokens | $0.1200/1K tokens
Ada           | $0.0004/1K tokens | $0.0016/1K tokens
Babbage       | $0.0006/1K tokens | $0.0024/1K tokens
Curie         | $0.0030/1K tokens | $0.0120/1K tokens

(gpt-3.5-turbo has no training price because, as noted above, it cannot be fine-tuned.)

Why is gpt-3.5-turbo so much cheaper? My guess: because messages must resend the entire earlier conversation on every call.
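Since billing is per token, a back-of-envelope estimate follows directly from the table above (estimate_cost_usd is a hypothetical helper, not an API function):

```python
# gpt-3.5-turbo usage price from the table: $0.0020 per 1K tokens
def estimate_cost_usd(total_tokens, price_per_1k=0.0020):
    """Rough cost of one request, charging for the whole conversation sent."""
    return total_tokens / 1000 * price_per_1k

# Even a full 4,096-token context costs well under one cent per call:
cost = estimate_cost_usd(4096)
```

The flip side of the low price is that a long conversation pays for its whole history again on every turn.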


RLHF

RLHF: Reinforcement Learning from Human Feedback

Wikipedia

Paper: Learning to summarize with human feedback


ChatGPT Model Comparison

GPT-3.5

GPT-3.5 models can understand and generate natural language or code. Our most capable and cost effective model in the GPT-3.5 family is gpt-3.5-turbo which has been optimized for chat but works well for traditional completions tasks as well.

LATEST MODEL | DESCRIPTION | MAX TOKENS | TRAINING DATA
gpt-3.5-turbo | Most capable GPT-3.5 model and optimized for chat at 1/10th the cost of text-davinci-003. Will be updated with our latest model iteration. | 4,096 tokens | Up to Sep 2021
gpt-3.5-turbo-0301 | Snapshot of gpt-3.5-turbo from March 1st 2023. Unlike gpt-3.5-turbo, this model will not receive updates, and will be deprecated 3 months after a new version is released. | 4,096 tokens | Up to Sep 2021
text-davinci-003 | Can do any language task with better quality, longer output, and consistent instruction-following than the curie, babbage, or ada models. Also supports inserting completions within text. | 4,097 tokens | Up to Jun 2021
text-davinci-002 | Similar capabilities to text-davinci-003 but trained with supervised fine-tuning instead of reinforcement learning. | 4,097 tokens | Up to Jun 2021
code-davinci-002 | Optimized for code-completion tasks | 8,001 tokens | Up to Jun 2021

We recommend using gpt-3.5-turbo over the other GPT-3.5 models because of its lower cost.

Source:

https://platform.openai.com/docs/models/gpt-3-5


Hands-on project

We revisit the earlier example, History Tutor ChatBot Remix, and reimplement it with ChatCompletion.

Improvements over the original project:

  • The OpenAI API call is switched to ChatCompletion

  • A while loop handles the multi-turn conversation

  • Token usage is counted, so the bill doesn't make you faint

  • Guard phrasing is added to the prompt to reduce hallucination (for details, see the previous lesson, Text Embedding)

import openai
import os

openai.api_key = os.getenv("OPENAI_API_KEY")

class CreateBot:

    def __init__(self, system_prompt):
        '''
        system_prompt: [str] Describes context for Chat Assistant
        '''
        self.system = system_prompt
        self.messages = [{"role": "system", "content": system_prompt}]

    def chat(self):
        '''
        Tracks dialogue history and takes in user input
        '''
        print('To end conversation, type END')
        while True:
            # Get user question
            question = input("")

            # Stop before making another billable API call
            if question == 'END':
                break

            # Add to messages/dialogue history
            self.messages.append({'role': 'user', 'content': question})

            # Send to ChatGPT and get response
            response = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=self.messages)

            # Get content of assistant reply
            content = response['choices'][0]['message']['content']
            print('\n')
            print(content)
            print('\n')

            # Add assistant reply to dialogue history
            self.messages.append({'role': 'assistant', 'content': content})

history_tutor = CreateBot(system_prompt="You are an expert in US History")

history_tutor.chat()

Token Length Check

import tiktoken

full_content = ''
for item in history_tutor.messages:
    full_content += item['content']

def num_tokens_from_string(string, encoding_name):
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

num_tokens_from_string(full_content,encoding_name='cl100k_base')

Output

485
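gpt-3.5-turbo's context window is 4,096 tokens (see the model table above), so a simple guard before each call might look like the sketch below; the 500-token headroom reserved for the reply is my own guess, not an official figure:

```python
MAX_CONTEXT = 4096      # gpt-3.5-turbo's context window (from the model table)
REPLY_HEADROOM = 500    # room left for the assistant's answer; arbitrary choice

def within_budget(token_count):
    """True while the conversation plus an expected reply still fits."""
    return token_count + REPLY_HEADROOM <= MAX_CONTEXT

within_budget(485)  # the 485 tokens counted above fit comfortably
```

When the check fails, trim old turns from `messages` (as shown next) instead of sending the request.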

# Don't pop index 0, because that is the system role!
history_tutor.messages.pop(1)

Output

{'role': 'user', 'content': 'Who was the first president of the US?'}


The longer you chat, the more tokens pile up; checking whether the input is "END" before sending the API request saves one billed call :smiley:


So, as the instructor mentioned, you can periodically delete some of the dialogue (user turns) after the initial system role to reduce token consumption.
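That trimming idea can be sketched as a helper that keeps the system message and drops the oldest turns (trim_history is hypothetical, and max_turns=8 is an arbitrary choice):

```python
def trim_history(messages, max_turns=8):
    """Keep the system message (index 0) plus only the last max_turns turns."""
    if len(messages) - 1 <= max_turns:
        return messages
    return [messages[0]] + messages[-max_turns:]

# Build a dummy conversation: 1 system message + 10 question/answer pairs
convo = [{"role": "system", "content": "You are an expert in US History"}]
for i in range(10):
    convo.append({"role": "user", "content": f"question {i}"})
    convo.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(convo)  # 1 system message + the 8 most recent turns
```

Using an even max_turns keeps user/assistant pairs together; the bot simply forgets the oldest exchanges rather than the persona.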
