OpenAI Python API - Automatic Code Reviewer

sky · 2023-06-14 08:59

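This section is quite interesting, but the instructor did not spend much time explaining it (possibly because that part has little to do with OpenAI itself).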

The Code Reviewer comes in two parts: a simple version and an interactive version.

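Simple version: you run one command on the command line and get the whole review back in a single reply (each suggestion notes which lines it applies to).
Interactive version: it gives one suggestion at a time and asks whether you want it to modify that part of the code for you. If yes, it makes the change; if no, it moves on to the next suggestion, until the whole program has been reviewed.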

Simple version

We supply a Python program implementing a binary search tree that deliberately ignores some recommended Python conventions, and let OpenAI provide the code review.

Coding tips

Depending on how confident you are in your own code, you decide at which point to run what you have written so far and see whether it works.

Beginners can use print to check whether the value of each variable at each stage matches what they expect.

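Here the instructor demonstrated postponing the file-reading part and substituting a triple-quoted string for the file contents (and also demonstrated print).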
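A minimal sketch of that tip, assuming a hypothetical helper named load_content and a made-up sample string (neither appears in the course code): the review logic can be developed against a hard-coded triple-quoted string first, with the real file read added later.

```python
# Hypothetical stand-in for the file contents while the file-reading
# code has not been written yet.
SAMPLE_CONTENT = """
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
"""


def load_content(file_path=None):
    """Return the hard-coded sample unless a real path is given."""
    if file_path is None:
        return SAMPLE_CONTENT
    with open(file_path, "r", encoding="utf-8") as file:
        return file.read()


content = load_content()
# Quick sanity check with print before wiring up the API call.
print(len(content.splitlines()), "lines loaded")
```

Once the rest of the script behaves as expected, passing a real path to the helper switches it over to reading the file.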

On to the code

The main program has three parts (not counting the prompt): code_review, make_code_review_request, and main.

import argparse
import os

import openai
from dotenv import load_dotenv

PROMPT = """
You will receive a file's contents as text.
Generate a code review for the file.  
Indicate what changes should be made to improve 
its style, performance, readability, and maintainability.  
If there are any reputable libraries that could be introduced to improve the code, suggest them.  
Be kind and constructive.  
For each suggested change, include line numbers to which you are referring
"""

def code_review(file_path, model):
    # Read the whole file, send it for review, and print the reply.
    with open(file_path, "r") as file:
        content = file.read()
    generated_code_review = make_code_review_request(content, model)
    print(generated_code_review)


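def make_code_review_request(filecontent, model):
    # The system message carries the review instructions;
    # the user message carries the file contents.
    messages = [
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": f"Code review the following file: {filecontent}"}
    ]
    # openai.ChatCompletion is the interface of openai package versions before 1.0.
    res = openai.ChatCompletion.create(
        model=model,
        messages=messages
    )

    # Return only the text of the first choice.
    return res["choices"][0]["message"]["content"]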

def main():
    parser = argparse.ArgumentParser(description="Simple code reviewer for a file")
    parser.add_argument("file")
    parser.add_argument("--model", default="gpt-4")
    args = parser.parse_args()
    code_review(args.file, args.model)

if __name__ == "__main__":
    load_dotenv()
    openai.api_key = os.getenv("OPENAI_API_KEY")
    main()

Interactive version

Writing the prompt

Telling OpenAI how to review the code is, in my view, even more important than the main program.

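After going through some of OpenAI's freewheeling code reviews, the instructor settled on the format below for each review, combined with a set of rules that tell OpenAI, item by item, how to proceed.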

Let's look at the format first:

<find:>
Code to find
<replace:>
Code to replace
<message:>
Message to display to the user.

An example from an actual run:

<find:>
print ("Invalid choice. Try again.")
<replace:>
print ("Invalid choice, please choose a number between 1 and 9.")
<message:>
I suggest changing the error message to be more informative, 
specifying that the user should choose a number between 1 and 9.
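To act on a reply in this format, the script has to split it back into its blocks. The following is only a sketch of one way to do that with a regular expression; the function name parse_suggestion and the pattern are my own assumptions, not the instructor's code.

```python
import re

# Matches one tag line such as "<find:>" and captures the tag name.
TAG_PATTERN = re.compile(r"^<(find|replace|message):>[ \t]*\n?", re.MULTILINE)


def parse_suggestion(response):
    """Split a reply into (changes, message).

    changes is a list of (find, replace) pairs; message is a string,
    or None when the reply has no <message:> block.
    """
    # With a capturing group, re.split returns:
    # [text before first tag, tag1, body1, tag2, body2, ...]
    parts = TAG_PATTERN.split(response)
    changes = []
    message = None
    pending_find = None
    for tag, body in zip(parts[1::2], parts[2::2]):
        body = body.rstrip("\n")  # drop trailing newlines, keep indentation
        if tag == "find":
            pending_find = body
        elif tag == "replace" and pending_find is not None:
            changes.append((pending_find, body))
            pending_find = None
        elif tag == "message":
            message = body.strip()
    return changes, message
```

A reply that contains only a <message:> block yields an empty list of changes, which the caller can treat as "nothing to modify". Trailing blank lines inside a block are dropped by this sketch, so it would need adjusting if they were significant.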

Next, let's see how the instructor set up the rules:

You are performing the role of a code reviewer in an automated code-review script. You will receive the contents of a file and I will ask you to review it.

Rules:

1. Strive to improve code quality: make suggestions that enhance reusability, readability, performance, and style.
2. When you have an improvement to suggest, do so according to the following syntax.

A. One or more pairs of <find:> and <replace:> blocks representing all the modified code for your change.

B. Followed by exactly one <message:> block. This is the message that will be displayed to the user. It should explain what you are changing and why.

An example of a valid response:

<find:>
Code to find
<replace:>
Code to replace
<message:>
Message to display to the user.

3. If you are at all uncertain about the code, then ask for clarification. To do so, simply include a <message:> block with no <find:> or <replace:> blocks. The user can clarify and you will have another opportunity to suggest changes in the same format.
4. You will address a single issue at a time. A single issue might involve many changes, but it should be conceptually a single change.
5. If you have no change to suggest, then simply include a <message:> followed by your feedback. For example:

<message:>
Great work. This code is perfect.

6. Preserve spacing exactly. Your blocks should be indented the same as the code they are replacing. They will be used as the find in a string.replace() call.
7. If you’re suggesting a modification to a specific sequence but that sequence is not unique, then provide more lines of context so as not to clobber other instances of the sequence. In the worst case, your find string can be the entire file’s contents.
8. If there are no further changes to make, then tell the user that. Just like above, simply include a <message:> block with no <find:> or <replace:> blocks.
9. If you encounter OpenAI code that references “gpt-4” or “gpt-3.5-turbo”, don’t worry about it not being in your training corpus. You are a language model trained on data that predates these newer-named models.
{ignore_list_string}
{accept_list_string}
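The rules on exact spacing and unique find strings exist because each block ends up in a plain string replace. The sketch below shows that step under my own assumptions: the function name apply_changes and the exactly-one-match check are additions for illustration, not something the course code is known to do.

```python
def apply_changes(source, changes):
    """Apply (find, replace) pairs to source, one after another.

    Raises ValueError when a find string is empty, missing, or matches
    more than once, since a blind replace would then be unsafe.
    """
    updated = source
    for find, replace in changes:
        if not find:
            raise ValueError("empty find string")
        occurrences = updated.count(find)
        if occurrences != 1:
            raise ValueError(
                f"expected exactly one match, found {occurrences}: {find!r}"
            )
        updated = updated.replace(find, replace, 1)
    return updated
```

Because str.replace compares characters literally, a find block whose indentation differs from the file by even one space will simply not match, which is why the prompt insists on preserving spacing.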

ignore & accept string list

    ignore_list_string = ""
    ignore_list_string += "\n# Rejected Suggestions\n"
    ignore_list_string += "\nYou previously provided the following suggestions that I rejected:\n"
    ignore_list_string += """\n- DO NOT SUGGEST: "I suggest changing the chat_model to "gpt-3.5-turbo" which is currently the latest GPT version in OpenAI's API, and will provide the best performance for this code review script."\n"""
    ignore_list_string += "\n".join([f"\n- DO NOT SUGGEST '{ignore}'.\n" for ignore in ignore_list])

    accept_list_string = ""
    if accept_list:
        accept_list_string += "\n# Accepted Suggestions\n"
        accept_list_string += "\nYou previously provided the following suggestions that I accepted. Unless it's critical, you probably shouldn't contradict these suggestions:\n"
        accept_list_string += "\n".join([f"\n- You previously suggested '{accepted}' which I accepted. Do not contradict yourself.\n" for accepted in accept_list])
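How these two strings reach the prompt is not shown in the excerpt. One possibility, sketched here with str.format, a shortened hypothetical template, and a hypothetical function name build_system_prompt, is to fill the two placeholders each time a request is made so that the model sees the current lists.

```python
# Shortened, hypothetical stand-in for the full rule text quoted earlier.
PROMPT_TEMPLATE = """You are performing the role of a code reviewer.

Rules:
(rule text omitted in this sketch)
{ignore_list_string}
{accept_list_string}"""


def build_system_prompt(ignore_list, accept_list):
    """Fill the two placeholders from the rejected and accepted suggestions."""
    ignore_list_string = ""
    if ignore_list:
        ignore_list_string += "\n# Rejected Suggestions\n"
        ignore_list_string += "".join(
            f"\n- DO NOT SUGGEST '{ignore}'.\n" for ignore in ignore_list
        )

    accept_list_string = ""
    if accept_list:
        accept_list_string += "\n# Accepted Suggestions\n"
        accept_list_string += "".join(
            f"\n- You previously suggested '{accepted}' which I accepted.\n"
            for accepted in accept_list
        )

    # str.format substitutes the values without re-parsing them, so braces
    # inside a suggestion are safe; braces in the template itself are not.
    return PROMPT_TEMPLATE.format(
        ignore_list_string=ignore_list_string,
        accept_list_string=accept_list_string,
    )
```

Rebuilding the prompt on every round means a suggestion the user has just rejected is already listed under "DO NOT SUGGEST" in the next request.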

messages

It is too long to show here; I will open it in Notepad++ later to read it.

interactive_review

def main():
    parser = argparse.ArgumentParser(description="Automated code review using OpenAI API")
    parser.add_argument("filename", help="The target file to review")
    parser.add_argument("--model", default="gpt-4", help="The chat model to use for code review (default: gpt-4)")
    args = parser.parse_args()

    try:
        automated_code_review(args.filename, args.model)
    except KeyboardInterrupt:
        print("Exiting...")

if __name__ == "__main__":
    main()
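The body of automated_code_review is not included in this post. The following is only a rough sketch of the loop described at the top (one suggestion at a time, apply it on "yes", skip it on "no"); the name review_loop, its parameters, and the max_rounds safety limit are all hypothetical, and the API call and the user prompt are passed in as functions so the loop can be tried without a network connection.

```python
def review_loop(source, get_suggestion, ask_user, max_rounds=20):
    """Run the suggest / confirm / apply cycle and return the results.

    get_suggestion(source, ignore_list, accept_list) -> (changes, message),
        where changes is a list of (find, replace) pairs.
    ask_user(message) -> True to accept the suggestion, False to reject it.
    """
    ignore_list = []
    accept_list = []
    for _ in range(max_rounds):
        changes, message = get_suggestion(source, ignore_list, accept_list)
        if not changes:
            # Message-only reply: treated here as "nothing more to change".
            # A fuller version would also handle clarification requests.
            break
        if ask_user(message):
            for find, replace in changes:
                source = source.replace(find, replace)
            accept_list.append(message)
        else:
            ignore_list.append(message)
    return source, ignore_list, accept_list
```

In a real script, get_suggestion would wrap the chat completion request and ask_user would wrap input(); keeping them as parameters is just a design choice that makes the loop easy to test with canned replies.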

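I find the instructor's code review prompt very valuable. Because it is long and its formatting makes it hard to read, I reorganized it and put it at this link; you must log in to the forum to view it (it is the instructor's intellectual property, and requiring a login keeps it from being indexed by search engines. All the study partners here have taken the course, so I am sharing it for everyone's reference).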