A system for efficiently reviewing code and blogs: Introducing PR-Agent (Amazon Bedrock Claude3)
Hello. I'm @hoshino from the DBRE team.
In the DBRE (Database Reliability Engineering) team, our cross-functional efforts are dedicated to addressing challenges such as resolving database-related issues and developing platforms that effectively balance governance with agility within our organization. DBRE is a relatively new concept, so very few companies have dedicated organizations to address it. Even among those that do, there is often a focus on different aspects and varied approaches. This makes DBRE an exceptionally captivating field, constantly evolving and developing.
For more information on the background of the DBRE team and its role at KINTO Technologies, please see our Tech Blog article, The need for DBRE in KTC.
In this article, I will introduce the improvements the DBRE team experienced after integrating PR-Agent into our repositories. I will also explain how adjusting the prompts allows PR-Agent to review non-code documents, such as tech blogs. I hope this information is helpful.
What is PR-Agent?
PR-Agent is an open source software (OSS) developed by Codium AI, designed to streamline the software development process and improve code quality through automation. Its main goal is to automate the initial review of Pull Requests (PR) and reduce the amount of time developers spend on code reviews. This automation also provides quick feedback, which can accelerate the development process. Another feature that stands out from other tools is the wide range of language models available.
PR-Agent has multiple functions (commands), and developers can select which functions to apply to each PR. The main functions are as follows:
- Review:
- Evaluates the quality of the code and identifies issues
- Describe:
- Summarizes the changes made in the Pull Request and automatically generates an overview
- Improve:
- Suggests improvements for the added or modified code in the Pull Requests
- Ask:
- Allows developers to interact with the AI in a comment format on the Pull Requests, addressing questions or concerns about the PR.
For more details, please refer to the official documentation.
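For a rough illustration, each function can also be invoked manually by posting its command as a comment on the PR (command names follow the official documentation; the question below is a hypothetical example):

```
/review
/describe
/improve
/ask "Is this index change safe for existing queries?"
```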
Why we integrated PR-Agent
The DBRE team had been working on a Proof of Concept (PoC) for a schema review system that utilizes AI. During the process, we evaluated various tools that offer review functionalities based on the following criteria:
- Input criteria:
- Ability to review database schemas based on KTC's Database Schema Design Guidelines
- Ability to customize inputs to the LLM to enhance response accuracy (e.g., integrating chains or custom functions)
- Output Criteria:
- To output review results to GitHub, we evaluated whether the following conditions could be met based on the outputs from the LLM:
- Ability to trigger reviews via PRs
- Ability to comment on PRs
- Ability to use AI-generated outputs to comment on the code (schema information) in PRs
- Ability to suggest corrections at the code level
Despite our thorough investigation, we couldn’t find a tool that fully met our input requirements.
However, during our evaluation, we decided to experiment with one of the AI review tools already used within the DBRE team, which led us to adopt PR-Agent.
The main reasons for choosing PR-Agent among the tools we surveyed are as follows:
- Open source software (OSS)
- Possible to implement it while keeping costs down
- Supports various language models
- Because a wide range of language models is supported, you can select the one best suited to your needs.
- Ease of implementation and customization
- PR-Agent was relatively easy to implement and offered flexible settings and customization options, allowing us to optimize it for our specific requirements and workflows.
For this project, we used Amazon Bedrock. The reasons for using it are as follows:
- Since KTC mainly uses AWS, we decided to try Bedrock first because it allows for quick and seamless integration.
- Compared to OpenAI's GPT-4, using Claude3 Sonnet through Bedrock reduced costs to about one-tenth.
For these reasons, we integrated PR-Agent into the DBRE team's repository.
Customizations implemented during PR-Agent integration
Primarily, we followed the steps outlined in the official documentation for the integration. In this article, we’ll detail the specific customizations we made.
Using Amazon Bedrock Claude3
We utilized the Amazon Bedrock Claude3-sonnet language model. Although the official documentation recommends using access key authentication, we opted for ARN-based authentication to comply with our internal security policies.
- name: Input AWS Credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: ${{ secrets.AWS_ROLE_ARN_PR_REVIEW }}
    aws-region: ${{ secrets.AWS_REGION_PR_REVIEW }}
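For reference, the role assumed above also needs permission to call Bedrock. A minimal IAM policy sketch might look like the following (`bedrock:InvokeModel` is the standard action for model invocation; the resource ARN is an assumed example for Claude 3 Sonnet in us-west-2, so adjust it to your own region and model):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "bedrock:InvokeModel",
      "Resource": "arn:aws:bedrock:us-west-2::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0"
    }
  ]
}
```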
Manage prompts in GitHub Wiki
Since the DBRE team runs multiple repositories, it was necessary to centralize prompt references. After integrating PR-Agent, we also wanted team members to be able to easily edit and fine-tune the prompts.
That’s when we considered using GitHub Wiki.
GitHub Wiki tracks changes and is easy for anyone to edit, so we figured that using it would let team members change the prompts without friction.
In PR-Agent, you can set extra instructions for each function such as describe through the extra_instructions field in GitHub Actions. (Official documentation)
# Excerpts from configuration.toml
[pr_reviewer] # /review #
extra_instructions = "" # Add extra instructions here
[pr_description] # /describe #
extra_instructions = ""
[pr_code_suggestions] # /improve #
extra_instructions = ""
Therefore, we customized the setup to dynamically add extra instructions (prompts) listed in the GitHub Wiki through variables in the GitHub Actions where PR-Agent is configured.
Here are the configuration steps:
First, generate a token using any GitHub account and clone the Wiki repository using GitHub Actions.
- name: Checkout the Wiki repository
  uses: actions/checkout@v4
  with:
    ref: main # Specify any branch (GitHub's default is master)
    repository: {repo}/{path}.wiki
    path: wiki
    token: ${{ secrets.GITHUB_TOKEN_Foobar }}
Next, set the information from the Wiki as environment variables. Read the contents of the file and set the prompts as environment variables.
- name: Set up Wiki Info
  id: wiki_info
  run: |
    set_env_var_from_file() {
      local var_name=$1
      local file_path=$2
      local prompt=$(cat "$file_path")
      echo "${var_name}<<EOF" >> "$GITHUB_ENV"
      echo "$prompt" >> "$GITHUB_ENV"
      echo "EOF" >> "$GITHUB_ENV"
    }
    set_env_var_from_file "REVIEW_PROMPT" "./wiki/pr-agent-review-prompt.md"
    set_env_var_from_file "DESCRIBE_PROMPT" "./wiki/pr-agent-describe-prompt.md"
    set_env_var_from_file "IMPROVE_PROMPT" "./wiki/pr-agent-improve-prompt.md"
Finally, configure the action steps for the PR-Agent. Read the content of each prompt from the environment variables.
- name: PR Agent action step
  id: Pragent
  uses: Codium-ai/pr-agent@main
  env:
    # model settings
    CONFIG.MODEL: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
    CONFIG.MODEL_TURBO: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
    CONFIG.FALLBACK_MODEL: bedrock/anthropic.claude-v2:1
    LITELLM.DROP_PARAMS: true
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    AWS.BEDROCK_REGION: us-west-2
    # PR_AGENT settings (/review)
    PR_REVIEWER.extra_instructions: |
      ${{env.REVIEW_PROMPT}}
    # PR_DESCRIPTION settings (/describe)
    PR_DESCRIPTION.extra_instructions: |
      ${{env.DESCRIBE_PROMPT}}
    # PR_CODE_SUGGESTIONS settings (/improve)
    PR_CODE_SUGGESTIONS.extra_instructions: |
      ${{env.IMPROVE_PROMPT}}
By following the steps outlined above, you can pass the prompts listed on the Wiki to PR-Agent and execute them.
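For context, each env key in the step above appears to map to a section and key of PR-Agent's configuration.toml (e.g., CONFIG.MODEL corresponds to model under [config]). Expressed directly in TOML, the same settings would look roughly like this sketch:

```toml
# Rough TOML equivalent of the env-based settings above
# (sketch, assuming the SECTION.KEY env naming maps to [section] key)
[config]
model = "bedrock/anthropic.claude-3-sonnet-20240229-v1:0"

[pr_reviewer] # /review #
extra_instructions = "" # filled from the Wiki via GitHub Actions in our setup
```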
What we did to expand review targets to include tech blogs
Our company’s tech blogs are managed in a Git repository, which led to the idea of using PR-Agent to review blog articles like code.
Typically, PR-Agent is a tool specialized for code review. When we tested it on blog articles, the Describe and Review functions worked reasonably well, but the Improve function only returned "No code suggestions found for PR," even after we adjusted the prompts (extra_instructions). (This behavior likely occurs because PR-Agent is designed primarily for code review.)
To address this, we tested whether customizing the system prompt for the Improve function would enable it to review blog articles. After the customization, we began receiving responses from the AI, so we decided to proceed with customizing the system prompts as well.
A system prompt is a prompt passed to the LLM separately from the user prompt when the model is invoked. It contains specific instructions on what items to output and in what format. The extra_instructions explained earlier are part of the system prompt: when the user provides additional instructions in PR-Agent, those instructions are incorporated into it.
# Here are the excerpts from the system prompt for Improve
[pr_code_suggestions_prompt]
system="""You are PR-Reviewer, a language model that specializes in suggesting ways to improve for a Pull Request (PR) code.
Your task is to provide meaningful and actionable code suggestions, to improve the new code presented in a PR diff.
(omitted)
{%- if extra_instructions %}
Extra instructions from the user, that should be taken into account with high priority:
======
{{ extra_instructions }} # Add the content specified in the extra_instructions.
======
{%- endif %}
(omitted)
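The conditional injection in the template above can be sketched as a simplified shell stand-in (this is not PR-Agent's actual code, which renders a Jinja2 template; it only mirrors the logic):

```shell
# Simplified stand-in for the Jinja2 logic: the extra block is
# emitted only when extra_instructions is non-empty.
render_system_prompt() {
  extra="$1"
  printf '%s\n' "You are PR-Reviewer, a language model that specializes in suggesting ways to improve for a Pull Request (PR) code."
  if [ -n "$extra" ]; then  # mirrors {%- if extra_instructions %}
    printf '%s\n' "Extra instructions from the user, that should be taken into account with high priority:"
    printf '======\n%s\n======\n' "$extra"
  fi
}

render_system_prompt "Respond in Japanese."
```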
PR-Agent allows you to edit system prompts from GitHub Actions, just like extra_instructions.
By customizing the existing system prompts, we expanded the review capabilities to include not only code but also text.
Below are some examples of our customizations:
First, we modified the instructions specific to the code so they could be used to review tech blogs.
System prompt before customization
You are PR-Reviewer, a language model that specializes in suggesting ways to improve for a Pull Request (PR) code.
Your task is to provide meaningful and actionable code suggestions, to improve the new code presented in a PR diff.
System prompt after customization
You are a reviewer for an IT company's tech blog.
Your role is to review the contents of .md files in terms of the following.
Please review each item listed as a checkpoint and identify any issues.
Next, we modified the section containing specific instructions so that tech blogs can be reviewed. Since changing the instructions about the output format would affect the program, we limited the customization to replacing the code review instructions with text review ones.
System prompt before customization
Specific instructions for generating code suggestions:
- Provide up to {{ num_code_suggestions }} code suggestions. The suggestions should be diverse and insightful.
- The suggestions should focus on ways to improve the new code in the PR, meaning focusing on lines from '__new hunk__' sections, starting with '+'. Use the '__old hunk__' sections to understand the context of the code changes.
- Prioritize suggestions that address possible issues, major problems, and bugs in the PR code.
- Don't suggest to add docstring, type hints, or comments, or to remove unused imports.
- Suggestions should not repeat code already present in the '__new hunk__' sections.
- Provide the exact line numbers range (inclusive) for each suggestion. Use the line numbers from the '__new hunk__' sections.
- When quoting variables or names from the code, use backticks (`) instead of single quote (').
- Take into account that you are reviewing a PR code diff, and that the entire codebase is not available for you as context. Hence, avoid suggestions that might conflict with unseen parts of the codebase.
System prompt after customization
Specific instructions for generating text suggestions:
- Provide up to {{ num_code_suggestions }} text suggestions. The suggestions should be diverse and insightful.
- The suggestions should focus on ways to improve the new text in the PR, meaning focusing on lines from '__new hunk__' sections, starting with '+'. Use the '__old hunk__' sections to understand the context of the code changes.
- Prioritize suggestions that address possible issues, major problems, and bugs in the PR text.
- Don't suggest to add docstring, type hints, or comments, or to remove unused imports.
- Suggestions should not repeat text already present in the '__new hunk__' sections.
- Provide the exact line numbers range (inclusive) for each suggestion. Use the line numbers from the '__new hunk__' sections.
- When quoting variables or names from the text, use backticks (`) instead of single quote (').
After that, add a new Wiki page for the system prompt, following the steps in "Manage prompts in GitHub Wiki" explained earlier.
- name: Set up Wiki Info
  id: wiki_info
  run: |
    set_env_var_from_file() {
      local var_name=$1
      local file_path=$2
      local prompt=$(cat "$file_path")
      echo "${var_name}<<EOF" >> "$GITHUB_ENV"
      echo "$prompt" >> "$GITHUB_ENV"
      echo "EOF" >> "$GITHUB_ENV"
    }
    set_env_var_from_file "REVIEW_PROMPT" "./wiki/pr-agent-review-prompt.md"
    set_env_var_from_file "DESCRIBE_PROMPT" "./wiki/pr-agent-describe-prompt.md"
    set_env_var_from_file "IMPROVE_PROMPT" "./wiki/pr-agent-improve-prompt.md"
+   set_env_var_from_file "IMPROVE_SYSTEM_PROMPT" "./wiki/pr-agent-improve-system-prompt.md"
- name: PR Agent action step
    (omitted)
+   PR_CODE_SUGGESTIONS_PROMPT.system: |
+     ${{env.IMPROVE_SYSTEM_PROMPT}}
By following the steps outlined above, we customized PR-Agent’s Improve function, which typically specializes in code review, to support the review of blog articles.
However, it’s important to note that the responses may not always be 100% as expected, even after modifying the system prompt. This is also true when using the Improve function for program code.
Results of installing PR-Agent
Implementing PR-Agent has brought the following benefits:
- Improved review accuracy
  - It highlights issues we often overlook, improving the accuracy of our code reviews.
  - It allows for the review of past closed PRs, providing opportunities to reflect on older code.
  - Reviewing past PRs helps us continually enhance the quality and integrity of our codebase.
- Reduced burden of creating pull requests (PRs)
  - The pull request summary feature makes creating pull requests easier.
  - Reviewers can quickly see the summary, improving review efficiency and shortening merge times.
- Improved engineering skills
  - Keeping up with rapid technological advances while managing daily duties can be challenging.
  - The AI's suggestions have been very effective in learning best practices.
- Tech blog reviews
  - Implementing PR-Agent for our tech blog has reduced the review burden. Although it's not perfect, it checks articles for spelling mistakes, grammar issues, and consistency of content and logic, helping us find errors that are easy to overlook.
Below is an example of a review of an actual tech blog (Event Report DBRE Summit 2023).
Summary of the Pull Request (PR) for the tech blog by PR-Agent (Describe)
Review of the Pull Request (PR) for the tech blog by PR-Agent (Review)
Proposed changes to the tech blog by PR-Agent (Improve)
It is also important to note that a human being must make the final decision for the following reasons:
- The PR-Agent’s review results for the exact same Pull Requests (PR) can vary each time, and the accuracy of the responses can be inconsistent.
- PR-Agent reviews may generate irrelevant or completely off-target feedback.
Conclusion
In this article, we introduced how the implementation and customization of PR-Agent have improved work efficiency. While complete review automation is not yet possible, through configuration and customization, PR-Agent plays a supportive role in enhancing the productivity of our development teams. We aim to continue using PR-Agent to improve efficiency and productivity further.