Generating Synthetic Code Review Comments from Bug Fix Histories #5037

Open
wants to merge 3 commits into
base: master
from

Conversation

leusonmario
Collaborator

Goal: Automatically generate synthetic code review comments by leveraging historical bug reports and their corresponding fixes.
Input: A list of previously reported bugs along with their associated fix implementations (commits).
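
As a rough illustration of that input, one bug/fix pair could be modelled as below. This is a minimal sketch, not the PR's actual data model: the class name `BugFixPair`, the `bug_id` field, and the `prompt_fields` helper are hypothetical, and the remaining field names simply mirror the placeholders used in the prompt template discussed in this conversation.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BugFixPair:
    """One previously reported bug plus the commits that introduced and fixed it.

    Hypothetical container; field names follow the prompt placeholders.
    """

    bug_id: int
    fix_title: str
    fix_description: str
    bug_commit_message: str
    patch_bug: str
    fix_commit_message: str
    patch_fix: str

    def prompt_fields(self) -> dict:
        """Return only the fields that the prompt template substitutes."""
        fields = asdict(self)
        fields.pop("bug_id")  # the identifier is not a template placeholder
        return fields


pair = BugFixPair(
    bug_id=1,
    fix_title="Avoid null dereference",
    fix_description="Guard against a missing value.",
    bug_commit_message="Add fast path",
    patch_bug="<diff of the bug-introducing commit>",
    fix_commit_message="Check for None before use",
    patch_fix="<diff of the fixing commit>",
)
```

Keeping the placeholder names and the field names identical means the record can be passed straight to `str.format(**pair.prompt_fields())` without a mapping layer.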

@marco-c
Collaborator

marco-c commented Jun 4, 2025

CC @olewicki

@suhaibmujahid
Member

The diff was difficult to review because it included 84 commits from upstream. I cleaned up the branch and rebased it.

Comment on lines 42 to 76
You are an expert reviewer for source code with extensive experience in analyzing and summarizing code changes.

The bug associated with patch_bug was introduced and later fixed. Below, you can find further information about the fix.
Fix title: {fix_title}
Fix description: {fix_description}

Your task:
Analyze the provided code and generate a concise summary focusing on the exact changes in patch_bug that introduced the issue and how patch_fix resolved it. Ignore any modifications unrelated to the bug fix.

You must report:
1. The root cause of the issue in `patch_bug`: Identify the specific code lines in patch_bug responsible for the bug. Report the exact affected line and explain why they led to the issue. One single line number for change.
2. The specific changes in `patch_fix` that correct the issue: Explain how the bug was resolved, but keep the focus on mapping fixes back to the faulty lines in `patch_bug`.

Output Format:
Provide a structured response that explicitly maps faulty lines in `patch_bug` to the fix in `patch_fix`, like this:

{{
"root_cause": {{
"filename": "<file_path>",
"line": [<line_number>],
"explanation": "<Why these lines introduced the bug>"
}},
"fix": {{
"filename": "<file_path>",
"line": [<line_number>],
"explanation": "<How these changes in patch_fix resolved the issue>"
}}
}}

Bug commit message: {bug_commit_message}
{patch_bug}

Fix commit message: {fix_commit_message}
{patch_fix}
"""
Member

I do not think we need to indent the whole prompt.

Suggested change
You are an expert reviewer for source code with extensive experience in analyzing and summarizing code changes.
The bug associated with patch_bug was introduced and later fixed. Below, you can find further information about the fix.
Fix title: {fix_title}
Fix description: {fix_description}
Your task:
Analyze the provided code and generate a concise summary focusing on the exact changes in patch_bug that introduced the issue and how patch_fix resolved it. Ignore any modifications unrelated to the bug fix.
You must report:
1. The root cause of the issue in `patch_bug`: Identify the specific code lines in patch_bug responsible for the bug. Report the exact affected line and explain why they led to the issue. One single line number for change.
2. The specific changes in `patch_fix` that correct the issue: Explain how the bug was resolved, but keep the focus on mapping fixes back to the faulty lines in `patch_bug`.
Output Format:
Provide a structured response that explicitly maps faulty lines in `patch_bug` to the fix in `patch_fix`, like this:
{{
"root_cause": {{
"filename": "<file_path>",
"line": [<line_number>],
"explanation": "<Why these lines introduced the bug>"
}},
"fix": {{
"filename": "<file_path>",
"line": [<line_number>],
"explanation": "<How these changes in patch_fix resolved the issue>"
}}
}}
Bug commit message: {bug_commit_message}
{patch_bug}
Fix commit message: {fix_commit_message}
{patch_fix}
"""
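
To make the indentation point concrete: whitespace inside a triple-quoted string is part of the string, so an indented template sends its leading spaces to the model. The sketch below assumes the template is filled with `str.format` (the doubled `{{` and `}}` suggest this, but the call itself is not shown in this excerpt). `INDENTED_TEMPLATE` is a shortened, hypothetical stand-in for the real prompt, and `render` with `textwrap.dedent` is an alternative to the suggestion above, not what the PR does.

```python
import textwrap

# Hypothetical, shortened stand-in for the prompt under review.
INDENTED_TEMPLATE = """
    Fix title: {fix_title}
    Output Format:
    {{
      "root_cause": {{"filename": "<file_path>", "line": [<line_number>]}}
    }}
"""


def render(template: str, **fields: str) -> str:
    """Strip the common indentation first, then substitute placeholders.

    Dedenting before formatting keeps multi-line values (such as patches)
    from changing the common margin that dedent computes.
    """
    return textwrap.dedent(template).strip().format(**fields)


# Formatting the raw template keeps the four leading spaces on every line.
raw = INDENTED_TEMPLATE.format(fix_title="Avoid null dereference")

# Dedenting first yields the same text as an unindented template would.
prompt = render(INDENTED_TEMPLATE, fix_title="Avoid null dereference")
```

In both cases `str.format` turns each doubled brace into a single literal brace, which is why the JSON skeleton in the template has to be written with `{{` and `}}`.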

{patch_fix}
"""

FILTERING_COMMENTS = """
Member

{bug_summarization}
"""

CODE_GEN_BUG_FIX = """
Member

Subcategories include:
* None of the above
* Does not apply
- Keep It Focused: Limit your comments to the issues that could lead to problems identified by the Jira ticket and are directly related to the changes made in the Patch fixing the bug.
Member

We could make it generic instead of specifying Jira here.
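
One possible way to do that, sketched under assumptions: turn the tracker name into a placeholder with a neutral default. The names `KEEP_IT_FOCUSED` and `focused_guideline` and the default wording "bug report" are hypothetical; only the guideline sentence itself comes from the prompt under review.

```python
# Hypothetical template; the tracker name is a placeholder instead of "Jira ticket".
KEEP_IT_FOCUSED = (
    "- Keep It Focused: Limit your comments to the issues that could lead to "
    "problems identified by the {tracker} and are directly related to the "
    "changes made in the Patch fixing the bug."
)


def focused_guideline(tracker: str = "bug report") -> str:
    """Render the guideline for a given issue tracker (generic by default)."""
    return KEEP_IT_FOCUSED.format(tracker=tracker)


generic = focused_guideline()
jira_specific = focused_guideline("Jira ticket")
```

Callers that do work from Jira can still pass the specific wording, while the default stays tracker-agnostic.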

suhaibmujahid and others added 2 commits July 15, 2025 17:24
Replaces the incorrect use of bugzilla.COMMITS_DB with repository.COMMITS_DB in the documentation example to ensure the correct database is downloaded.
3 participants