Gave instructions to correct citation on answer #929

maykcaldas · 2025-04-01T00:55:37Z

The LLM sometimes broke the pattern by using citation styles other than the citation key, which caused problems on the platform downstream. This PR gives more instructions on how the citation should be included in the answer.

A few examples of wrong citations:

1 - Task: What is the molecule known to have the greatest solubility in water?
Answer: (...) Similarly, studies on aqueous solubility in drug discovery (e.g., Ishikawa and Hashimoto, 2011) discuss extremely low solubility values for particular organic compounds, but these systems are not directly comparable to the gas-phase solubility measurements in Hildebrand’s work. (...)
Issue: Ishikawa and Hashimoto, 2011 is not a citation key

2 - Task: What is the molecule known to have the greatest solubility in water?
Answer: (...) Therefore, based solely on the provided context, ammonia (NH₃) is the molecule known to have the greatest solubility in water (hildebrand1916solubility pages 14–17).
Issue: "Hildebrand’s 1916 paper (pages 14–17)" and hildebrand1916solubility should refer to the same paper. The answer should have used "hildebrand1916solubility. pages 17-19". The difference is the period.

We need to use the citation key exactly as given and prevent the LLM from changing it.
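A minimal sketch of what "use the key exactly" could look like as a check, assuming citations appear in parentheses and multiple keys in one group are separated by semicolons. The function name, the heuristic for spotting citation-like parentheticals, and the sample key `doe2020example pages 1-2` are all hypothetical and not taken from this PR:

```python
import re

# Matches the text inside one pair of parentheses (no nesting).
PAREN_RE = re.compile(r"\(([^()]+)\)")
# Heuristic (an assumption): a parenthetical looks like a citation if it
# contains a 4-digit number or the word "pages".
CITATION_HINT_RE = re.compile(r"\d{4}|pages")


def find_malformed_citations(answer: str, known_keys: set[str]) -> list[str]:
    """Return citation-like parentheticals that are not exact known keys."""
    malformed = []
    for group in PAREN_RE.findall(answer):
        if not CITATION_HINT_RE.search(group):
            continue  # e.g. "(NH3)" is not treated as a citation
        for part in group.split(";"):
            candidate = part.strip()
            if candidate not in known_keys:
                malformed.append(candidate)
    return malformed


answer = (
    "Water is polar (doe2020example pages 1-2), "
    "as others note (Doe and Roe, 2020)."
)
print(find_malformed_citations(answer, {"doe2020example pages 1-2"}))
# prints ['Doe and Roe, 2020']
```

Exact string membership is deliberate here: any change to the key, including a dropped period or a different page range, is reported rather than silently accepted.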

Copilot

Pull Request Overview

This PR clarifies the required citation format by preventing the use of non-citation-key formats in answers.

Updates the test suite to reject answers containing a citation with a name and year
Enhances the prompt instructions to specify allowed citation formats and provide examples

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
tests/test_agents.py	Adds a test assertion to ensure answers contain valid citation keys
paperqa/prompts.py	Updates prompt instructions to enforce strict citation key formats

jamesbraza

In the PR description, can you provide the output that made you discover this issue, for posterity?

jamesbraza

🙏

jamesbraza · 2025-04-01T16:44:00Z

tests/test_agents.py

@@ -973,6 +973,30 @@ async def test_clinical_tool_usage(agent_test_settings) -> None:
    ), "No clinical trials were put into contexts"


+@pytest.mark.asyncio
+async def test_citation_formatting(agent_test_settings):


What do you think of collapsing this test into a subtest of test_agent_types? It would actually:

Decrease CI runtime (since we're already running the agent in those tests)

Increase the coverage of the assertions, since more agent types are used there

jamesbraza · 2025-04-01T16:44:18Z

tests/test_agents.py

+    assert not re.search(
+        name_year_regex, response.session.answer
+    ), "Answer contains citation with name and year instead of citation key"


This assertion fails on current main branch, nice work here!! 🚀
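For readers without the diff at hand, here is a rough sketch of the kind of pattern such an assertion could use. The PR's actual `name_year_regex` is not shown on this page, so the pattern and the helper name below are assumptions, and the pattern is only a heuristic (it will miss some author-year styles and can flag text such as "March, 2020"):

```python
import re

# Hypothetical stand-in for the PR's name_year_regex (not the real one).
NAME_YEAR_RE = re.compile(
    r"[A-Z][a-z]+"                           # first author surname
    r"(?: (?:and|&) [A-Z][a-z]+| et al\.)?"  # optional second author or "et al."
    r"(?:, | \()(?:19|20)\d{2}\)?"           # ", 2011" or " (2011)"
)


def has_name_year_citation(answer: str) -> bool:
    """Return True if the answer contains an author-year style citation."""
    return NAME_YEAR_RE.search(answer) is not None


# An author-year citation is flagged; a citation key is not.
assert has_name_year_citation("(e.g., Ishikawa and Hashimoto, 2011) discuss")
assert not has_name_year_citation(
    "in water (hildebrand1916solubility pages 14-17)."
)
```

Because citation keys in the examples above start with a lowercase letter, requiring a capitalized surname is what keeps the keys themselves from matching.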

maykcaldas added 2 commits March 31, 2025 17:46

Specified how the citation key should be used

68f8124

Added more information to prompt

67bb88e

Copilot AI review requested due to automatic review settings April 1, 2025 00:55

Copilot AI reviewed Apr 1, 2025

dosubot bot added the size:S (this PR changes 10-29 lines, ignoring generated files) and size:M (this PR changes 30-99 lines, ignoring generated files) labels and removed the size:S label Apr 1, 2025

Moved the formatting assert to a specific test

dbefde9

maykcaldas force-pushed the fix-citation branch from 1d9d37e to dbefde9 Compare April 1, 2025 01:11

maykcaldas marked this pull request as draft April 1, 2025 01:24

maykcaldas self-assigned this Apr 1, 2025

jamesbraza reviewed Apr 1, 2025

tests/test_agents.py (outdated, resolved)

paperqa/prompts.py (outdated, resolved)

paperqa/prompts.py (outdated, resolved)

Changed prompt and addressed PR comment

d07e673

jamesbraza approved these changes Apr 1, 2025

dosubot bot added the lgtm This PR has been approved by a maintainer label Apr 1, 2025

jamesbraza reviewed Apr 1, 2025

maykcaldas and others added 2 commits April 1, 2025 12:18

Made the citation constraints more explicit

6af30ea

Merge branch 'main' into fix-citation

f1a544d

maykcaldas marked this pull request as ready for review April 1, 2025 19:19

maykcaldas and others added 3 commits April 1, 2025 14:08

Updated some cassetes

d608767

Merge branch 'fix-citation' of github.com:Future-House/paper-qa into …

69c8c0d

…fix-citation

Merge branch 'main' into fix-citation

0e6458a

maykcaldas enabled auto-merge (squash) April 1, 2025 21:10

[pre-commit.ci lite] apply automatic fixes

615cc74

maykcaldas merged commit f9f1e36 into main Apr 1, 2025
5 checks passed

maykcaldas deleted the fix-citation branch April 1, 2025 21:36
