
Conversation

@borisdev
Contributor

@borisdev borisdev commented Jun 15, 2023

Causal program-aided language (CPAL) chain

Motivation

This builds on the recent Program-Aided Language (PAL) approach to reducing LLM hallucination. The problem with PAL is that it still hallucinates on math problems with a nested chain of dependence. The innovation here is that the new CPAL approach adds causal structure to fix that hallucination.

For example, on the word problem below, PAL answers 5 while CPAL answers 13 (the correct total).

"Tim buys the same number of pets as Cindy and Boris."
"Cindy buys the same number of pets as Bill plus Bob."
"Boris buys the same number of pets as Ben plus Beth."
"Bill buys the same number of pets as Obama."
"Bob buys the same number of pets as Obama."
"Ben buys the same number of pets as Obama."
"Beth buys the same number of pets as Obama."
"If Obama buys one pet, how many pets total does everyone buy?"

The CPAL chain represents the causal structure of the above narrative as a causal graph or DAG, which it can also plot, as shown below.

[Figure: complex-graph, a plot of the causal DAG for the pet-buying narrative above]
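
To make the structure concrete, here is a minimal sketch (illustrative only, not the PR's implementation) of representing the narrative as a DAG with networkx and evaluating it in topological order; the node names and the summing rule come from the word problem above.

# Illustrative sketch only (not the PR's code): the pet-buying narrative as a
# DAG where each node's value is a function of its parent nodes' values.
import networkx as nx

# edge (u, v) means "v depends on u"
edges = [
    ("obama", "bill"), ("obama", "bob"), ("obama", "ben"), ("obama", "beth"),
    ("bill", "cindy"), ("bob", "cindy"),
    ("ben", "boris"), ("beth", "boris"),
    ("cindy", "tim"), ("boris", "tim"),
]
graph = nx.DiGraph(edges)
# nx.draw(graph, with_labels=True)  # plots a graph like the one above

values = {"obama": 1}  # the hypothetical condition: Obama buys one pet
for node in nx.topological_sort(graph):
    parents = list(graph.predecessors(node))
    if parents:  # every non-root value is the sum of its parents' values
        values[node] = sum(values[p] for p in parents)

print(sum(values.values()))  # 13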

The two major sections below are:

  1. Technical overview
  2. Future application

Also see this Jupyter notebook doc.

1. Technical overview

CPAL versus PAL

Like PAL, CPAL intends to reduce large language model (LLM) hallucination.

The CPAL chain differs from the PAL chain in two ways.

  • CPAL adds a causal structure (a DAG) to link entity actions (math expressions).
  • CPAL's math expressions model a chain of cause-and-effect relations that can be intervened upon (see the sketch after this list), whereas the PAL chain's math expressions are projected math identities.
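
A minimal sketch of what "intervened upon" means here (illustrative only, reusing the edge list from the sketch above): override one node and recompute only its descendants.

# Illustrative sketch of an intervention: override (do-intervene on) one node
# and recompute its descendants, leaving upstream nodes untouched.
import networkx as nx

edges = [
    ("obama", "bill"), ("obama", "bob"), ("obama", "ben"), ("obama", "beth"),
    ("bill", "cindy"), ("bob", "cindy"),
    ("ben", "boris"), ("beth", "boris"),
    ("cindy", "tim"), ("boris", "tim"),
]
graph = nx.DiGraph(edges)

def evaluate(graph, interventions):
    """Each node is the sum of its parents unless it is intervened on."""
    values = {}
    for node in nx.topological_sort(graph):
        if node in interventions:
            values[node] = interventions[node]  # do(node = x)
        else:
            values[node] = sum(values[p] for p in graph.predecessors(node))
    return values

print(evaluate(graph, {"obama": 1}))               # baseline story: total is 13
print(evaluate(graph, {"obama": 1, "cindy": 10}))  # do(cindy = 10): tim becomes 12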

PAL's generated Python code is wrong: it hallucinates as complexity increases.

def solution():
    """Tim buys the same number of pets as Cindy and Boris.Cindy buys the same number of pets as Bill plus Bob.Boris buys the same number of pets as Ben plus Beth.Bill buys the same number of pets as Obama.Bob buys the same number of pets as Obama.Ben buys the same number of pets as Obama.Beth buys the same number of pets as Obama.If Obama buys one pet, how many pets total does everyone buy?"""
    obama_pets = 1
    tim_pets = obama_pets
    cindy_pets = obama_pets + obama_pets
    boris_pets = obama_pets + obama_pets
    total_pets = tim_pets + cindy_pets + boris_pets
    result = total_pets
    return result  # math result is 5

CPAL's generated Python code is correct.

story outcome data
    name                                   code  value      depends_on
0  obama                                   pass    1.0              []
1   bill               bill.value = obama.value    1.0         [obama]
2    bob                bob.value = obama.value    1.0         [obama]
3    ben                ben.value = obama.value    1.0         [obama]
4   beth               beth.value = obama.value    1.0         [obama]
5  cindy   cindy.value = bill.value + bob.value    2.0     [bill, bob]
6  boris   boris.value = ben.value + beth.value    2.0     [ben, beth]
7    tim  tim.value = cindy.value + boris.value    4.0  [cindy, boris]

query data
{
    "question": "how many pets total does everyone buy?",
    "expression": "SELECT SUM(value) FROM df",
    "llm_error_msg": ""
}
# query result is 13
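
For context, here is a minimal sketch of how that query resolves against the outcome table (the dataframe is rebuilt by hand; the chain itself generates both the table and the SQL expression).

# Illustrative sketch: the outcome table rebuilt by hand and the query applied.
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["obama", "bill", "bob", "ben", "beth", "cindy", "boris", "tim"],
        "value": [1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 4.0],
    }
)

# "SELECT SUM(value) FROM df" amounts to:
print(df["value"].sum())  # 13.0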

Based on the comments below, CPAL's intended location in the library is experimental/chains/cpal, and PAL's location is chains/pal.

CPAL vs Graph QA

Both the CPAL chain and the Graph QA chain extract entity-action-entity relations into a DAG.

The CPAL chain is different from the Graph QA chain for a few reasons.

  • Graph QA does not connect entities to math expressions.
  • Graph QA does not associate actions in a sequence of dependence.
  • Graph QA does not decompose the narrative into these three parts (a sketch follows this list):
    1. Story plot or causal model
    2. Hypothetical question
    3. Hypothetical condition
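
A hypothetical sketch of that three-part decomposition as a data structure (the class and field names below are illustrative, not the PR's actual models):

# Hypothetical data model for the three-part decomposition (names are
# illustrative only, not the classes introduced by this PR).
from dataclasses import dataclass, field

@dataclass
class CausalEntity:
    name: str
    expression: str                   # e.g. "cindy.value = bill.value + bob.value"
    depends_on: list[str] = field(default_factory=list)

@dataclass
class NarrativeDecomposition:
    causal_model: list[CausalEntity]  # 1. story plot or causal model
    question: str                     # 2. hypothetical question
    condition: dict[str, float]       # 3. hypothetical condition

decomposition = NarrativeDecomposition(
    causal_model=[
        CausalEntity("obama", "pass"),
        CausalEntity("bill", "bill.value = obama.value", ["obama"]),
        # ... remaining entities omitted for brevity
    ],
    question="how many pets total does everyone buy?",
    condition={"obama": 1.0},
)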

Evaluation

Preliminary evaluation on simple math word problems shows that the CPAL chain hallucinates less than the PAL chain when answering questions about a causal narrative. Two examples are in this Jupyter notebook doc.

2. Future application

"Describe as Narrative, Test as Code"

The thesis here is that the Describe as Narrative, Test as Code approach allows you to represent a causal mental model both as code and as a narrative, giving you the best of both worlds.

Why describe a causal mental model as a narrative?

The narrative form is quick. At a consensus-building meeting, people use narratives to persuade others of their causal mental model, i.e. their plan. You can share, version-control, and index a narrative.

Why test a causal mental model as code?

Code is testable; complex narratives are not. Though fast, narratives become problematic as their complexity increases. The problem is that both LLMs and humans are prone to hallucination when predicting the outcomes of a narrative. The cost of building consensus around the validity of a narrative outcome grows as its complexity increases. Code does not require tribal knowledge or social power to validate.

Code is composable; complex narratives are not. The answer of one CPAL chain can serve as the hypothetical conditions of another CPAL chain. For stochastic simulations, a composable plan can be integrated with the DoWhy library. Lastly, for the futuristic folk, a composable plan as code lets ordinary community members design a plan that can be integrated with a blockchain for funding.
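
As an illustration of that composability claim, here is a minimal sketch (illustrative names, not the chain's API) where the answer of one causal model becomes the hypothetical condition of another.

# Illustrative sketch of composing two causal models: the output of plan A
# becomes the hypothetical condition (a root value) of plan B.
import networkx as nx

def evaluate(graph, conditions):
    """Each node is either given by `conditions` or the sum of its parents."""
    values = {}
    for node in nx.topological_sort(graph):
        if node in conditions:
            values[node] = conditions[node]
        else:
            values[node] = sum(values[p] for p in graph.predecessors(node))
    return values

# Plan A: a toy slice of the pet-buying story above.
plan_a = nx.DiGraph([("obama", "bill"), ("obama", "bob"), ("bill", "cindy"), ("bob", "cindy")])
# Plan B: a hypothetical downstream plan whose root is plan A's answer.
plan_b = nx.DiGraph([("pets_bought", "bags_of_food"), ("bags_of_food", "budget")])

pets = evaluate(plan_a, {"obama": 1})["cindy"]  # answer of plan A: 2
print(evaluate(plan_b, {"pets_bought": pets}))  # plan A's answer feeds plan B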

An explanation of a dependency planning application is here.


Twitter handle: @boris_dev

@vercel

vercel bot commented Jun 15, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchain ❌ Failed (Inspect) Jul 11, 2023 0:11am

@borisdev borisdev marked this pull request as draft June 15, 2023 21:19
@borisdev
Contributor Author

borisdev commented Jun 15, 2023

@hwchase17
@vowelparrot

Question: Is it feasible for this PR, which proposes a new CPAL chain, to make it into the library?

I suspect there is a chance it isn't, for the following reasons.

  • This PR did not originate from an upstream GitHub issue.
  • This PR did not originate from an arXiv academic article.

I am checking with you before I start working on the tests and the Jupyter notebook doc.

@vowelparrot
Contributor

This is really cool. I'll take a closer look at the code later tonight, but I absolutely think this is something we'd like to include in the library. If it's too complicated, then we could at least put it in the experimental/ directory to start cc @hwchase17 @eyurtsev

ps I'm a big fan of the DoWhy library also btw.

@hwchase17
Contributor

super interesting. yeah let's include in experimental for now

@vercel

vercel bot commented Jun 19, 2023

@borisdev is attempting to deploy a commit to the LangChain Team on Vercel.

A member of the Team first needs to authorize it.

@dev2049 dev2049 self-requested a review June 21, 2023 08:16
@borisdev borisdev closed this Jun 21, 2023
@borisdev borisdev changed the title Causal PAL Causal PAL (experimental) Jun 21, 2023
@borisdev borisdev reopened this Jun 21, 2023
@borisdev borisdev closed this Jun 22, 2023
@borisdev borisdev reopened this Jun 22, 2023
@borisdev borisdev changed the title Causal PAL (experimental) Causal PAL Jun 23, 2023
@borisdev borisdev changed the title Causal PAL CPAL Jun 23, 2023
@borisdev borisdev closed this Jun 28, 2023
@borisdev borisdev reopened this Jun 28, 2023
@borisdev borisdev marked this pull request as ready for review July 2, 2023 18:45
@borisdev borisdev closed this Jul 2, 2023
@borisdev borisdev reopened this Jul 2, 2023
@borisdev borisdev closed this Jul 8, 2023
@borisdev borisdev reopened this Jul 10, 2023
@borisdev
Contributor Author

lint tests and unit tests pass

[Screenshot: passing lint and unit test checks, 2023-07-10]

@baskaryan
Collaborator

@borisdev this is awesome and looks pretty good to me!

@baskaryan baskaryan merged commit 9129318 into langchain-ai:master Jul 11, 2023