
Conversation

@borisdev
Contributor

@borisdev borisdev commented Jun 15, 2023

Causal program-aided language (CPAL) chain

Motivation

This builds on the recent Program-Aided Language (PAL) approach to reducing LLM hallucination. The problem with PAL is that it still hallucinates on math problems with a nested chain of dependence. The innovation here is that the new CPAL approach adds causal structure to fix that hallucination.

For example, on the word problem below, PAL answers 5 while CPAL answers 13 (the correct total).

"Tim buys the same number of pets as Cindy and Boris."
"Cindy buys the same number of pets as Bill plus Bob."
"Boris buys the same number of pets as Ben plus Beth."
"Bill buys the same number of pets as Obama."
"Bob buys the same number of pets as Obama."
"Ben buys the same number of pets as Obama."
"Beth buys the same number of pets as Obama."
"If Obama buys one pet, how many pets total does everyone buy?"

The CPAL chain represents the causal structure of the above narrative as a causal graph or DAG, which it can also plot, as shown below.

[Figure: complex-graph, a plot of the causal DAG for the pet-buying narrative above]
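
To make the structure concrete, here is a minimal sketch (illustrative only, not the PR's implementation) of representing the narrative as a DAG with networkx and evaluating it in topological order; the node names and the summing rule come from the word problem above.

# Illustrative sketch only (not the PR's code): the pet-buying narrative as a
# DAG where each node's value is a function of its parent nodes' values.
import networkx as nx

# edge (u, v) means "v depends on u"
edges = [
    ("obama", "bill"), ("obama", "bob"), ("obama", "ben"), ("obama", "beth"),
    ("bill", "cindy"), ("bob", "cindy"),
    ("ben", "boris"), ("beth", "boris"),
    ("cindy", "tim"), ("boris", "tim"),
]
graph = nx.DiGraph(edges)
# nx.draw(graph, with_labels=True)  # plots a graph like the one above

values = {"obama": 1}  # the hypothetical condition: Obama buys one pet
for node in nx.topological_sort(graph):
    parents = list(graph.predecessors(node))
    if parents:  # every non-root value is the sum of its parents' values
        values[node] = sum(values[p] for p in parents)

print(sum(values.values()))  # 13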

The two major sections below are:

  1. Technical overview
  2. Future application

Also see this Jupyter notebook doc.

1. Technical overview

CPAL versus PAL

Like PAL, CPAL intends to reduce large language model (LLM) hallucination.

The CPAL chain differs from the PAL chain in two ways.

  • CPAL adds a causal structure (a DAG) to link entity actions (math expressions).
  • CPAL's math expressions model a chain of cause-and-effect relations that can be intervened upon (see the sketch after this list), whereas the PAL chain's math expressions are projected math identities.
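
A minimal sketch of what "intervened upon" means here (illustrative only, reusing the edge list from the sketch above): override one node and recompute only its descendants.

# Illustrative sketch of an intervention: override (do-intervene on) one node
# and recompute its descendants, leaving upstream nodes untouched.
import networkx as nx

edges = [
    ("obama", "bill"), ("obama", "bob"), ("obama", "ben"), ("obama", "beth"),
    ("bill", "cindy"), ("bob", "cindy"),
    ("ben", "boris"), ("beth", "boris"),
    ("cindy", "tim"), ("boris", "tim"),
]
graph = nx.DiGraph(edges)

def evaluate(graph, interventions):
    """Each node is the sum of its parents unless it is intervened on."""
    values = {}
    for node in nx.topological_sort(graph):
        if node in interventions:
            values[node] = interventions[node]  # do(node = x)
        else:
            values[node] = sum(values[p] for p in graph.predecessors(node))
    return values

print(evaluate(graph, {"obama": 1}))               # baseline story: total is 13
print(evaluate(graph, {"obama": 1, "cindy": 10}))  # do(cindy = 10): tim becomes 12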

PAL's generated Python code is wrong: it hallucinates as complexity increases.

def solution():
    """Tim buys the same number of pets as Cindy and Boris.Cindy buys the same number of pets as Bill plus Bob.Boris buys the same number of pets as Ben plus Beth.Bill buys the same number of pets as Obama.Bob buys the same number of pets as Obama.Ben buys the same number of pets as Obama.Beth buys the same number of pets as Obama.If Obama buys one pet, how many pets total does everyone buy?"""
    obama_pets = 1
    tim_pets = obama_pets
    cindy_pets = obama_pets + obama_pets
    boris_pets = obama_pets + obama_pets
    total_pets = tim_pets + cindy_pets + boris_pets
    result = total_pets
    return result  # math result is 5

CPAL's generated Python code is correct.

story outcome data
    name                                   code  value      depends_on
0  obama                                   pass    1.0              []
1   bill               bill.value = obama.value    1.0         [obama]
2    bob                bob.value = obama.value    1.0         [obama]
3    ben                ben.value = obama.value    1.0         [obama]
4   beth               beth.value = obama.value    1.0         [obama]
5  cindy   cindy.value = bill.value + bob.value    2.0     [bill, bob]
6  boris   boris.value = ben.value + beth.value    2.0     [ben, beth]
7    tim  tim.value = cindy.value + boris.value    4.0  [cindy, boris]

query data
{
    "question": "how many pets total does everyone buy?",
    "expression": "SELECT SUM(value) FROM df",
    "llm_error_msg": ""
}
# query result is 13
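
For context, here is a minimal sketch of how that query resolves against the outcome table (the dataframe is rebuilt by hand; the chain itself generates both the table and the SQL expression).

# Illustrative sketch: the outcome table rebuilt by hand and the query applied.
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["obama", "bill", "bob", "ben", "beth", "cindy", "boris", "tim"],
        "value": [1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 4.0],
    }
)

# "SELECT SUM(value) FROM df" amounts to:
print(df["value"].sum())  # 13.0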

Based on the comments below, CPAL's intended location in the library is experimental/chains/cpal, and PAL's location is chains/pal.

CPAL vs Graph QA

Both the CPAL chain and the Graph QA chain extract entity-action-entity relations into a DAG.

The CPAL chain is different from the Graph QA chain for a few reasons.

  • Graph QA does not connect entities to math expressions.
  • Graph QA does not associate actions in a sequence of dependence.
  • Graph QA does not decompose the narrative into these three parts (a sketch follows this list):
    1. Story plot or causal model
    2. Hypothetical question
    3. Hypothetical condition
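
A hypothetical sketch of that three-part decomposition as a data structure (the class and field names below are illustrative, not the PR's actual models):

# Hypothetical data model for the three-part decomposition (names are
# illustrative only, not the classes introduced by this PR).
from dataclasses import dataclass, field

@dataclass
class CausalEntity:
    name: str
    expression: str                   # e.g. "cindy.value = bill.value + bob.value"
    depends_on: list[str] = field(default_factory=list)

@dataclass
class NarrativeDecomposition:
    causal_model: list[CausalEntity]  # 1. story plot or causal model
    question: str                     # 2. hypothetical question
    condition: dict[str, float]       # 3. hypothetical condition

decomposition = NarrativeDecomposition(
    causal_model=[
        CausalEntity("obama", "pass"),
        CausalEntity("bill", "bill.value = obama.value", ["obama"]),
        # ... remaining entities omitted for brevity
    ],
    question="how many pets total does everyone buy?",
    condition={"obama": 1.0},
)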

Evaluation

Preliminary evaluation on simple math word problems shows that the CPAL chain hallucinates less than the PAL chain when answering questions about a causal narrative. Two examples are in this Jupyter notebook doc.

2. Future application

"Describe as Narrative, Test as Code"

The thesis here is that the Describe as Narrative, Test as Code approach allows you to represent a causal mental model both as code and as a narrative, giving you the best of both worlds.

Why describe a causal mental model as a narrative?

The narrative form is quick. At a consensus-building meeting, people use narratives to persuade others of their causal mental model, i.e. their plan. You can share, version-control, and index a narrative.

Why test a causal mental model as code?

Code is testable; complex narratives are not. Though fast, narratives become problematic as their complexity increases. The problem is that both LLMs and humans are prone to hallucination when predicting the outcomes of a narrative. The cost of building consensus around the validity of a narrative outcome grows as its complexity increases. Code does not require tribal knowledge or social power to validate.

Code is composable; complex narratives are not. The answer of one CPAL chain can serve as the hypothetical conditions of another CPAL chain. For stochastic simulations, a composable plan can be integrated with the DoWhy library. Lastly, for the futuristic folk, a composable plan as code lets ordinary community members design a plan that can be integrated with a blockchain for funding.
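
As an illustration of that composability claim, here is a minimal sketch (illustrative names, not the chain's API) where the answer of one causal model becomes the hypothetical condition of another.

# Illustrative sketch of composing two causal models: the output of plan A
# becomes the hypothetical condition (a root value) of plan B.
import networkx as nx

def evaluate(graph, conditions):
    """Each node is either given by `conditions` or the sum of its parents."""
    values = {}
    for node in nx.topological_sort(graph):
        if node in conditions:
            values[node] = conditions[node]
        else:
            values[node] = sum(values[p] for p in graph.predecessors(node))
    return values

# Plan A: a toy slice of the pet-buying story above.
plan_a = nx.DiGraph([("obama", "bill"), ("obama", "bob"), ("bill", "cindy"), ("bob", "cindy")])
# Plan B: a hypothetical downstream plan whose root is plan A's answer.
plan_b = nx.DiGraph([("pets_bought", "bags_of_food"), ("bags_of_food", "budget")])

pets = evaluate(plan_a, {"obama": 1})["cindy"]  # answer of plan A: 2
print(evaluate(plan_b, {"pets_bought": pets}))  # plan A's answer feeds plan B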

An explanation of a dependency planning application is here.


Twitter handle: @boris_dev

@vercel

vercel bot commented Jun 15, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchain ❌ Failed (Inspect) Jul 11, 2023 0:11am

@borisdev borisdev marked this pull request as draft June 15, 2023 21:19
@borisdev
Contributor Author

borisdev commented Jun 15, 2023

@hwchase17
@vowelparrot

Question: Is it feasible for this PR, which proposes a new CPAL chain, to make it into the library?

I suspect there is a chance it isn't, for the following reasons.

  • This PR did not originate from an upstream GitHub issue.
  • This PR did not originate from an arXiv academic article.

I am checking with you before I start working on the tests and the Jupyter notebook doc.

@vowelparrot
Contributor

This is really cool. I'll take a closer look at the code later tonight, but I absolutely think this is something we'd like to include in the library. If it's too complicated, then we could at least put it in the experimental/ directory to start cc @hwchase17 @eyurtsev

ps I'm a big fan of the DoWhy library also btw.

@hwchase17
Contributor

super interesting. yeah let's include in experimental for now

@vercel

vercel bot commented Jun 19, 2023

@borisdev is attempting to deploy a commit to the LangChain Team on Vercel.

A member of the Team first needs to authorize it.

@dev2049 dev2049 self-requested a review June 21, 2023 08:16
@borisdev borisdev closed this Jun 21, 2023
@borisdev borisdev changed the title Causal PAL Causal PAL (experimental) Jun 21, 2023
@borisdev borisdev reopened this Jun 21, 2023
@borisdev borisdev closed this Jun 22, 2023
@borisdev borisdev reopened this Jun 22, 2023
@borisdev borisdev changed the title Causal PAL (experimental) Causal PAL Jun 23, 2023
@borisdev borisdev changed the title Causal PAL CPAL Jun 23, 2023
@borisdev borisdev closed this Jun 28, 2023
@borisdev borisdev reopened this Jun 28, 2023
@borisdev borisdev marked this pull request as ready for review July 2, 2023 18:45
@borisdev borisdev closed this Jul 2, 2023
@borisdev borisdev reopened this Jul 2, 2023
@borisdev borisdev closed this Jul 8, 2023
@borisdev borisdev reopened this Jul 10, 2023
@borisdev
Contributor Author

lint tests and unit tests pass

[Screenshot: passing lint and unit test checks, 2023-07-10]

@baskaryan
Collaborator

@borisdev this is awesome and looks pretty good to me!

@baskaryan baskaryan merged commit 9129318 into langchain-ai:master Jul 11, 2023