Commit 4cc7808

hwchase17, wey-gu, and chenweisomebody126 authored and committed
Harrison/nebula graph (langchain-ai#5865)
Co-authored-by: Wey Gu <[email protected]>
Co-authored-by: chenweisomebody <[email protected]>
1 parent b3dc57d commit 4cc7808

File tree

9 files changed: +712 −4 lines changed
Lines changed: 270 additions & 0 deletions
# NebulaGraphQAChain

This notebook shows how to use LLMs to provide a natural language interface to a NebulaGraph database.
You will need a running NebulaGraph cluster. To spin up a containerized cluster, run the following script:

```bash
curl -fsSL nebula-up.siwei.io/install.sh | bash
```

Other options are:
- Install as a [Docker Desktop Extension](https://www.docker.com/blog/distributed-cloud-native-graph-database-nebulagraph-docker-extension/). See [here](https://docs.nebula-graph.io/3.5.0/2.quick-start/1.quick-start-workflow/)
- NebulaGraph Cloud Service. See [here](https://www.nebula-graph.io/cloud)
- Deploy from package, source code, or via Kubernetes. See [here](https://docs.nebula-graph.io/)

Once the cluster is running, we can create the SPACE and SCHEMA for the database.
```python
%pip install ipython-ngql
%load_ext ngql

# connect the ngql Jupyter extension to NebulaGraph
%ngql --address 127.0.0.1 --port 9669 --user root --password nebula
# create a new space
%ngql CREATE SPACE IF NOT EXISTS langchain(partition_num=1, replica_factor=1, vid_type=fixed_string(128));
```
```python
# Wait a few seconds for the space to be created.
%ngql USE langchain;
```
Create the schema; for the full dataset, refer [here](https://www.siwei.io/en/nebulagraph-etl-dbt/).
```python
%%ngql
CREATE TAG IF NOT EXISTS movie(name string);
CREATE TAG IF NOT EXISTS person(name string, birthdate string);
CREATE EDGE IF NOT EXISTS acted_in();
CREATE TAG INDEX IF NOT EXISTS person_index ON person(name(128));
CREATE TAG INDEX IF NOT EXISTS movie_index ON movie(name(128));
```
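As a rough mental model of what this DDL defines — a plain-Python sketch for illustration only, not anything NebulaGraph exposes — the space ends up with two tag (node) types and one property-less edge type:

```python
# Illustrative plain-Python model of the schema created by the DDL above.
schema = {
    "tags": {
        "movie": {"name": "string"},
        "person": {"name": "string", "birthdate": "string"},
    },
    "edges": {"acted_in": {}},  # acted_in carries no properties
}

assert schema["tags"]["person"]["birthdate"] == "string"
```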
Wait for schema creation to complete, then we can insert some data.
```python
%%ngql
INSERT VERTEX person(name, birthdate) VALUES "Al Pacino":("Al Pacino", "1940-04-25");
INSERT VERTEX movie(name) VALUES "The Godfather II":("The Godfather II");
INSERT VERTEX movie(name) VALUES "The Godfather Coda: The Death of Michael Corleone":("The Godfather Coda: The Death of Michael Corleone");
INSERT EDGE acted_in() VALUES "Al Pacino"->"The Godfather II":();
INSERT EDGE acted_in() VALUES "Al Pacino"->"The Godfather Coda: The Death of Michael Corleone":();
```
```python
from langchain.chat_models import ChatOpenAI
from langchain.chains import NebulaGraphQAChain
from langchain.graphs import NebulaGraph
```
```python
graph = NebulaGraph(
    space="langchain",
    username="root",
    password="nebula",
    address="127.0.0.1",
    port=9669,
    session_pool_size=30,
)
```
## Refresh graph schema information

If the schema of the database changes, you can refresh the schema information needed to generate nGQL statements.
```python
# graph.refresh_schema()
```
```python
print(graph.get_schema)
```

Output:

```
Node properties: [{'tag': 'movie', 'properties': [('name', 'string')]}, {'tag': 'person', 'properties': [('name', 'string'), ('birthdate', 'string')]}]
Edge properties: [{'edge': 'acted_in', 'properties': []}]
Relationships: ['(:person)-[:acted_in]->(:movie)']
```
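This schema string is what the chain hands to its nGQL-generation prompt alongside the user's question. As a rough sketch of that interpolation — the template text below is hypothetical; the real `NGQL_GENERATION_PROMPT` lives in `langchain/chains/graph_qa/prompts.py`:

```python
# Hypothetical template text standing in for NGQL_GENERATION_PROMPT.
template = (
    "Task: generate an nGQL statement to query a graph database.\n"
    "Schema:\n{schema}\n"
    "Question: {question}\n"
)

schema = (
    "Node properties: [{'tag': 'movie', 'properties': [('name', 'string')]}]\n"
    "Relationships: ['(:person)-[:acted_in]->(:movie)']"
)
prompt = template.format(schema=schema, question="Who played in The Godfather II?")
assert "acted_in" in prompt and "Who played in The Godfather II?" in prompt
```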
## Querying the graph

We can now use the graph QA chain to ask questions of the graph.
```python
chain = NebulaGraphQAChain.from_llm(
    ChatOpenAI(temperature=0), graph=graph, verbose=True
)
```
```python
chain.run("Who played in The Godfather II?")
```

Output:

```
> Entering new NebulaGraphQAChain chain...
Generated nGQL:
MATCH (p:`person`)-[:acted_in]->(m:`movie`) WHERE m.`movie`.`name` == 'The Godfather II'
RETURN p.`person`.`name`
Full Context:
{'p.person.name': ['Al Pacino']}

> Finished chain.

'Al Pacino played in The Godfather II.'
```
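To see why the generated MATCH returns Al Pacino, here is the same lookup over a plain-Python stand-in for the data inserted earlier. The dict and helper are illustrative only, not NebulaGraph's API:

```python
# Tiny in-memory stand-in for the acted_in edges inserted earlier.
acted_in = {
    "Al Pacino": [
        "The Godfather II",
        "The Godfather Coda: The Death of Michael Corleone",
    ],
}

def who_played_in(movie: str) -> list[str]:
    """Mirrors the generated MATCH ... WHERE m.`movie`.`name` == <movie> query."""
    return [person for person, movies in acted_in.items() if movie in movies]

print(who_played_in("The Godfather II"))  # -> ['Al Pacino']
```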

langchain/chains/__init__.py

Lines changed: 2 additions & 0 deletions
```diff
@@ -11,6 +11,7 @@
 from langchain.chains.flare.base import FlareChain
 from langchain.chains.graph_qa.base import GraphQAChain
 from langchain.chains.graph_qa.cypher import GraphCypherQAChain
+from langchain.chains.graph_qa.nebulagraph import NebulaGraphQAChain
 from langchain.chains.hyde.base import HypotheticalDocumentEmbedder
 from langchain.chains.llm import LLMChain
 from langchain.chains.llm_bash.base import LLMBashChain
@@ -67,4 +68,5 @@
     "ConversationalRetrievalChain",
     "OpenAPIEndpointChain",
     "FlareChain",
+    "NebulaGraphQAChain",
 ]
```
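The second hunk matters because `__all__` controls which names a wildcard import exposes. A small self-contained sketch (the mini-module here is hypothetical, standing in for `langchain.chains`):

```python
import types

# Hypothetical mini-module standing in for langchain.chains' __init__.
chains = types.ModuleType("chains")
chains.FlareChain = type("FlareChain", (), {})
chains.NebulaGraphQAChain = type("NebulaGraphQAChain", (), {})
chains.__all__ = ["FlareChain", "NebulaGraphQAChain"]

# `from chains import *` binds exactly the names listed in __all__.
exported = {name: getattr(chains, name) for name in chains.__all__}
assert "NebulaGraphQAChain" in exported
```

Without the `__all__` entry, the class would still be importable by name, but omitted from star-imports and harder to discover.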
langchain/chains/graph_qa/nebulagraph.py

Lines changed: 91 additions & 0 deletions
```python
"""Question answering over a graph."""
from __future__ import annotations

from typing import Any, Dict, List, Optional

from pydantic import Field

from langchain.base_language import BaseLanguageModel
from langchain.callbacks.manager import CallbackManagerForChainRun
from langchain.chains.base import Chain
from langchain.chains.graph_qa.prompts import CYPHER_QA_PROMPT, NGQL_GENERATION_PROMPT
from langchain.chains.llm import LLMChain
from langchain.graphs.nebula_graph import NebulaGraph
from langchain.prompts.base import BasePromptTemplate


class NebulaGraphQAChain(Chain):
    """Chain for question-answering against a graph by generating nGQL statements."""

    graph: NebulaGraph = Field(exclude=True)
    ngql_generation_chain: LLMChain
    qa_chain: LLMChain
    input_key: str = "query"  #: :meta private:
    output_key: str = "result"  #: :meta private:

    @property
    def input_keys(self) -> List[str]:
        """Return the input keys.

        :meta private:
        """
        return [self.input_key]

    @property
    def output_keys(self) -> List[str]:
        """Return the output keys.

        :meta private:
        """
        _output_keys = [self.output_key]
        return _output_keys

    @classmethod
    def from_llm(
        cls,
        llm: BaseLanguageModel,
        *,
        qa_prompt: BasePromptTemplate = CYPHER_QA_PROMPT,
        ngql_prompt: BasePromptTemplate = NGQL_GENERATION_PROMPT,
        **kwargs: Any,
    ) -> NebulaGraphQAChain:
        """Initialize from LLM."""
        qa_chain = LLMChain(llm=llm, prompt=qa_prompt)
        ngql_generation_chain = LLMChain(llm=llm, prompt=ngql_prompt)

        return cls(
            qa_chain=qa_chain,
            ngql_generation_chain=ngql_generation_chain,
            **kwargs,
        )

    def _call(
        self,
        inputs: Dict[str, Any],
        run_manager: Optional[CallbackManagerForChainRun] = None,
    ) -> Dict[str, str]:
        """Generate nGQL statement, use it to look up in db and answer question."""
        _run_manager = run_manager or CallbackManagerForChainRun.get_noop_manager()
        callbacks = _run_manager.get_child()
        question = inputs[self.input_key]

        generated_ngql = self.ngql_generation_chain.run(
            {"question": question, "schema": self.graph.get_schema}, callbacks=callbacks
        )

        _run_manager.on_text("Generated nGQL:", end="\n", verbose=self.verbose)
        _run_manager.on_text(
            generated_ngql, color="green", end="\n", verbose=self.verbose
        )
        context = self.graph.query(generated_ngql)

        _run_manager.on_text("Full Context:", end="\n", verbose=self.verbose)
        _run_manager.on_text(
            str(context), color="green", end="\n", verbose=self.verbose
        )

        result = self.qa_chain(
            {"question": question, "context": context},
            callbacks=callbacks,
        )
        return {self.output_key: result[self.qa_chain.output_key]}
```
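The `_call` method above boils down to a three-step pipeline: generate nGQL from the question plus the graph schema, execute it against the database, then answer over the returned context. A stdlib-only sketch of that flow — every `stub_*` helper below is hypothetical and stands in for an `LLMChain` or `NebulaGraph` call:

```python
# Stdlib-only sketch of NebulaGraphQAChain._call; every stub_* helper is
# hypothetical and stands in for an LLMChain or NebulaGraph call.

def stub_generate_ngql(question: str, schema: str) -> str:
    # stands in for ngql_generation_chain.run({"question": ..., "schema": ...})
    return (
        "MATCH (p:`person`)-[:acted_in]->(m:`movie`) "
        "WHERE m.`movie`.`name` == 'The Godfather II' RETURN p.`person`.`name`"
    )

def stub_graph_query(ngql: str) -> dict:
    # stands in for graph.query(generated_ngql)
    return {"p.person.name": ["Al Pacino"]}

def stub_qa(question: str, context: dict) -> str:
    # stands in for qa_chain({"question": ..., "context": ...})
    return f"{context['p.person.name'][0]} played in The Godfather II."

def call(question: str, schema: str = "<schema>") -> dict:
    generated_ngql = stub_generate_ngql(question, schema)  # step 1: NL -> nGQL
    context = stub_graph_query(generated_ngql)             # step 2: run against graph
    answer = stub_qa(question, context)                    # step 3: QA over context
    return {"result": answer}

print(call("Who played in The Godfather II?")["result"])
# -> Al Pacino played in The Godfather II.
```

Note that the question is passed to both LLM steps, while only the generated nGQL (not the question) reaches the database.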
