Commit fc9e23f

chore: store golden query results in files (#819)
Moves the "golden query" results out of the code and into separate files, and adds a workflow for overwriting those files: OVERWRITE_GOLDEN=true just ci
This makes the golden tests easier to maintain when the expected output changes.
1 parent c00d6aa commit fc9e23f
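
The test-side code that reads OVERWRITE_GOLDEN is not shown on this page. As a minimal sketch of how such a check could work, assuming a hypothetical helper named check_golden and a golden file path chosen by the caller (neither name comes from the commit):

```python
import os
from pathlib import Path


def overwrite_golden() -> bool:
    """Return True when OVERWRITE_GOLDEN is set to "true" (case-insensitive)."""
    # The justfile defaults the variable to "false", so mirror that default here.
    return os.environ.get("OVERWRITE_GOLDEN", "false").strip().lower() == "true"


def check_golden(actual: str, golden_path: Path) -> None:
    """Compare `actual` to the golden file, or rewrite the file in overwrite mode.

    Hypothetical helper: the real tests may structure this differently.
    """
    if overwrite_golden():
        # Overwrite mode: record the current output as the new expected result.
        golden_path.parent.mkdir(parents=True, exist_ok=True)
        golden_path.write_text(actual)
        return
    # Normal mode: the query output must match the stored file exactly.
    expected = golden_path.read_text()
    assert actual == expected, f"output differs from {golden_path}"
```

A test would capture the psql output as a string and pass it to check_golden together with the path of the matching golden file. Running with OVERWRITE_GOLDEN=true then regenerates the files instead of failing, and the resulting changes can be reviewed in the diff before committing.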

12 files changed, with 147 additions and 91 deletions

projects/pgai/db/justfile

Lines changed: 3 additions & 2 deletions
@@ -1,6 +1,7 @@
 export PROJECT_JUSTFILE := "1" # Note: used in build.py
 PG_MAJOR := env("PG_MAJOR", "17")
 PG_BIN := env("PG_BIN", "/usr/lib/postgresql/" + PG_MAJOR + "/bin")
+OVERWRITE_GOLDEN := env("OVERWRITE_GOLDEN", "false")
 
 # Show list of recipes
 default:
@@ -14,7 +15,7 @@ ci: docker-build docker-run docker-sync
     docker exec pgai-db just build
     docker exec pgai-db just lint
     docker exec -d pgai-db just test-server
-    docker exec pgai-db just test
+    docker exec -e OVERWRITE_GOLDEN={{OVERWRITE_GOLDEN}} pgai-db just test
 
 clean:
     @./build.py clean
@@ -30,7 +31,7 @@ test-server:
 
 test:
     @./build.py test
-
+
 lint:
     @./build.py lint
 
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+Table "ai._vectorizer_q_1"
+Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+---------------------+--------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
+title | text | | not null | | extended | | |
+published | timestamp with time zone | | not null | | plain | | |
+queued_at | timestamp with time zone | | not null | now() | plain | | |
+loading_retries | integer | | not null | 0 | plain | | |
+loading_retry_after | timestamp with time zone | | | | plain | | |
+Indexes:
+"_vectorizer_q_1_title_published_idx" btree (title, published)
+Access method: heap
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+Table "ai._vectorizer_q_1"
+Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+---------------------+--------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
+title | text | | not null | | extended | | |
+published | timestamp with time zone | | not null | | plain | | |
+queued_at | timestamp with time zone | | not null | now() | plain | | |
+loading_retries | integer | | not null | 0 | plain | | |
+loading_retry_after | timestamp with time zone | | | | plain | | |
+Indexes:
+"_vectorizer_q_1_title_published_idx" btree (title, published)
+Access method: heap
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+Table "website.blog"
+Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+-----------+--------------------------+-----------+----------+------------------------------+----------+-------------+--------------+-------------
+id | integer | | not null | generated always as identity | plain | | |
+title | text | | not null | | extended | | |
+published | timestamp with time zone | | not null | | plain | | |
+body | text | | not null | | extended | | |
+Indexes:
+"blog_pkey" PRIMARY KEY, btree (title, published)
+Triggers:
+_vectorizer_src_trg_1 AFTER INSERT OR DELETE OR UPDATE ON website.blog FOR EACH ROW EXECUTE FUNCTION ai._vectorizer_src_trg_1()
+_vectorizer_src_trg_1_truncate AFTER TRUNCATE ON website.blog FOR EACH STATEMENT EXECUTE FUNCTION ai._vectorizer_src_trg_1()
+Access method: heap
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+Table "website.blog"
+Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+-----------+--------------------------+-----------+----------+------------------------------+----------+-------------+--------------+-------------
+id | integer | | not null | generated always as identity | plain | | |
+title | text | | not null | | extended | | |
+published | timestamp with time zone | | not null | | plain | | |
+body | text | | not null | | extended | | |
+Indexes:
+"blog_pkey" PRIMARY KEY, btree (title, published)
+Triggers:
+_vectorizer_src_trg_1 AFTER INSERT OR DELETE OR UPDATE ON website.blog FOR EACH ROW EXECUTE FUNCTION ai._vectorizer_src_trg_1()
+_vectorizer_src_trg_1_truncate AFTER TRUNCATE ON website.blog FOR EACH STATEMENT EXECUTE FUNCTION ai._vectorizer_src_trg_1()
+Access method: heap
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+List of functions
+Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Internal name | Description
+--------+-----------------------+------------------+---------------------+------+------------+----------+-------+----------+-------------------+----------+---------------+-------------
+ai | _vectorizer_src_trg_1 | trigger | | func | volatile | safe | test | definer | test=X/test | plpgsql | |
+(1 row)
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+List of functions
+Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Internal name | Description
+--------+-----------------------+------------------+---------------------+------+------------+----------+-------+----------+-------------------+----------+---------------+-------------
+ai | _vectorizer_src_trg_1 | trigger | | func | volatile | safe | test | definer | test=X/test | plpgsql | |
+(1 row)
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+Table "website.blog_embedding_store"
+Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+----------------+--------------------------+-----------+----------+-------------------+----------+-------------+--------------+-------------
+embedding_uuid | uuid | | not null | gen_random_uuid() | plain | | |
+title | text | | not null | | extended | | |
+published | timestamp with time zone | | not null | | plain | | |
+chunk_seq | integer | | not null | | plain | | |
+chunk | text | | not null | | extended | | |
+embedding | vector(768) | | not null | | main | | |
+Indexes:
+"blog_embedding_store_pkey" PRIMARY KEY, btree (embedding_uuid)
+"blog_embedding_store_title_published_chunk_seq_key" UNIQUE CONSTRAINT, btree (title, published, chunk_seq)
+Access method: heap
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+Table "website.blog_embedding_store"
+Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
+----------------+--------------------------+-----------+----------+-------------------+----------+-------------+--------------+-------------
+embedding_uuid | uuid | | not null | gen_random_uuid() | plain | | |
+title | text | | not null | | extended | | |
+published | timestamp with time zone | | not null | | plain | | |
+chunk_seq | integer | | not null | | plain | | |
+chunk | text | | not null | | extended | | |
+embedding | vector(768) | | not null | | main | | |
+Indexes:
+"blog_embedding_store_pkey" PRIMARY KEY, btree (embedding_uuid)
+"blog_embedding_store_title_published_chunk_seq_key" UNIQUE CONSTRAINT, btree (title, published, chunk_seq)
+Access method: heap
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+View "website.blog_embedding"
+Column | Type | Collation | Nullable | Default | Storage | Description
+----------------+--------------------------+-----------+----------+---------+----------+-------------
+embedding_uuid | uuid | | | | plain |
+chunk_seq | integer | | | | plain |
+chunk | text | | | | extended |
+embedding | vector(768) | | | | external |
+id | integer | | | | plain |
+title | text | | | | extended |
+published | timestamp with time zone | | | | plain |
+body | text | | | | extended |
+View definition:
+SELECT t.embedding_uuid,
+t.chunk_seq,
+t.chunk,
+t.embedding,
+s.id,
+t.title,
+t.published,
+s.body
+FROM website.blog_embedding_store t
+LEFT JOIN website.blog s ON t.title = s.title AND t.published = s.published;
