feat: implement hash join by RichardKnop · Pull Request #156 · RichardKnop/minisql

RichardKnop · 2026-05-06T01:01:23Z

Nested loop join (what minisql does today, with index on the join column):

Complexity: O(N × log M) with an index, O(N × M) without
Best for: small tables, OR when there's a useful index on the inner table's join column
Memory: essentially O(1)
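
As a rough illustration of the two nested loop costs above, here is a minimal Go sketch. It is not minisql's code: `Row`, `IndexEntry`, `BuildIndex` and both join functions are hypothetical names, and a sorted slice with binary search stands in for a real index on the inner table's join column.

```go
package main

import "sort"

// Row is a hypothetical row: Key is the join column, Val is a payload.
type Row struct {
	Key int
	Val string
}

// JoinedRow pairs one outer row with one matching inner row.
type JoinedRow struct {
	Outer, Inner Row
}

// NestedLoopJoin compares every outer row with every inner row:
// O(N × M) comparisons, no memory beyond the output.
func NestedLoopJoin(outer, inner []Row) []JoinedRow {
	var out []JoinedRow
	for _, o := range outer {
		for _, i := range inner {
			if o.Key == i.Key {
				out = append(out, JoinedRow{Outer: o, Inner: i})
			}
		}
	}
	return out
}

// IndexEntry maps a join key to a row position in the inner table.
type IndexEntry struct {
	Key   int
	RowID int
}

// BuildIndex returns entries sorted by key (a stand-in for an index
// that would normally already exist on the inner table).
func BuildIndex(inner []Row) []IndexEntry {
	idx := make([]IndexEntry, len(inner))
	for i, r := range inner {
		idx[i] = IndexEntry{Key: r.Key, RowID: i}
	}
	sort.Slice(idx, func(a, b int) bool { return idx[a].Key < idx[b].Key })
	return idx
}

// IndexedNestedLoopJoin does one O(log M) lookup per outer row,
// so the whole join is O(N × log M) plus the number of matches.
func IndexedNestedLoopJoin(outer, inner []Row, index []IndexEntry) []JoinedRow {
	var out []JoinedRow
	for _, o := range outer {
		// Binary search for the first entry whose key is >= the outer key.
		pos := sort.Search(len(index), func(i int) bool { return index[i].Key >= o.Key })
		for ; pos < len(index) && index[pos].Key == o.Key; pos++ {
			out = append(out, JoinedRow{Outer: o, Inner: inner[index[pos].RowID]})
		}
	}
	return out
}
```

Both functions return the same set of matches; only the lookup cost per outer row differs.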

Hash join (equi-join only):

Complexity: O(N + M) — linear
Best for: large tables with no useful join-column index, where you'd otherwise pay O(N × M)
Memory: O(min(N, M)) — must materialise the smaller ("build") table into a hash map
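
A minimal sketch of the build and probe phases, assuming an equi-join on a single integer key. The names `Row`, `JoinedRow` and `HashJoin` are hypothetical and are not taken from this PR's diff.

```go
package main

// Row is a hypothetical row: Key is the join column, Val is a payload.
type Row struct {
	Key int
	Val string
}

// JoinedRow keeps the left/right orientation of the original inputs.
type JoinedRow struct {
	Left, Right Row
}

// HashJoin performs an in-memory equi-join on Key.
// Build phase: hash the smaller input, O(min(N, M)) memory.
// Probe phase: one expected O(1) lookup per row of the larger input.
// Total expected time is O(N + M) plus the number of matches.
func HashJoin(left, right []Row) []JoinedRow {
	build, probe := left, right
	swapped := false
	if len(right) < len(left) {
		// Always build on the smaller side to keep the hash table small.
		build, probe = right, left
		swapped = true
	}

	// Build phase: group build-side rows by join key.
	table := make(map[int][]Row, len(build))
	for _, b := range build {
		table[b.Key] = append(table[b.Key], b)
	}

	// Probe phase: look up each probe-side row in the hash table.
	var out []JoinedRow
	for _, p := range probe {
		for _, b := range table[p.Key] {
			if swapped {
				out = append(out, JoinedRow{Left: p, Right: b})
			} else {
				out = append(out, JoinedRow{Left: b, Right: p})
			}
		}
	}
	return out
}
```

Rows come out in probe-side order, so the output order can differ from a nested loop join over the same inputs.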

So hash join is faster than unindexed nested loop on large tables, but it costs memory. The 64MB threshold doesn't mean "switch back to nested loop because it's better there" — it means "beyond this size, the build-side hash table may not fit in RAM, so in-memory hash join is no longer safe to use." A production DB would do grace hash join (spill to disk) instead; for minisql it's reasonable to just fall back to nested loop.

Corrected strategy for minisql:

Condition	Plan
Join column has an index	Indexed nested loop (current behaviour)
No index, build side ≤ threshold	In-memory hash join, O(N + M)
No index, build side > threshold	Sequential nested loop, O(N × M): slow but no memory risk

The threshold exists to protect memory, not because nested loop is algorithmically better at large scale; without an index it is strictly worse. So the plan should be inverted: use hash join for large-enough tables where a full scan is needed (making it worth the memory cost), and keep nested loop for small tables or indexed joins.
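
The strategy table can be sketched as a small decision function. This is an assumption-laden illustration rather than the planner code in this PR: `JoinPlan`, `ChooseJoinPlan` and `hashJoinMemoryThreshold` are hypothetical names, and the 64MB figure is interpreted here as 64 MiB of estimated build-side size.

```go
package main

// JoinPlan enumerates the three strategies from the table (hypothetical names).
type JoinPlan int

const (
	IndexedNestedLoop JoinPlan = iota
	InMemoryHashJoin
	SequentialNestedLoop
)

// hashJoinMemoryThreshold is an assumed reading of the 64MB limit: 64 MiB.
const hashJoinMemoryThreshold int64 = 64 << 20

// ChooseJoinPlan picks a strategy from two inputs:
// whether the join column is indexed, and the estimated build-side size.
func ChooseJoinPlan(hasJoinColumnIndex bool, buildSideBytes int64) JoinPlan {
	switch {
	case hasJoinColumnIndex:
		// An index makes nested loop O(N × log M) with no memory cost.
		return IndexedNestedLoop
	case buildSideBytes <= hashJoinMemoryThreshold:
		// No index, but the hash table should fit in memory: O(N + M).
		return InMemoryHashJoin
	default:
		// Too large to hash safely; accept O(N × M) to avoid memory risk.
		return SequentialNestedLoop
	}
}
```

For example, `ChooseJoinPlan(false, 10<<20)` selects the in-memory hash join, while the same call with a 1 GiB build side falls back to the sequential nested loop.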

github-actions · 2026-05-06T01:02:09Z

Code Coverage

✅ Total: 70.2% (threshold: 70%)

Package	Coverage
`github.com/RichardKnop/minisql`	80.7%
`github.com/RichardKnop/minisql/e2e_tests`	[no
`github.com/RichardKnop/minisql/internal/minisql`	69.5%
`github.com/RichardKnop/minisql/internal/parser`	84.2%
`github.com/RichardKnop/minisql/pkg/bitwise`	100.0%
`github.com/RichardKnop/minisql/pkg/lrucache`	81.7%

feat: implement hash join

9acc0fb

RichardKnop self-assigned this May 6, 2026

RichardKnop wants to merge 1 commit into main from feat/hash-join