Optimize PostgreSQL initialization performance #2237

Merged
danielaskdd merged 1 commit into HKUDS:main from yrangana:feat/optimize-postgres-initialization
Oct 21, 2025

Conversation

@yrangana
Contributor

Description

Optimizes PostgreSQL storage initialization by batching schema inspection queries. This reduces database round-trips by consolidating multiple information_schema queries into single batched queries using PostgreSQL's ANY($1) array syntax.
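To make the round-trip saving concrete, here is a minimal sketch that contrasts one query per table with a single `ANY($1)` query. The SQL text, the table names used to exercise it, the helper names, and the `FakeConnection` class are all assumptions for illustration; the PR's actual queries are not quoted in this conversation, and the fake connection only imitates a driver's `fetch` call so the example runs without a database.

```python
import asyncio

# Hypothetical query text; the PR's actual SQL is not quoted in this conversation.
PER_TABLE_SQL = (
    "SELECT table_name, column_name, data_type "
    "FROM information_schema.columns "
    "WHERE table_name = $1"
)
BATCHED_SQL = (
    "SELECT table_name, column_name, data_type "
    "FROM information_schema.columns "
    "WHERE table_name = ANY($1)"
)


class FakeConnection:
    """In-memory stand-in for a database connection that counts round-trips."""

    def __init__(self, rows):
        self.rows = rows
        self.round_trips = 0

    async def fetch(self, sql, arg):
        self.round_trips += 1
        # Emulate both "= $1" (scalar) and "= ANY($1)" (list) filters.
        wanted = set(arg) if isinstance(arg, list) else {arg}
        return [r for r in self.rows if r["table_name"] in wanted]


async def inspect_one_by_one(conn, tables):
    """Old shape: one round-trip per table."""
    rows = []
    for table in tables:
        rows.extend(await conn.fetch(PER_TABLE_SQL, table))
    return rows


async def inspect_batched(conn, tables):
    """New shape: the whole list is bound to $1, so there is one round-trip."""
    return await conn.fetch(BATCHED_SQL, list(tables))
```

Both helpers return the same rows; only the number of round-trips differs, which is where the initialization time is saved when the database is reached over a network.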

Related Issues

Addresses performance bottleneck during PostgreSQL storage initialization.

Changes Made

  1. check_tables() - Batched Index Checking
    • Consolidates index existence checks into a single query using ANY($1)
    • Reduces 16+ separate queries to 1 batched query
  2. _migrate_timestamp_columns() - Batched Column Type Checking
    • Consolidates timestamp column checks into a single query
    • Reduces 8 separate queries to 1 batched query
  3. _migrate_field_lengths() - Batched Field Definition Checking
    • Consolidates field length checks into a single query
    • Reduces 5 separate queries to 1 batched query
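As an illustration of the first item, a batched index check could be sketched roughly as follows. `pg_indexes` is a standard PostgreSQL system view, but the query text, the function name `find_missing_indexes`, and the index names in the usage example are hypothetical and not taken from the PR.

```python
# Hypothetical query; the PR's actual SQL is not shown in this conversation.
INDEX_CHECK_SQL = "SELECT indexname FROM pg_indexes WHERE indexname = ANY($1)"


def find_missing_indexes(expected_names, rows):
    """Return the expected index names absent from the batched result, in order."""
    existing = {row["indexname"] for row in rows}
    return [name for name in expected_names if name not in existing]
```

A caller would run `INDEX_CHECK_SQL` once with the full list of expected names, pass the returned rows to `find_missing_indexes`, and create only the indexes it reports:

```python
expected = ["idx_docs_id", "idx_chunks_id", "idx_chunks_doc"]  # made-up names
rows = [{"indexname": "idx_docs_id"}]  # pretend result of the single query
to_create = find_missing_indexes(expected, rows)
```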

Backward Compatibility

  • No schema changes
  • No API changes
  • All error handling preserved
  • All logging preserved

Checklist

  • Changes tested locally
  • Code reviewed
  • Documentation updated (if necessary)
  • Unit tests added (if applicable)

Additional Notes

Technical Implementation

  • Uses PostgreSQL's ANY($1) syntax for array-based filtering
  • Builds in-memory lookup dictionaries after single query
  • Maintains exact same logic flow and behavior
  • Passes ruff linting with no issues
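The "in-memory lookup dictionaries" step might look something like the sketch below. The function names, the `(table, column)` key shape, the `target_type` parameter, and the choice to skip columns that do not exist are assumptions made for illustration; the merged code may structure this differently.

```python
def build_column_lookup(rows):
    """Index batched information_schema.columns rows by (table, column)."""
    return {(r["table_name"], r["column_name"]): r["data_type"] for r in rows}


def columns_needing_migration(targets, lookup, target_type):
    """Return the (table, column) pairs that exist but do not have target_type."""
    pending = []
    for key in targets:
        current = lookup.get(key)
        if current is None:
            continue  # assumed behavior: an absent column is skipped
        if current != target_type:
            pending.append(key)
    return pending
```

After the single query, each per-column decision becomes a dictionary lookup rather than another database call, so the surrounding per-column logic can stay as it was.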

@yrangana force-pushed the feat/optimize-postgres-initialization branch 2 times, most recently from baa1939 to 410c430 on October 20, 2025 14:08
- Batch index existence checks into single query (16+ queries -> 1 query)
- Batch timestamp column checks into single query (8 queries -> 1 query)
- Batch field length checks into single query (5 queries -> 1 query)

Performance improvement: ~70-80% faster initialization (35s -> 5-10s)

Key optimizations:
1. check_tables(): Use ANY($1) to check all indexes at once
2. _migrate_timestamp_columns(): Batch all column type checks
3. _migrate_field_lengths(): Batch all field definition checks

All changes are backward compatible with no schema or API changes.
Reduces database round-trips by batching information_schema queries.
@yrangana force-pushed the feat/optimize-postgres-initialization branch from 410c430 to 2f22336 on October 20, 2025 14:09
@danielaskdd
Collaborator

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Breezy!

@danielaskdd
Collaborator

Thanks for sharing.

@danielaskdd merged commit 9072047 into HKUDS:main on Oct 21, 2025
1 check passed