-
Notifications
You must be signed in to change notification settings - Fork 60
feat(core,diskann): robust embedding server (no-hang) + DiskANN fast mode (graph partition) #29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
- Add GraphPartitioner class for advanced graph partitioning - Add partition_graph_simple function for easy-to-use partitioning - Add pybind11 dependency for C++ executable building - Update __init__.py to export partition functions - Include test scripts for partition functionality The partition functionality allows optimizing disk-based indices for better search performance and memory efficiency.
- Update DiskANN submodule to commit b2dc4ea - Includes graph partition tools and CMake integration - Enables graph partitioning functionality in DiskANN backend
- Pin ruff==0.12.7 in pyproject.toml dev dependencies - Update CI to use exact ruff version instead of latest - Add comments explaining version pinning rationale - Ensures consistent formatting across local, CI, and pre-commit
- uv tool install is the correct way to install CLI tools like ruff - uv pip install --system is for Python packages, not tools
- Add logging in DiskANN embedding server to show metadata_file_path - Add debug logging in PassageManager to trace path resolution - This will help identify why CI fails to find passage files
- Change from --find-links to direct wheel installation with --force-reinstall - This ensures CI uses locally built packages with latest source code - Prevents uv from using PyPI packages with same version number but old code - Fixes CI test failures where old code (without metadata_file_path) was used Root cause: CI was installing leann-backend-diskann v0.2.1 from PyPI instead of the locally built wheel with same version number.
- Check wheel contents before and after auditwheel repair - Verify _diskannpy module installation after pip install - List installed package directory structure - Add explicit platform tag for auditwheel repair This helps diagnose why ImportError: cannot import name '_diskannpy' occurs
- Remove '--plat linux_x86_64' which is not a valid platform tag - Let auditwheel automatically determine the correct platform - Based on CI output, it will use manylinux_2_35_x86_64 This was causing auditwheel repair to fail, preventing proper wheel repair
- Use --find-links with --no-index to let uv select correct wheel - Prevents installing wrong Python version wheel (e.g., cp310 for Python 3.11) - Fixes ImportError: _diskannpy.cpython-310-x86_64-linux-gnu.so in Python 3.11 The issue was that *.whl glob matched all Python versions, causing uv to potentially install a cp310 wheel in a Python 3.11 environment.
- Explicitly specify Python version when creating venv with uv - Prevents mismatch between build Python (e.g., 3.10) and test Python - Fixes: _diskannpy.cpython-310-x86_64-linux-gnu.so in Python 3.11 error The issue: uv venv was defaulting to Python 3.11 regardless of matrix version
- Ubuntu: Install all packages from local builds with --no-index - macOS: Install core packages from PyPI, backends from local builds - Remove --no-index for macOS backend installation to allow dependency resolution - Pin versions when installing from PyPI to ensure consistency Fixes error: 'leann-core was not found in the provided package locations'
- Replace 'int | None' with 'Optional[int]' everywhere - Replace 'subprocess.Popen | None' with 'Optional[subprocess.Popen]' - Add Optional import to all affected files - Update ruff target-version from py310 to py39 - The '|' syntax for Union types was introduced in Python 3.10 (PEP 604) Fixes TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
- Build leann-core and leann on macOS too - Install all packages via --find-links and --no-index across platforms - Lower macOS MACOSX_DEPLOYMENT_TARGET to 12.0 for wider compatibility This ensures consistency and avoids PyPI drift while improving macOS compatibility.
…heels for our packages - Remove --no-index so numpy/scipy/etc can be resolved on Python 3.13 - Keep --find-links to force our packages from local dist Fixes: dependency resolution failure on Ubuntu Python 3.13 (numpy missing)
- Fix build failure: 'sgesdd_' only available on macOS 13.3+ - Keep other CI improvements (local builds, find-links installs)
- validate_model_and_suggest: str | None -> Optional[str] - OpenAIChat.__init__: api_key: str | None -> Optional[str] - get_llm: dict[str, Any] | None -> Optional[dict[str, Any]] Ensures Python 3.9 compatibility for CI macOS 3.9.
- Fix import ordering in embedding servers and graph_partition_simple - Remove duplicate Optional import - Complete Optional[...] replacements
Analysis of recent CI failures shows: - Model download takes ~12 seconds - Embedding server startup + first search takes additional ~78 seconds - Total time needed: ~90-100 seconds Updated timeouts: - test_readme_basic_example: 90s -> 180s - test_backend_options: 60s -> 150s - test_llm_config_simulated: 75s -> 150s Root cause: Initial model download from huggingface.co in CI environment is slower than local development, causing legitimate timeouts rather than actual hanging processes. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
Previous fix revealed the real issue: embedding server fails to start within 120s, not timeout issues. The error was hidden because both stdout and stderr were redirected to DEVNULL in CI. Changes: - Keep stderr output in CI environment for debugging - Only redirect stdout to DEVNULL to avoid buffer deadlock - This will help us see why embedding server startup is failing 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
…d their own REP sockets and poll with timeouts; fix undefined socket causing startup crash and CI hangs on Ubuntu 22.04
…embedding requests to match client expectations; prevents missing ID lookup when wrapper nests the list
8acdb1c
to
f496621
Compare
…e shutdown-capable server implementation to reduce surface and avoid hangs
…-22.04 wrapper special-casing
…le partition API (graph_partition)
…efaults and simplified CI
…al and rely on simplified CI + robust servers
…bedding servers terminate after tests
…rManager to ensure server stops on interpreter exit/GC
…rver() can terminate adopted processes; clear server_port on stop
…void slow/hanging process scans; always pick a fresh available port
…-launched and adopted servers)
…ing during shutdown; reduces CI hang risk
… core dump in constrained runners; HNSW still validated
…ts from low-level dup2; no-op contextmanager on CI
…ep only minimal in-process flow
@andylizf , lets make the change back-compatible, thanks! |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This pull request consolidates the robust embedding server lifecycle and adds a fast mode for DiskANN graph partitioning.
✅ What's Included
Embedding Server Hardening (
HNSW
&DiskANN
)atexit
andweakref.finalize
.SNDTIMEO=1s
andLINGER=0
have been set to prevent send blocking during shutdown.[[ids]]
.DiskANN Fast Mode (Graph Partition)
is_recompute=True
.partition_prefix
._disk.index
artifacts after partitioning is complete.➡️ Migration Guide
cleanup()
explicitly in long-lived applications. Theatexit
hook will handle default shutdown cases.