feat: add optional message embedding for semantic search (config-driven) #18059
Open
gzsiang wants to merge 2 commits into NousResearch:main
trevorgordon981 approved these changes on Apr 30, 2026
LGTM. This is a clean, opt-in addition for semantic search over conversation history. The implementation is thoughtful:
- Schema v12 adds a `message_embedding BLOB` column (backward-compatible via existing reconciliation)
- Config-driven: `embedding.base_url`/`endpoint` in `~/.hermes/config.yaml`, disabled by default (zero overhead)
- Hybrid search: FTS5 BM25 + cosine similarity re-ranking (0.3/0.7 weights), with pure vector fallback
- Graceful: `_compute_embedding` short-circuits if `base_url` is unset, so no DNS calls are made
Verified:
- Schema migration works (v12 columns present)
- Append with embedding disabled: PASS
- 195/198 state tests pass — 3 failures are expected (schema version assertions need updating from 11→12)
The PR is ready once the 3 test assertions are bumped to expect v12. Nice work.
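For readers unfamiliar with hybrid re-ranking, the following is a minimal sketch of how a 0.3/0.7 blend could work. It is not the PR's code: the function names are hypothetical, the mapping of 0.3 to the keyword score and 0.7 to cosine similarity is an assumption, and so is the min-max normalization of the FTS5 `bm25()` values (which SQLite reports as lower-is-better).

```python
from typing import List, Optional, Tuple

# Assumed mapping of the 0.3/0.7 weights mentioned in the review:
# 0.3 for the keyword (BM25) score, 0.7 for cosine similarity.
BM25_WEIGHT = 0.3
COSINE_WEIGHT = 0.7

# (message_id, raw FTS5 bm25 value, cosine similarity or None if no embedding)
Hit = Tuple[int, float, Optional[float]]


def normalize_bm25(raw_scores: List[float]) -> List[float]:
    """Map FTS5 bm25() values (lower is better) onto [0, 1], higher is better."""
    if not raw_scores:
        return []
    best, worst = min(raw_scores), max(raw_scores)
    if best == worst:
        return [1.0] * len(raw_scores)
    return [(worst - score) / (worst - best) for score in raw_scores]


def hybrid_rerank(hits: List[Hit]) -> List[Tuple[int, float]]:
    """Return (message_id, blended_score) pairs, best first."""
    keyword_scores = normalize_bm25([raw for _, raw, _ in hits])
    blended = []
    for (message_id, _raw, cosine), keyword in zip(hits, keyword_scores):
        # Assumption: a message without a stored embedding contributes 0.0
        # for the vector part, so it is ranked on its keyword score alone.
        vector = cosine if cosine is not None else 0.0
        blended.append((message_id, BM25_WEIGHT * keyword + COSINE_WEIGHT * vector))
    return sorted(blended, key=lambda pair: pair[1], reverse=True)
```

With these weights, a message that matches the keywords only moderately but is semantically close to the query can outrank the strongest keyword match.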
Author
Thanks for the review! Updated the 3 schema version test assertions to expect v12.
afac1cd to 8d5031c
Add support for computing and storing embedding vectors (packed float32)
for assistant messages, enabling cosine-similarity-based hybrid search.
- Schema v12: add message_embedding BLOB column (auto-reconciled)
- _try_load_embedding_config: load endpoint from ~/.hermes/config.yaml
- _compute_embedding: call Ollama-compatible /v1/embeddings API
- _cosine_similarity: re-rank FTS5 results using vector similarity
- Graceful fallback: no embedding configured = pure FTS5, no overhead
- Vector-only search when FTS5 returns no results
Usage: set embedding.base_url in ~/.hermes/config.yaml:
embedding:
  base_url: http://your-server:8081
  model: Qwen3-Embedding-0.6B
  dimension: 1024
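As an illustration of the "packed float32" storage and cosine similarity named in the commit message, here is a minimal standard-library sketch. The helper names are hypothetical and the little-endian byte order is an assumption; the actual `hermes_state.py` code may differ.

```python
import math
import struct
from typing import List


def pack_embedding(vector: List[float]) -> bytes:
    """Serialize a vector as consecutive float32 values (little-endian assumed)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_embedding(blob: bytes) -> List[float]:
    """Inverse of pack_embedding; each float32 occupies 4 bytes."""
    if len(blob) % 4 != 0:
        raise ValueError("embedding blob length must be a multiple of 4 bytes")
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between a and b; 0.0 for empty, mismatched, or zero vectors."""
    if not a or len(a) != len(b):
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

At the configured dimension of 1024, a vector packed this way occupies 4096 bytes per message in the BLOB column.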
8d5031c to a4cfd7f
gzsiang added a commit to gzsiang/hermes-agent that referenced this pull request on May 4, 2026
Added Chinese description of fork features:
- Circuit breaker (NousResearch#16749)
- CLI Chinese localization (NousResearch#15282)
- Message embedding (NousResearch#18059)
- Emergency compression (NousResearch#18607)
Author
Friendly ping: this PR has been approved and the requested changes addressed. Happy to rebase or make any additional tweaks if needed. Would love to get this merged when you have a chance! 🙏
Motivation
Store embedding vectors alongside each assistant message to enable future semantic search over conversation history. The feature is fully opt-in — no embedding endpoint configured = zero overhead.
Changes
Only one file modified: `hermes_state.py` (177 insertions, 8 deletions).
- Adds a `message_embedding BLOB` column to the `messages` table (auto-reconciled by existing column reconciliation)
- Reads `embedding.base_url`/`embedding.endpoint` from `~/.hermes/config.yaml`; defaults to None (disabled)
- `append_message()` computes and stores embedding vectors (packed float32) for assistant messages when configured
- `search_messages()` re-ranks FTS5 results using cosine similarity when embeddings are available; falls back to pure vector search when FTS returns nothing
- When `embedding.base_url` is not set, `_compute_embedding` returns None immediately: no DNS calls, no overhead
How to use
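Per the commit message, the feature is enabled by setting `embedding.base_url` (plus `model` and `dimension`) under `embedding:` in `~/.hermes/config.yaml`. The sketch below shows one way such a config could gate an embedding request. It is not the PR's code: the function names are hypothetical, treating `endpoint` as an alternative key for the server URL is an assumption, and the `{"model", "input"}` payload follows the common OpenAI-style `/v1/embeddings` shape rather than anything confirmed here. The config is passed in as an already-parsed dict, and the request is only built, never sent.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def load_embedding_config(config: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the embedding settings, or None when the feature is not configured."""
    section = config.get("embedding") or {}
    # Assumption: `endpoint` is accepted as an alternative to `base_url`.
    base_url = section.get("base_url") or section.get("endpoint")
    if not base_url:
        return None  # disabled by default
    return {
        "base_url": str(base_url).rstrip("/"),
        "model": section.get("model"),
        "dimension": section.get("dimension"),
    }


def build_embedding_request(
    text: str, embedding_cfg: Optional[Dict[str, Any]]
) -> Optional[urllib.request.Request]:
    """Build (but do not send) a POST to /v1/embeddings; None when disabled."""
    if embedding_cfg is None:
        # Short-circuit before any network-related work.
        return None
    payload = {"model": embedding_cfg["model"], "input": text}
    return urllib.request.Request(
        embedding_cfg["base_url"] + "/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Returning None before constructing any request mirrors the opt-in behaviour described above: with no URL configured, nothing touches the network.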
Notes