Skip to content

(EAI-989): Refactor verified answers to wrap GenerateResponse #688

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 28, 2025

Conversation

mongodben
Copy link
Collaborator

Jira: https://jira.mongodb.org/browse/EAI-989

Changes

  • Refactor verified answers to wrap GenerateResponse
  • Add this to our server

Notes

@mongodben mongodben changed the base branch from main to EAI-988 April 28, 2025 18:02
@mongodben mongodben merged commit a68391f into EAI-988 Apr 28, 2025
1 check failed
@mongodben mongodben deleted the EAI-989 branch April 28, 2025 19:42
mongodben added a commit that referenced this pull request May 27, 2025
* refactor GenerateRespose

* Clean up imports

* consolidate generate user prompt to the legacy file

* update test config imports

* Fix broken tests

* (EAI-989): Refactor verified answers to wrap `GenerateResponse` (#688)

verified answer generate response

Co-authored-by: Ben Perlmutter <[email protected]>

* handle streaming

* separate generateresponse

* typo fix

---------

Co-authored-by: Ben Perlmutter <[email protected]>
mongodben added a commit that referenced this pull request Jun 11, 2025
* (EAI-988): Refactor `GenerateResponse` for tool call support  (#687)

* refactor GenerateRespose

* Clean up imports

* consolidate generate user prompt to the legacy file

* update test config imports

* Fix broken tests

* (EAI-989): Refactor verified answers to wrap `GenerateResponse` (#688)

verified answer generate response

Co-authored-by: Ben Perlmutter <[email protected]>

* handle streaming

* separate generateresponse

* typo fix

---------

Co-authored-by: Ben Perlmutter <[email protected]>

* (EAI-990): Refactor search as a tool  (#705)

* refactor GenerateRespose

* Clean up imports

* consolidate generate user prompt to the legacy file

* update test config imports

* Fix broken tests

* get started

* nominally working generate res w/ search

* small refactors

* aint pretty but fully functional

* hacky if more functional

* more hack

* tools

* functional if not pretty

* Add processing

* working tool calling

* making progress

* keepin on

* Clean config

* working e2e

* update model version

* Remove no longer used stuff

* decouple search results for references and whats shown to model

* fix scripts build errs

* fix broken tests

* update default ref links

* fix broken tests

* Apply suggestions from code review

Co-authored-by: Nick Larew <[email protected]>

* revert default reference links

* adding missing test

---------

Co-authored-by: Ben Perlmutter <[email protected]>
Co-authored-by: Nick Larew <[email protected]>

* (EAI-992): Remove `ChatLlm` (#751)

* refactor GenerateRespose

* Clean up imports

* consolidate generate user prompt to the legacy file

* update test config imports

* Fix broken tests

* get started

* nominally working generate res w/ search

* small refactors

* aint pretty but fully functional

* hacky if more functional

* more hack

* tools

* functional if not pretty

* Add processing

* working tool calling

* making progress

* keepin on

* Clean config

* working e2e

* update model version

* Remove no longer used stuff

* decouple search results for references and whats shown to model

* fix scripts build errs

* remove ChatLlm

* lite fixes

* Remove stub

---------

Co-authored-by: Ben Perlmutter <[email protected]>

* (EAI-993): deprecate framework (#752)

* refactor GenerateRespose

* Clean up imports

* consolidate generate user prompt to the legacy file

* update test config imports

* Fix broken tests

* get started

* nominally working generate res w/ search

* small refactors

* aint pretty but fully functional

* hacky if more functional

* more hack

* tools

* functional if not pretty

* Add processing

* working tool calling

* making progress

* keepin on

* Clean config

* working e2e

* update model version

* Remove no longer used stuff

* decouple search results for references and whats shown to model

* fix scripts build errs

* fix broken tests

* deprecation

* build out docs following last week convo

* clean up spec + contact

* fix merge funk

* docs updates

---------

Co-authored-by: Ben Perlmutter <[email protected]>

* (EAI-1071): Fix broken Atlas OpenAPI ingest (#765)

* update to fix broken test

* Update packages/ingest-mongodb-public/src/sources/snooty/snootyAstToOpenApiSpec.ts

---------

Co-authored-by: Ben Perlmutter <[email protected]>

* (EAI-995): add guardrail (#755)

* refactor GenerateRespose

* Clean up imports

* consolidate generate user prompt to the legacy file

* update test config imports

* Fix broken tests

* get started

* nominally working generate res w/ search

* small refactors

* aint pretty but fully functional

* hacky if more functional

* more hack

* tools

* functional if not pretty

* Add processing

* working tool calling

* making progress

* keepin on

* Clean config

* working e2e

* update model version

* Remove no longer used stuff

* decouple search results for references and whats shown to model

* fix scripts build errs

* fix broken tests

* update default ref links

* fix broken tests

* input guardrail refactor

* guardrail works well

* simpler validity metric

* add guardrail to server

* add next step todo

* llm refusal msg

* remove TODO comment

* merge fix

* fix unnec changes

* NL feedback

---------

Co-authored-by: Ben Perlmutter <[email protected]>

* fix type in text

* (EAI-991 & EAI-1050): Evaluate and clean up retrieval as a tool (#757)

* refactor GenerateRespose

* Clean up imports

* consolidate generate user prompt to the legacy file

* update test config imports

* Fix broken tests

* get started

* nominally working generate res w/ search

* small refactors

* aint pretty but fully functional

* hacky if more functional

* more hack

* tools

* functional if not pretty

* Add processing

* working tool calling

* making progress

* keepin on

* Clean config

* working e2e

* update model version

* Remove no longer used stuff

* decouple search results for references and whats shown to model

* fix scripts build errs

* fix broken tests

* update default ref links

* fix broken tests

* input guardrail refactor

* guardrail works well

* simpler validity metric

* add guardrail to server

* add next step todo

* llm refusal msg

* remove TODO comment

* evals on new architecture

* Get urls in a way that supports verified answers

* dont eval on retrieved elems if no context

* Cleaner handling

* update trace handling

* update trace handling

* undo git funk

* handle undefined case

* Fix tracing test

---------

Co-authored-by: Ben Perlmutter <[email protected]>

* remove console logs + redunancies

---------

Co-authored-by: Ben Perlmutter <[email protected]>
Co-authored-by: Nick Larew <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants