
Conversation

Contributor

@yunfeng-scale commented Mar 1, 2024

Pull Request Summary

  • Partial open source of tool_completion
  • Adapt batch inference to run the tool completion loop (generations -> tool run); a rough sketch of this loop follows below
  • Add tool config to the batch inference API
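
A rough sketch of that generations -> tool loop (the helper names generate_batch and should_run, and the tool call signature, are illustrative here, not the actual identifiers in this PR):

from typing import Callable, List

def run_tool_completion_loop(
    prompts: List[str],
    generate_batch: Callable[[List[str]], List[str]],  # one batched model inference step (e.g. a vLLM call)
    tool,                                               # tool instance, e.g. CodeBlockEvaluator()
    max_iterations: int = 5,
) -> List[str]:
    """Alternate between model generation and tool execution until no text triggers the tool."""
    texts = list(prompts)
    for _ in range(max_iterations):
        generations = generate_batch(texts)
        any_tool_ran = False
        for i, generation in enumerate(generations):
            texts[i] += generation
            if tool.should_run(texts[i]):  # e.g. the generation ended with a runnable code block
                texts[i], _num_tool_tokens = tool(texts[i])
                any_tool_ran = True
        if not any_tool_ran:
            break  # no tool calls left; the generations are final
    return texts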

For notekeeping, some things to fix:

  1. vLLM stop sequences do not cut token logprobs, so extra stop-sequence token logprobs show up in the output
  2. the tool should not use a hardcoded tokenizer; it should use whatever tokenizer is provided together with the model weights (a minimal sketch follows below)
  3. token logprobs are not captured for tool output
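
For item 2, a minimal sketch of loading the tokenizer that ships with the model weights instead of a hardcoded one (this assumes a Hugging Face style checkpoint directory; the path below is illustrative):

from transformers import AutoTokenizer

def load_tool_tokenizer(model_weights_dir: str):
    # Use the tokenizer stored alongside the model weights so the tool's token
    # accounting matches whichever model the batch job is actually running.
    return AutoTokenizer.from_pretrained(model_weights_dir)

tokenizer = load_tool_tokenizer("/path/to/model-weights")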

Test Plan and Usage Guide

  • Some local testing
  • Tested running jobs in the training cluster
  • Added unit tests
  • vLLM batch image deployed

@yunfeng-scale requested review from auag92, Georgepu1 and a team on March 1, 2024 05:06

@Georgepu1 left a comment

Overall LGTM, and it's nice that tool_completion currently mirrors the latest version; just a couple of nits. Thanks for adding this feature for batch completion calls; we'll add it into our inference pipeline.



TOOL_MAP = {
    Tools.CODE_EVALUATOR: CodeBlockEvaluator,
}

Contributor

nit: move each tool to its own file in case more tools are added in the future
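
One hypothetical way that per-file layout could look (the module and file names below are illustrative, not from this PR):

# tools/code_block_evaluator.py  (hypothetical per-tool module)
class CodeBlockEvaluator:
    """Finds code blocks in a generation, runs them, and appends the result."""

# tools/registry.py  (hypothetical registry module that only wires tools together)
from enum import Enum

from .code_block_evaluator import CodeBlockEvaluator

class Tools(str, Enum):
    CODE_EVALUATOR = "code_evaluator"

TOOL_MAP = {
    Tools.CODE_EVALUATOR: CodeBlockEvaluator,
}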

    self.evaluate = self.evaluate_code_in_docker
except docker.errors.DockerException:
    # If docker is not available, use the python interpreter
    self.evaluate = self.evaluate_code_using_exec

Contributor

Is this safe for the batch job? It seems that potentially harmful generations could break this.

Contributor

(Synced offline) Known risk. For now, since the main use is internal, the risk should be low and this is okay; there will be follow-ups to mitigate the issues.

Contributor Author

I think for batch jobs the risk is more manageable.
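
One possible shape for the follow-up mitigation mentioned above (a sketch only, assuming the non-Docker fallback keeps executing untrusted generations on the host; the function name is illustrative): run the snippet in a separate interpreter process with a timeout rather than exec in the batch worker's own process.

import subprocess
import sys

def evaluate_code_in_subprocess(code: str, timeout_s: float = 10.0) -> str:
    # Run generated code in a separate Python process with a timeout so a hung or
    # crashing snippet cannot take down the batch worker itself.
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        return "[tool error] code execution timed out"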
