Validate quantization #315
Conversation
model-engine/model_engine_server/domain/use_cases/llm_model_endpoint_use_cases.py (two resolved review threads, now outdated)
@@ -73,5 +74,5 @@ def get_boolean_env_var(name: str) -> bool:
     logger.warning("LOCAL development & testing mode is ON")


 GIT_TAG: str = os.environ.get("GIT_TAG", "GIT_TAG_NOT_FOUND")
-if GIT_TAG == "GIT_TAG_NOT_FOUND":
+if GIT_TAG == "GIT_TAG_NOT_FOUND" and "pytest" not in sys.modules:
make pytest work without specifying GIT_TAG
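The diff above can be sketched in isolation. The exception type and the wrapping into a helper function are assumptions for illustration, since the rest of the original module is not shown; the key idea is that once pytest starts, the "pytest" module appears in sys.modules, so test runs can be detected without extra configuration:

```python
import os
import sys

# GIT_TAG normally comes from the deployment environment.
GIT_TAG: str = os.environ.get("GIT_TAG", "GIT_TAG_NOT_FOUND")


def require_git_tag(git_tag: str = GIT_TAG) -> None:
    """Fail fast on a missing GIT_TAG, except during a pytest run.

    "pytest" is present in sys.modules once the test runner has been
    imported, so tests no longer need GIT_TAG to be set. (Raising
    ValueError here is an assumption; the original exception is not shown.)
    """
    if git_tag == "GIT_TAG_NOT_FOUND" and "pytest" not in sys.modules:
        raise ValueError("GIT_TAG environment variable must be set")
```

A usage note: checking `sys.modules` avoids threading a "test mode" flag through configuration, at the cost of making the behavior implicit.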
        LLMInferenceFramework.DEEPSPEED: [],
        LLMInferenceFramework.TEXT_GENERATION_INFERENCE: [Quantization.BITSANDBYTES],
        LLMInferenceFramework.VLLM: [Quantization.AWQ],
        LLMInferenceFramework.LIGHTLLM: [],
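The mapping above drives the validation this PR adds. A self-contained sketch of how such a map could be used, with enum values and the validation function name assumed for illustration (the original use case raises a domain exception, shown here as ValueError):

```python
from enum import Enum
from typing import Optional


class Quantization(str, Enum):
    BITSANDBYTES = "bitsandbytes"
    AWQ = "awq"


class LLMInferenceFramework(str, Enum):
    DEEPSPEED = "deepspeed"
    TEXT_GENERATION_INFERENCE = "text_generation_inference"
    VLLM = "vllm"
    LIGHTLLM = "lightllm"


# Mirrors the mapping in the diff: which quantization schemes
# each inference framework supports.
SUPPORTED_QUANTIZATIONS = {
    LLMInferenceFramework.DEEPSPEED: [],
    LLMInferenceFramework.TEXT_GENERATION_INFERENCE: [Quantization.BITSANDBYTES],
    LLMInferenceFramework.VLLM: [Quantization.AWQ],
    LLMInferenceFramework.LIGHTLLM: [],
}


def validate_quantization(
    framework: LLMInferenceFramework, quantize: Optional[Quantization]
) -> None:
    """Reject endpoint creation when the requested quantization scheme
    is not supported by the chosen inference framework."""
    if quantize is not None and quantize not in SUPPORTED_QUANTIZATIONS[framework]:
        raise ValueError(
            f"Quantization {quantize} is not supported for framework {framework}. "
            f"Supported values: {SUPPORTED_QUANTIZATIONS[framework]}."
        )
```

Leaving `quantize=None` valid for every framework keeps unquantized endpoints working unchanged.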
Probably best for a separate PR, but can you update the docs to specify which models in the model zoo support LightLLM as an inference framework?
        )
        if num_shards > gpus:
            raise ObjectHasInvalidValueException(
                f"Num shard {num_shards} must be less than or equal to the number of GPUs {gpus}."
nit: could mention the inference framework in the error msg
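The nit applied to the check above might look like the following sketch; the function name and the use of ValueError (in place of the original domain exception) are assumptions for a runnable example:

```python
def check_num_shards(num_shards: int, gpus: int, framework: str) -> None:
    """Hypothetical version of the shard-count check that names the
    inference framework in the error message, per the review nit."""
    if num_shards > gpus:
        raise ValueError(
            f"Num shard {num_shards} must be less than or equal to the number "
            f"of GPUs {gpus} for framework {framework}."
        )
```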
Validate quantization values when creating endpoints