@yunfeng-scale yunfeng-scale commented Oct 10, 2023

Validate quantization values when creating endpoints

@@ -73,5 +74,5 @@ def get_boolean_env_var(name: str) -> bool:
logger.warning("LOCAL development & testing mode is ON")

GIT_TAG: str = os.environ.get("GIT_TAG", "GIT_TAG_NOT_FOUND")
-if GIT_TAG == "GIT_TAG_NOT_FOUND":
+if GIT_TAG == "GIT_TAG_NOT_FOUND" and "pytest" not in sys.modules:
Contributor Author

make pytest work without specifying GIT_TAG
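The changed condition can be sketched as a small predicate. This is only an illustration: the function name `git_tag_is_missing_outside_pytest` and its parameters are hypothetical, and what the original code does when the condition is true is not shown in this hunk.

```python
import os
import sys
from typing import Mapping

GIT_TAG_NOT_FOUND = "GIT_TAG_NOT_FOUND"


def git_tag_is_missing_outside_pytest(
    environ: Mapping[str, str],
    loaded_modules: Mapping[str, object],
) -> bool:
    """Return True only when GIT_TAG is unset and pytest has not been imported.

    Hypothetical helper mirroring the condition in the diff; the environment
    and module table are passed in so the check is easy to exercise.
    """
    git_tag = environ.get("GIT_TAG", GIT_TAG_NOT_FOUND)
    return git_tag == GIT_TAG_NOT_FOUND and "pytest" not in loaded_modules


if __name__ == "__main__":
    # In real use the process environment and sys.modules would be passed.
    print(git_tag_is_missing_outside_pytest(os.environ, sys.modules))
```

Checking `sys.modules` for `"pytest"` works because the test runner imports the `pytest` package before it collects and imports the modules under test.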

LLMInferenceFramework.DEEPSPEED: [],
LLMInferenceFramework.TEXT_GENERATION_INFERENCE: [Quantization.BITSANDBYTES],
LLMInferenceFramework.VLLM: [Quantization.AWQ],
LLMInferenceFramework.LIGHTLLM: [],
Contributor

Probably best for a separate PR, but can you update the docs to specify which models in the model zoo support LightLLM as an inference framework?
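A minimal sketch of how a mapping like the one in this hunk could drive validation at endpoint creation. The enum string values, the name `SUPPORTED_QUANTIZATIONS`, the function `validate_quantization`, and the local stand-in for `ObjectHasInvalidValueException` are assumptions for illustration, not the code merged in this PR.

```python
from enum import Enum
from typing import Dict, List, Optional


class ObjectHasInvalidValueException(ValueError):
    """Local stand-in for the exception type named in the diff."""


class LLMInferenceFramework(Enum):
    # String values are assumed; only the member names appear in the diff.
    DEEPSPEED = "deepspeed"
    TEXT_GENERATION_INFERENCE = "text_generation_inference"
    VLLM = "vllm"
    LIGHTLLM = "lightllm"


class Quantization(Enum):
    BITSANDBYTES = "bitsandbytes"
    AWQ = "awq"


# Same framework-to-quantization pairs as shown in the diff.
SUPPORTED_QUANTIZATIONS: Dict[LLMInferenceFramework, List[Quantization]] = {
    LLMInferenceFramework.DEEPSPEED: [],
    LLMInferenceFramework.TEXT_GENERATION_INFERENCE: [Quantization.BITSANDBYTES],
    LLMInferenceFramework.VLLM: [Quantization.AWQ],
    LLMInferenceFramework.LIGHTLLM: [],
}


def validate_quantization(
    framework: LLMInferenceFramework,
    quantization: Optional[Quantization],
) -> None:
    """Raise if the requested quantization is not listed for the framework."""
    if quantization is None:
        return  # No quantization requested, nothing to check.
    if quantization not in SUPPORTED_QUANTIZATIONS[framework]:
        raise ObjectHasInvalidValueException(
            f"Quantization {quantization.value} is not supported for "
            f"inference framework {framework.value}."
        )
```

With this layout, supporting a new framework or quantization scheme is a one-line change to the mapping, and the check itself stays untouched.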

)
if num_shards > gpus:
    raise ObjectHasInvalidValueException(
        f"Num shard {num_shards} must be less than or equal to the number of GPUs {gpus}."
Contributor

nit: could mention the inference framework in the error msg
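One way the suggestion could look, as a sketch only: the function name `validate_num_shards`, its `framework_name` parameter, the extended message wording, and the local stand-in for `ObjectHasInvalidValueException` are assumptions, not what was merged.

```python
class ObjectHasInvalidValueException(ValueError):
    """Local stand-in for the exception type named in the diff."""


def validate_num_shards(num_shards: int, gpus: int, framework_name: str) -> None:
    """Reject shard counts that exceed the GPU count, naming the framework."""
    if num_shards > gpus:
        raise ObjectHasInvalidValueException(
            f"Num shard {num_shards} must be less than or equal to the number "
            f"of GPUs {gpus} for inference framework {framework_name}."
        )
```

For example, `validate_num_shards(4, 2, "vllm")` raises, while `validate_num_shards(2, 2, "vllm")` returns without error.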

@yunfeng-scale yunfeng-scale enabled auto-merge (squash) October 11, 2023 19:15
@yunfeng-scale yunfeng-scale merged commit 60ac144 into main Oct 12, 2023
@yunfeng-scale yunfeng-scale deleted the yunfeng-validate-quantization branch October 12, 2023 01:08
4 participants