Skip to content

Conversation

squeakymouse
Copy link
Contributor

No description provided.

@@ -507,6 +507,7 @@ def get_endpoint_resource_arguments_from_request(
main_env = []
if isinstance(flavor, RunnableImageLike) and flavor.env:
main_env = [{"name": key, "value": value} for key, value in flavor.env.items()]
main_env.append({"name": "AWS_PROFILE", "value": build_endpoint_request.aws_role})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we put a test for this (to test various merging logic with user-provided aws profiles)

@@ -10,7 +10,7 @@

def start_server():
parser = argparse.ArgumentParser()
parser.add_argument("--graceful-timeout", type=int, default=600)
parser.add_argument("--graceful-timeout", type=int, default=1800)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what's being used in async tasks?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought we are going to patch llm-engine/model-engine/model_engine_server/inference/async_inference/celery.py to listen to sigterm?

Copy link
Contributor

@seanshi-scale seanshi-scale Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah iirc this code is used to serve the user container for both sync and async tasks for artifact-like bundles, and async_inference/celery.py shouldn't be used anymore for the user containers at least (not sure about celery-forwarder though)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'd also have to patch celery-forwarder to listen to sigterm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what's being used in async tasks?

This is what Frances's async endpoint uses

we'd also have to patch celery-forwarder to listen to sigterm

From the celery documentation: "When shutdown is initiated the worker will finish all currently executing tasks before it actually terminates" which I think means we're good?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possible to simulate such an scenario by sending traffic while restarting a pod?

@squeakymouse squeakymouse merged commit 4817ffc into main Oct 3, 2023
@squeakymouse squeakymouse deleted the katiewu/increase-graceful-timeout-and-aws-profile branch October 3, 2023 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants