Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -77,7 +77,11 @@ data:
spec:
affinity:
{{- include "modelEngine.serviceTemplateAffinity" . | nindent 12 }}
{{- if eq $mode "async" }}
terminationGracePeriodSeconds: 1800
{{- else }}
terminationGracePeriodSeconds: 600
{{- end }}
{{- if $service_template_service_account_name }}
serviceAccount: {{ $service_template_service_account_name }}
{{- else }}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@

def start_server():
parser = argparse.ArgumentParser()
parser.add_argument("--graceful-timeout", type=int, default=600)
parser.add_argument("--graceful-timeout", type=int, default=1800)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what's being used in async tasks?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought we are going to patch llm-engine/model-engine/model_engine_server/inference/async_inference/celery.py to listen to sigterm?

Copy link
Contributor

@seanshi-scale seanshi-scale Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah iirc this code is used to serve the user container for both sync and async tasks for artifact-like bundles, and async_inference/celery.py shouldn't be used anymore for the user containers at least (not sure about celery-forwarder though)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'd also have to patch celery-forwarder to listen to sigterm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what's being used in async tasks?

This is what Frances's async endpoint uses

we'd also have to patch celery-forwarder to listen to sigterm

From the celery documentation: "When shutdown is initiated the worker will finish all currently executing tasks before it actually terminates" which I think means we're good?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possible to simulate such an scenario by sending traffic while restarting a pod?

args, extra_args = parser.parse_known_args()

# TODO: HTTPS
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -507,6 +507,7 @@ def get_endpoint_resource_arguments_from_request(
main_env = []
if isinstance(flavor, RunnableImageLike) and flavor.env:
main_env = [{"name": key, "value": value} for key, value in flavor.env.items()]
main_env.append({"name": "AWS_PROFILE", "value": build_endpoint_request.aws_role})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we put a test for this (to test various merging logic with user-provided aws profiles)


infra_service_config_volume_mount_path = "/infra-config"
forwarder_config_file_name = "service--forwarder.yaml"
Expand Down
1 change: 1 addition & 0 deletions model-engine/requirements-test.txt
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ pytest-mypy==0.9.1
pytest-mypy-plugins==1.10.1
pytest-asyncio==0.20.1
pytest-pylint==0.18.0
pylint<3.0.0
types-cachetools==5.3.0.5
types-croniter==1.4.0.0
types-PyYAML==6.0.7
Expand Down