Description
Is your feature request related to a problem? Please describe.
We’ve had a handful of reports of old models and endpoint configs being reused even though the customer wanted to deploy to a new instance type (e.g. #987). In addition, simply retrying a deploy() call fails if there are leftover resources from a failed deployment. While this can currently be handled by specifying update_endpoint=True and/or providing a new endpoint name, this workflow is hard to discover and a little clunky.
Describe the solution you'd like
Let's flip the default so that each deploy() call creates new resources with newly generated names. This should support the following:
estimator.deploy(1, "ml.p3.16xlarge")
# user runs into account limit, so tries a smaller instance type
estimator.deploy(1, "ml.p3.2xlarge")
The update_endpoint parameter is then needed only when a user specifically wants to update an existing endpoint. We can move the logic for updating existing model and endpoint resources to the model and predictor classes, respectively.
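To make the proposed default concrete, here is a rough sketch of what "new generated names each time" could look like. It is not the SDK's implementation: generate_resource_name and FakeEstimator are hypothetical stand-ins that only record names and call no AWS service, and the 63-character limit is an assumption about the endpoint name length constraint.

```python
import time
import uuid


def generate_resource_name(base, max_length=63):
    """Hypothetical helper: return `base` plus a timestamp and a short random suffix.

    The 63-character default is an assumed name-length limit.
    """
    suffix = "{}-{}".format(
        time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime()),
        uuid.uuid4().hex[:8],
    )
    # Trim the base so the full name fits, and avoid a doubled hyphen.
    trimmed = base[: max_length - len(suffix) - 1].rstrip("-")
    return "{}-{}".format(trimmed, suffix)


class FakeEstimator:
    """Stand-in for an estimator; it records what it would create."""

    def __init__(self, base_name):
        self.base_name = base_name
        self.created = []  # list of (endpoint_name, instance_type)

    def deploy(self, initial_instance_count, instance_type, endpoint_name=None):
        # Proposed default: a fresh name per call unless the caller supplies one.
        name = endpoint_name or generate_resource_name(self.base_name)
        self.created.append((name, instance_type))
        return name


estimator = FakeEstimator("my-training-job")
first = estimator.deploy(1, "ml.p3.16xlarge")
# A retry with a smaller instance type gets its own name, so it cannot
# collide with resources left over from the first attempt.
second = estimator.deploy(1, "ml.p3.2xlarge")
```

Under this sketch, a retry never touches resources from an earlier attempt. Updating an existing endpoint would remain an explicit, separate action.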
Describe alternatives you've considered
There are some alternative proposals in #987.