[Bug Report] You are forcing Jumpstart to use ml.p4d.24xlarge even when instance_type is specified #4666

@math-sasso

Description
Link to the notebook
In the code below I am explicitly passing a different instance type for the endpoint where I want to deploy my trained model:

finetuned_predictor = estimator.deploy(
    instance_type='ml.g5.48xlarge',  # NOTE: It is ignoring the instance type I pass here and always deploying with ml.p4d.24xlarge, which is, by the way, the most expensive option
    tags=deployment_endpoint_tags,
    endpoint_name=desired_endpoint_name
)

Describe the bug

You are forcing JumpStart to use ml.p4d.24xlarge even when instance_type is specified. It happens in the deploy method of the `JumpStartEstimator` class, where `get_deploy_kwargs` is called. I needed to apply the following change to fix it:

estimator_deploy_kwargs = get_deploy_kwargs(
            model_id=self.model_id,
            model_version=self.model_version,
            region=self.region,
            tolerate_vulnerable_model=self.tolerate_vulnerable_model,
            tolerate_deprecated_model=self.tolerate_deprecated_model,
            initial_instance_count=initial_instance_count,
            instance_type=instance_type,
            serializer=serializer,
            deserializer=deserializer,
            accelerator_type=accelerator_type,
            endpoint_name=endpoint_name,
            tags=format_tags(tags),
            kms_key=kms_key,
            wait=wait,
            data_capture_config=data_capture_config,
            async_inference_config=async_inference_config,
            serverless_inference_config=serverless_inference_config,
            volume_size=volume_size,
            model_data_download_timeout=model_data_download_timeout,
            container_startup_health_check_timeout=container_startup_health_check_timeout,
            inference_recommendation_id=inference_recommendation_id,
            explainer_config=explainer_config,
            image_uri=image_uri,
            role=role,
            predictor_cls=predictor_cls,
            env=env,
            model_name=model_name,
            vpc_config=vpc_config,
            sagemaker_session=sagemaker_session,
            enable_network_isolation=enable_network_isolation,
            model_kms_key=model_kms_key,
            image_config=image_config,
            source_dir=source_dir,
            code_location=code_location,
            entry_point=entry_point,
            container_log_level=container_log_level,
            dependencies=dependencies,
            git_config=git_config,
            use_compiled_model=use_compiled_model,
            training_instance_type=self.instance_type,
        )

        # NOTE: Workaround added by me (Matheus): force the user-supplied instance_type.
        estimator_deploy_kwargs.instance_type = instance_type
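The behavior I would expect is simple precedence: the user's instance_type wins, and the JumpStart default is only a fallback. A minimal sketch of that resolution logic (the function name and default below are illustrative, not actual sagemaker SDK code):

```python
# Hypothetical sketch of the kwarg-resolution behavior I would expect.
DEFAULT_INSTANCE_TYPE = "ml.p4d.24xlarge"  # model-metadata default (assumed)

def resolve_instance_type(user_instance_type=None):
    """Return the user's instance type when given, else the model default."""
    return user_instance_type or DEFAULT_INSTANCE_TYPE

# The user-supplied value should take precedence:
assert resolve_instance_type("ml.g5.48xlarge") == "ml.g5.48xlarge"
# Only fall back to the JumpStart default when nothing is passed:
assert resolve_instance_type() == "ml.p4d.24xlarge"
```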

To reproduce
Just train or load a JumpStart model (in my case it was a Llama 3 70B one) and try to deploy it.
You probably have this default because it is a big model, but other instance types are also able to host it. If the user passes a different instance_type, you should at least let them try; if an error happens later, it is because that instance type does not support the model.

Logs
If applicable, add logs to help explain your problem.

ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateEndpoint operation: The account-level service limit 'ml.p4d.24xlarge for endpoint usage' is 2 Instances, with current utilization of 2 Instances and a request delta of 1 Instances. Please use AWS Service Quotas to request an increase for this quota. If AWS Service Quotas is not available, contact AWS support to request an increase for this quota.

But as shown in my code, I want to deploy on ml.g5.48xlarge.


I am not authorized to share the notebook publicly; please reach out if you need it.
