**What did you find confusing? Please describe.**
I was trying to figure out how to load the model when the inference endpoint starts, so that the first request is not delayed by model loading. I was only able to discover support for this by reading the code: https://github.com/aws/sagemaker-inference-toolkit/blob/master/src/sagemaker_inference/transformer.py#L200. This functionality is not mentioned anywhere in the README of this repo.

**Describe how documentation can be improved**
Document `default_pre_model_fn` and `default_model_warmup_fn` in the README. Thanks!
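
For reference, here is a minimal sketch of what such a README example could look like. It is based only on my reading of `transformer.py` linked above, not on any documented API; the call order and the exact signatures of `default_pre_model_fn(model_dir)` and `default_model_warmup_fn(model_dir, model)` are my assumptions, and the handler class and model file names are made up for illustration:

```python
# Hypothetical sketch, inferred from reading transformer.py -- the signatures
# and call order of the pre-model and warmup hooks are assumptions, not
# documented behavior.
import os
import pickle

from sagemaker_inference.default_inference_handler import DefaultInferenceHandler


class InferenceHandler(DefaultInferenceHandler):
    def default_pre_model_fn(self, model_dir):
        # Assumption: invoked once when the model server initializes, before
        # default_model_fn, so expensive setup does not delay the first request.
        print(f"Initializing inference runtime for artifacts in {model_dir}")

    def default_model_fn(self, model_dir):
        # Standard model-loading hook (already part of the documented handler
        # interface); "model.pkl" is just an example artifact name.
        with open(os.path.join(model_dir, "model.pkl"), "rb") as f:
            return pickle.load(f)

    def default_model_warmup_fn(self, model_dir, model):
        # Assumption: invoked once with the loaded model, e.g. to run a dummy
        # prediction so the first real request hits a warm model.
        model.predict([[0.0] * 4])
```

If the maintainers can confirm the actual signatures and when each hook is called, an example along these lines in the README would have answered my question immediately.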