
Commit 079c20c

Initial docs pass (#2192)
1 parent 0cad67b · commit 079c20c


50 files changed: +457 −3186 lines

dev/export_images.sh

Lines changed: 3 additions & 16 deletions
@@ -40,23 +40,10 @@ for image in "${all_images[@]}"; do
 done
 echo
 
-cuda=("10.0" "10.1" "10.1" "10.2" "10.2" "11.0" "11.1")
-cudnn=("7" "7" "8" "7" "8" "8" "8")
-
 # pull the images from source registry and push them to ECR
 for image in "${all_images[@]}"; do
-    # copy the different cuda/cudnn variations of the python handler image
-    if [ "$image" = "python-handler-gpu" ]; then
-        for i in "${!cuda[@]}"; do
-            full_image="$image:$cortex_version-cuda${cuda[$i]}-cudnn${cudnn[$i]}"
-            echo "copying $full_image from $source_registry to $destination_registry"
-            skopeo copy --src-no-creds "docker://$source_registry/$full_image" "docker://$destination_registry/$full_image"
-            echo
-        done
-    else
-        echo "copying $image:$cortex_version from $source_registry to $destination_registry"
-        skopeo copy --src-no-creds "docker://$source_registry/$image:$cortex_version" "docker://$destination_registry/$image:$cortex_version"
-        echo
-    fi
+    echo "copying $image:$cortex_version from $source_registry to $destination_registry"
+    skopeo copy --src-no-creds "docker://$source_registry/$image:$cortex_version" "docker://$destination_registry/$image:$cortex_version"
+    echo
 done
 echo "done ✓"

dev/python_version_test.sh

Lines changed: 0 additions & 47 deletions
This file was deleted.

docs/clusters/advanced/self-hosted-images.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 # Self-hosted Docker images
 
-Self-hosted Docker images can be useful for reducing the ingress costs, for accelerating image pulls, or for eliminating the dependency on Cortex's public container registry.
+Self-hosting the Cortex cluster's internal Docker images can be useful for reducing the ingress costs, for accelerating image pulls, or for eliminating the dependency on Cortex's public container registry.
 
 In this guide, we'll use [ECR](https://aws.amazon.com/ecr/) as the destination container registry. When an ECR repository resides in the same region as your Cortex cluster, there are no costs incurred when pulling images.
 
@@ -33,7 +33,7 @@ Feel free to modify the script if you would like to export the images to a diffe
 ./cortex/dev/export_images.sh <AWS_REGION> <AWS_ACCOUNT_ID>
 ```
 
-You can now configure Cortex to use your images when creating a cluster (see [here](../management/create.md) for how to specify cluster images) and/or when deploying APIs (see the configuration docs corresponding to your API type for how to specify API images).
+You can now configure Cortex to use your images when creating a cluster (see [here](../management/create.md) for instructions).
 
 ## Cleanup
 

docs/clusters/instances/multi.md

Lines changed: 26 additions & 0 deletions
@@ -20,11 +20,15 @@ Cortex can be configured to provision different instance types to improve worklo
 node_groups:
   - name: cpu-spot
     instance_type: m5.large
+    min_instances: 0
+    max_instances: 5
     spot: true
     spot_config:
       instance_distribution: [m5a.large, m5d.large, m5n.large, m5ad.large, m5dn.large, m4.large, t3.large, t3a.large, t2.large]
   - name: cpu-on-demand
     instance_type: m5.large
+    min_instances: 0
+    max_instances: 5
 ```
 
 ### On-demand cluster supporting CPU, GPU, and Inferentia
@@ -35,10 +39,16 @@ node_groups:
 node_groups:
   - name: cpu
     instance_type: m5.large
+    min_instances: 0
+    max_instances: 5
   - name: gpu
     instance_type: g4dn.xlarge
+    min_instances: 0
+    max_instances: 5
   - name: inf
     instance_type: inf.xlarge
+    min_instances: 0
+    max_instances: 5
 ```
 
 ### Spot cluster supporting CPU and GPU (with on-demand backup)
@@ -49,16 +59,24 @@ node_groups:
 node_groups:
   - name: cpu-spot
     instance_type: m5.large
+    min_instances: 0
+    max_instances: 5
     spot: true
     spot_config:
       instance_distribution: [m5a.large, m5d.large, m5n.large, m5ad.large, m5dn.large, m4.large, t3.large, t3a.large, t2.large]
   - name: cpu-on-demand
     instance_type: m5.large
+    min_instances: 0
+    max_instances: 5
   - name: gpu-spot
     instance_type: g4dn.xlarge
+    min_instances: 0
+    max_instances: 5
     spot: true
   - name: gpu-on-demand
     instance_type: g4dn.xlarge
+    min_instances: 0
+    max_instances: 5
 ```
 
 ### CPU spot cluster with multiple instance types and on-demand backup
@@ -69,13 +87,21 @@ node_groups:
 node_groups:
   - name: cpu-1
     instance_type: t3.medium
+    min_instances: 0
+    max_instances: 5
     spot: true
   - name: cpu-2
     instance_type: m5.2xlarge
+    min_instances: 0
+    max_instances: 5
     spot: true
   - name: cpu-3
     instance_type: m5.8xlarge
+    min_instances: 0
+    max_instances: 5
     spot: true
   - name: cpu-4
     instance_type: m5.24xlarge
+    min_instances: 0
+    max_instances: 5
 ```

docs/clusters/instances/spot.md

Lines changed: 2 additions & 0 deletions
@@ -43,6 +43,8 @@ There is a spot instance limit associated with your AWS account for each instanc
 node_groups:
   - name: cpu-spot
     instance_type: m5.large
+    min_instances: 0
+    max_instances: 5
     spot: true
     spot_config:
       instance_distribution: [m5a.large, m5d.large, m5n.large, m5ad.large, m5dn.large, m4.large, t3.large, t3a.large, t2.large]

docs/clusters/management/create.md

Lines changed: 0 additions & 1 deletion
@@ -104,7 +104,6 @@ image_async_gateway: quay.io/cortexlabs/async-gateway:master
 image_cluster_autoscaler: quay.io/cortexlabs/cluster-autoscaler:master
 image_metrics_server: quay.io/cortexlabs/metrics-server:master
 image_inferentia: quay.io/cortexlabs/inferentia:master
-image_neuron_rtd: quay.io/cortexlabs/neuron-rtd:master
 image_nvidia: quay.io/cortexlabs/nvidia:master
 image_fluent_bit: quay.io/cortexlabs/fluent-bit:master
 image_istio_proxy: quay.io/cortexlabs/istio-proxy:master

docs/clusters/management/update.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ cortex cluster up cluster.yaml
 In production environments, you can upgrade your cluster without downtime if you have a backend service or DNS in front of your Cortex cluster:
 
 1. Spin up a new cluster. For example: `cortex cluster up new-cluster.yaml --configure-env cortex2` (this will create a CLI environment named `cortex2` for accessing the new cluster).
-1. Re-deploy your APIs in your new cluster. For example, if the name of your CLI environment for your existing cluster is `cortex`, you can use `cortex get --env cortex` to list all running APIs in your cluster, and re-deploy them in the new cluster by changing directories to each API's project folder and running `cortex deploy --env cortex2`. Alternatively, you can run `cortex cluster export --name <previous_cluster_name> --region <region>` to export all of your API specifications, change directories the folder that was exported, and run `cortex deploy --env cortex2 <file_name>` for each API that you want to deploy in the new cluster.
+1. Re-deploy your APIs in your new cluster. For example, if the name of your CLI environment for your existing cluster is `cortex`, you can use `cortex get --env cortex` to list all running APIs in your cluster, and re-deploy them in the new cluster by running `cortex deploy --env cortex2` for each API. Alternatively, you can run `cortex cluster export --name <previous_cluster_name> --region <region>` to export the API specifications for all of your running APIs, change directories the folder that was exported, and run `cortex deploy --env cortex2 <file_name>` for each API that you want to deploy in the new cluster.
 1. Route requests to your new cluster.
     * If you are using a custom domain: update the A record in your Route 53 hosted zone to point to your new cluster's API load balancer.
     * If you have a backend service which makes requests to Cortex: update your backend service to make requests to the new cluster's endpoints.
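
For reference, the upgrade flow described above reduces to a short command sequence. A sketch, using the same placeholder names as the steps above:

```bash
# spin up the new cluster and create a CLI environment named "cortex2" for it
cortex cluster up new-cluster.yaml --configure-env cortex2

# list the APIs running in the existing cluster (CLI environment "cortex")
cortex get --env cortex

# optionally export the API specifications from the existing cluster
cortex cluster export --name <previous_cluster_name> --region <region>

# re-deploy each API (or exported spec) into the new cluster
cortex deploy --env cortex2 <file_name>

# finally, point your DNS record or backend service at the new cluster's endpoints
```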

docs/clusters/networking/custom-domain.md

Lines changed: 3 additions & 7 deletions
@@ -115,13 +115,9 @@ You could run into connectivity issues if you make a request to your API without
 
 To test connectivity, try the following steps:
 
-1. Deploy any api (e.g. examples/pytorch/iris-classifier).
-1. Make a GET request to the your api (e.g. `curl https://api.cortexlabs.dev/iris-classifier` or paste the url into your browser).
-1. If you run into an error such as `curl: (6) Could not resolve host: api.cortexlabs.dev` wait a few minutes and make the GET request from another device that hasn't made a request to that url in a while. A successful request looks like this:
-
-    ```text
-    {"message":"make a request by sending a POST to this endpoint with a json payload",...}
-    ```
+1. Deploy an api.
+1. Make a request to the your api (e.g. `curl https://api.cortexlabs.dev/my-api` or paste the url into your browser if your API supports GET requests).
+1. If you run into an error such as `curl: (6) Could not resolve host: api.cortexlabs.dev` wait a few minutes and make the request from another device that hasn't made a request to that url in a while.
 
 ## Cleanup
 
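
For reference, the trimmed connectivity test amounts to something like the following (a sketch; `my-api` and `api.cortexlabs.dev` are the example names used above, and the deploy step assumes an API spec in `cortex.yaml`):

```bash
# deploy any API to the cluster
cortex deploy cortex.yaml

# request the API through the custom domain
curl https://api.cortexlabs.dev/my-api

# if DNS hasn't propagated yet, you may see:
#   curl: (6) Could not resolve host: api.cortexlabs.dev
# wait a few minutes and retry from a device that hasn't requested this url recently
```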

docs/clusters/networking/https.md

Lines changed: 4 additions & 4 deletions
@@ -56,13 +56,13 @@ Copy your "Invoke URL"
 You may now use the "Invoke URL" in place of your API load balancer endpoint in your client. For example, this curl request:
 
 ```bash
-curl http://a9eaf69fd125947abb1065f62de59047-81cdebc0275f7d96.elb.us-west-2.amazonaws.com/iris-classifier -X POST -H "Content-Type: application/json" -d @sample.json
+curl http://a9eaf69fd125947abb1065f62de59047-81cdebc0275f7d96.elb.us-west-2.amazonaws.com/my-api -X POST -H "Content-Type: application/json" -d @sample.json
 ```
 
 Would become:
 
 ```bash
-curl https://31qjv48rs6.execute-api.us-west-2.amazonaws.com/dev/iris-classifier -X POST -H "Content-Type: application/json" -d @sample.json
+curl https://31qjv48rs6.execute-api.us-west-2.amazonaws.com/dev/my-api -X POST -H "Content-Type: application/json" -d @sample.json
 ```
 
 ### Cleanup
@@ -134,13 +134,13 @@ Copy your "Invoke URL"
 You may now use the "Invoke URL" in place of your API load balancer endpoint in your client. For example, this curl request:
 
 ```bash
-curl http://a5044e34a352d44b0945adcd455c7fa3-32fa161d3e5bcbf9.elb.us-west-2.amazonaws.com/iris-classifier -X POST -H "Content-Type: application/json" -d @sample.json
+curl http://a5044e34a352d44b0945adcd455c7fa3-32fa161d3e5bcbf9.elb.us-west-2.amazonaws.com/my-api -X POST -H "Content-Type: application/json" -d @sample.json
 ```
 
 Would become:
 
 ```bash
-curl https://lrivodooqh.execute-api.us-west-2.amazonaws.com/dev/iris-classifier -X POST -H "Content-Type: application/json" -d @sample.json
+curl https://lrivodooqh.execute-api.us-west-2.amazonaws.com/dev/my-api -X POST -H "Content-Type: application/json" -d @sample.json
 ```
 
 ### Cleanup

docs/clusters/observability/logging.md

Lines changed: 0 additions & 12 deletions
@@ -64,15 +64,3 @@ fields @timestamp, message
 | sort @timestamp asc
 | limit 1000
 ```
-
-## Structured logging
-
-You can use Cortex's logger in your Python code to log in JSON, which will enrich your logs with Cortex's metadata, and
-enable you to add custom metadata to the logs.
-
-See the structured logging docs for each API kind:
-
-- [RealtimeAPI](../../workloads/realtime/handler.md#structured-logging)
-- [AsyncAPI](../../workloads/async/handler.md#structured-logging)
-- [BatchAPI](../../workloads/batch/handler.md#structured-logging)
-- [TaskAPI](../../workloads/task/definitions.md#structured-logging)

docs/clusters/observability/metrics.md

Lines changed: 12 additions & 12 deletions
@@ -96,23 +96,23 @@ Currently, we only support 3 different metric types that will be converted to it
(a whitespace-only change; the visible text of each line is identical before and after)
 
 ### Pushing metrics
 
-- Counter
+- Counter
 
-  ```python
-  metrics.increment('my_counter', value=1, tags={"tag": "tag_name"})
-  ```
+  ```python
+  metrics.increment('my_counter', value=1, tags={"tag": "tag_name"})
+  ```
 
-- Gauge
+- Gauge
 
-  ```python
-  metrics.gauge('active_connections', value=1001, tags={"tag": "tag_name"})
-  ```
+  ```python
+  metrics.gauge('active_connections', value=1001, tags={"tag": "tag_name"})
+  ```
 
-- Histogram
+- Histogram
 
-  ```python
-  metrics.histogram('inference_time_milliseconds', 120, tags={"tag": "tag_name"})
-  ```
+  ```python
+  metrics.histogram('inference_time_milliseconds', 120, tags={"tag": "tag_name"})
+  ```
 
 ### Metrics client class reference
 

docs/start.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ cortex cluster up cluster.yaml
 cortex deploy apis.yaml
 ```
 
-* [RealtimeAPI](workloads/realtime/example.md) - create HTTP/gRPC APIs that respond to requests in real-time.
+* [RealtimeAPI](workloads/realtime/example.md) - create APIs that respond to requests in real-time.
 * [AsyncAPI](workloads/async/example.md) - create APIs that respond to requests asynchronously.
 * [BatchAPI](workloads/batch/example.md) - create APIs that run distributed batch jobs.
 * [TaskAPI](workloads/task/example.md) - create APIs that run jobs on-demand.

docs/summary.md

Lines changed: 4 additions & 24 deletions
@@ -29,52 +29,32 @@
 
 ## Workloads
 
-* Realtime APIs
+* [Realtime APIs](workloads/realtime/realtime-apis.md)
   * [Example](workloads/realtime/example.md)
-  * [Handler](workloads/realtime/handler.md)
   * [Configuration](workloads/realtime/configuration.md)
-  * [Parallelism](workloads/realtime/parallelism.md)
   * [Autoscaling](workloads/realtime/autoscaling.md)
-  * [Models](workloads/realtime/models.md)
-  * Multi-model
-    * [Example](workloads/realtime/multi-model/example.md)
-    * [Configuration](workloads/realtime/multi-model/configuration.md)
-    * [Caching](workloads/realtime/multi-model/caching.md)
-  * [Server-side batching](workloads/realtime/server-side-batching.md)
+  * [Traffic Splitter](workloads/realtime/traffic-splitter.md)
   * [Metrics](workloads/realtime/metrics.md)
   * [Statuses](workloads/realtime/statuses.md)
-  * Traffic Splitter
-    * [Example](workloads/realtime/traffic-splitter/example.md)
-    * [Configuration](workloads/realtime/traffic-splitter/configuration.md)
   * [Troubleshooting](workloads/realtime/troubleshooting.md)
 * [Async APIs](workloads/async/async-apis.md)
   * [Example](workloads/async/example.md)
-  * [Handler](workloads/async/handler.md)
   * [Configuration](workloads/async/configuration.md)
-  * [TensorFlow Models](workloads/async/models.md)
   * [Metrics](workloads/async/metrics.md)
   * [Statuses](workloads/async/statuses.md)
   * [Webhooks](workloads/async/webhooks.md)
-* Batch APIs
+* [Batch APIs](workloads/batch/batch-apis.md)
   * [Example](workloads/batch/example.md)
-  * [Handler](workloads/batch/handler.md)
   * [Configuration](workloads/batch/configuration.md)
   * [Jobs](workloads/batch/jobs.md)
-  * [TensorFlow Models](workloads/batch/models.md)
   * [Metrics](workloads/batch/metrics.md)
   * [Statuses](workloads/batch/statuses.md)
-* Task APIs
+* [Task APIs](workloads/task/task-apis.md)
   * [Example](workloads/task/example.md)
-  * [Definition](workloads/task/definitions.md)
   * [Configuration](workloads/task/configuration.md)
   * [Jobs](workloads/task/jobs.md)
   * [Metrics](workloads/task/metrics.md)
   * [Statuses](workloads/task/statuses.md)
-* Dependencies
-  * [Example](workloads/dependencies/example.md)
-  * [Python packages](workloads/dependencies/python-packages.md)
-  * [System packages](workloads/dependencies/system-packages.md)
-  * [Custom images](workloads/dependencies/images.md)
 
 ## Clients
 

docs/workloads/async/async-apis.md

Lines changed: 1 addition & 5 deletions
@@ -1,4 +1,4 @@
-# AsyncAPI
+# Async APIs
 
 The AsyncAPI kind is designed for asynchronous workloads, in which the user submits a request to start the processing
 and retrieves the result later, either by polling or through a webhook.
@@ -14,7 +14,3 @@ workload status and results. Cortex fully manages the Async Gateway and the queu
 
 AsyncAPI is a good fit for users who want to submit longer workloads (such as video, audio
 or document processing), and do not need the result immediately or synchronously.
-
-{% hint style="info" %}
-AsyncAPI is still in a beta state.
-{% endhint %}
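
As a rough illustration of the submit-then-poll workflow described above (a sketch only; the request/response shapes are assumptions, so check the AsyncAPI example docs for the exact interface):

```bash
# submit a workload to an AsyncAPI endpoint; the response is assumed to contain a request id
curl -X POST "https://<load_balancer_endpoint>/<async_api_name>" \
  -H "Content-Type: application/json" -d @sample.json

# poll for the status and result later using the returned id (assumed URL shape)
curl "https://<load_balancer_endpoint>/<async_api_name>/<request_id>"
```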
