Commit d4d6946

update docs for next release (#73)
1 parent dc27415 commit d4d6946

3 files changed

+105
-33
lines changed

docs/advanced-monitoring.rst

Lines changed: 80 additions & 2 deletions
@@ -23,17 +23,66 @@ in your ``values.yaml``:
2323

2424
.. code-block:: yaml
2525
26-
tracing_sampling_rate: 0.01
26+
envoy:
27+
tracing_sampling_rate: 0.01
2728
2829
opentelemetry-collector:
2930
enabled: true
31+
config:
32+
exporters:
33+
otlp:
34+
endpoint: http://supersonic-tempo:4317
35+
otlphttp:
36+
endpoint: http://supersonic-tempo:4318
37+
prometheusremotewrite:
38+
endpoint: http://supersonic-prometheus-server:9090/api/v1/write
3039
3140
tempo:
3241
enabled: true
42+
tempo:
43+
metricsGenerator:
44+
enabled: true
45+
remoteWriteUrl: http://supersonic-prometheus-server:9090/api/v1/write
46+
47+
.. note::
48+
49+
In the example above, endpoints and remote write URLs are configured to point to
50+
the Prometheus server and Grafana Tempo services, which will most likely
51+
have names like ``<release-name>-prometheus-server`` and ``<release-name>-tempo``.
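
The naming pattern in the note above can be sketched as a small helper. This is only an illustration: the function name ``service_endpoints`` is hypothetical, and it assumes the services follow the ``<release-name>-tempo`` and ``<release-name>-prometheus-server`` pattern and use the ports shown in the example (4317, 4318, 9090). Check the actual service names in your cluster before relying on it.

```python
def service_endpoints(release_name: str) -> dict:
    """Build in-cluster URLs from a Helm release name.

    Assumes the naming pattern from the note above; real service
    names may differ if the subcharts are configured otherwise.
    """
    tempo = f"{release_name}-tempo"
    prometheus = f"{release_name}-prometheus-server"
    return {
        # OTLP over gRPC and HTTP, as in the collector exporters above
        "otlp": f"http://{tempo}:4317",
        "otlphttp": f"http://{tempo}:4318",
        # Prometheus remote-write endpoint
        "prometheusremotewrite": f"http://{prometheus}:9090/api/v1/write",
    }


# Print the endpoints for a release called "supersonic"
for name, url in service_endpoints("supersonic").items():
    print(f"{name}: {url}")
```

With a release named ``supersonic``, the helper reproduces the endpoints used in the example values.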
3352

3453
The ``tracing_sampling_rate`` parameter controls how frequently requests are
3554
traced. A value of ``0.01`` means that one in 100 requests will be traced.
3655

56+
Additionally, you will need to enable tracing in Triton Inference Server, which is
57+
done by passing additional flags to the ``tritonserver`` command. The following example
58+
shows how tracing is configured for a CMS SuperSONIC instance:
59+
60+
.. code-block:: bash
61+
62+
/opt/tritonserver/bin/tritonserver \
63+
--model-repository=/cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_1_0_pre7/external/el9_amd64_gcc12/data/RecoBTag/Combined/data/models/ \
64+
--model-repository=/cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_1_0_pre7/external/el9_amd64_gcc12/data/RecoTauTag/TrainingFiles/data/DeepTauIdSONIC/ \
65+
--model-repository=/cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_1_0_pre7/external/el9_amd64_gcc12/data/RecoMET/METPUSubtraction/data/models/ \
66+
--model-repository=/cvmfs/cms.cern.ch/el9_amd64_gcc12/cms/cmssw/CMSSW_14_1_0_pre7/external/el9_amd64_gcc12/data/RecoEgamma/EgammaPhotonProducers/data/models/ \
67+
--trace-config mode=opentelemetry \
68+
--trace-config=opentelemetry,resource=pod_name=$(hostname) \
69+
--trace-config opentelemetry,url=supersonic-opentelemetry-collector:4318/v1/traces \
70+
--trace-config rate=100 \
71+
--trace-config level=TIMESTAMPS \
72+
--trace-config count=-1 \
73+
--allow-gpu-metrics=true \
74+
--log-verbose=0 \
75+
--strict-model-config=false \
76+
--exit-timeout-secs=60
77+
78+
.. note::
79+
80+
In the example above, the url should point to the OpenTelemetry Collector service,
81+
which will most likely be named ``<release-name>-opentelemetry-collector``.
82+
83+
For tracing in Triton, the rate is the inverse of the ``tracing_sampling_rate``
84+
parameter in the Envoy Proxy configuration: rate=100 means 1% of requests will be traced.
85+
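
The inverse relationship described in the note above can be sketched as follows. The helper name ``triton_trace_rate`` is hypothetical; it assumes the Envoy value is a fraction in the range (0, 1] and that the Triton ``rate`` setting means "trace one request out of every N", as the note describes.

```python
def triton_trace_rate(envoy_sampling_fraction: float) -> int:
    """Convert an Envoy sampling fraction into a Triton trace rate.

    Assumes rate=N means that one in every N requests is traced,
    so the two settings are inverses of each other.
    """
    if not 0 < envoy_sampling_fraction <= 1:
        raise ValueError("sampling fraction must be in the range (0, 1]")
    # Round to the nearest integer, since the rate is a whole number
    return round(1 / envoy_sampling_fraction)


# tracing_sampling_rate: 0.01 corresponds to --trace-config rate=100
print(triton_trace_rate(0.01))
```

Keeping the two values consistent means the proxy and the server sample roughly the same share of requests.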
3786
.. warning::
3887

3988
Triton Inference Server supports OpenTelemetry tracing only in versions 24.x or later.
@@ -43,7 +92,36 @@ Displaying Tracing Data in Grafana
4392

4493
If Grafana is enabled in your ``values.yaml``, you can display the tracing data
4594
in the Grafana dashboard. In order to achieve this, Grafana needs to have a
46-
Tempo datasource configured.
95+
Tempo datasource configured:
96+
97+
.. code-block:: yaml
98+
99+
grafana:
100+
enabled: true
101+
datasources:
102+
datasources.yaml:
103+
datasources:
104+
- name: prometheus
105+
type: prometheus
106+
access: proxy
107+
isDefault: true
108+
url: http://supersonic-prometheus-server:9090
109+
jsonData:
110+
timeInterval: "5s"
111+
tlsSkipVerify: true
112+
- name: tempo
113+
type: tempo
114+
url: http://supersonic-tempo:3100
115+
access: proxy
116+
isDefault: false
117+
basicAuth: false
118+
jsonData:
119+
timeInterval: "5s"
120+
tlsSkipVerify: true
121+
serviceMap:
122+
datasourceUid: "prometheus"
123+
nodeGraph:
124+
enabled: true
47125
48126
If OpenTelemetry Collector and Tempo are enabled, the default Grafana dashboard
49127
will include an interactive server map, where you can study tracing data in detail

docs/configuration-guide.rst

Lines changed: 21 additions & 22 deletions
@@ -107,22 +107,9 @@ Triton version must be specified in the ``triton.image`` parameter in the values
107107
cpu: 2
108108
memory: 16G
109109
110-
- In addition, you can use ``triton.affinity`` to steer Triton pods to nodes with specific GPU models:
111-
112-
.. code-block:: yaml
113-
114-
affinity:
115-
nodeAffinity:
116-
requiredDuringSchedulingIgnoredDuringExecution:
117-
nodeSelectorTerms:
118-
- matchExpressions:
119-
- key: nvidia.com/gpu.product
120-
operator: In
121-
values:
122-
- NVIDIA-A10
123-
- NVIDIA-A40
124-
- NVIDIA-L40
125-
- NVIDIA-L4
110+
- In addition, you can use ``triton.nodeSelector``, ``triton.tolerations``,
111+
``triton.annotations``, and ``triton.affinity`` to steer Triton pods to specific nodes.
112+
This is particularly useful for co-locating Triton pods with Envoy proxy to reduce latency.
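
As a rough illustration of the parameters named above, the sketch below builds a values fragment that sets ``triton.affinity`` and, optionally, ``triton.nodeSelector``. The function name is hypothetical, the affinity structure follows the standard Kubernetes node-affinity schema, and the ``nvidia.com/gpu.product`` label is taken from the example this change removes. The output is JSON, which is also valid YAML.

```python
import json


def triton_scheduling_values(gpu_products, node_selector=None):
    """Build a values fragment that steers Triton pods to specific nodes.

    Illustrative only: label keys and GPU product names depend on
    how the nodes in your cluster are labelled.
    """
    values = {
        "triton": {
            "affinity": {
                "nodeAffinity": {
                    "requiredDuringSchedulingIgnoredDuringExecution": {
                        "nodeSelectorTerms": [
                            {
                                "matchExpressions": [
                                    {
                                        "key": "nvidia.com/gpu.product",
                                        "operator": "In",
                                        "values": list(gpu_products),
                                    }
                                ]
                            }
                        ]
                    }
                }
            }
        }
    }
    # nodeSelector is optional; add it only when labels are given
    if node_selector:
        values["triton"]["nodeSelector"] = dict(node_selector)
    return values


print(json.dumps(triton_scheduling_values(["NVIDIA-A10", "NVIDIA-L4"]), indent=2))
```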
126113

127114

128115
4. Configure Envoy Proxy
@@ -242,9 +229,12 @@ Prometheus is needed to scrape metrics for monitoring, as well as for the rate l
242229
server:
243230
ingress:
244231
enabled: true
245-
hostName: "<prometheus_url>"
246232
ingressClassName: "<ingress_class>"
247-
annotations: {}
233+
hosts:
234+
- "<prometheus_url>"
235+
tls:
236+
- hosts:
237+
- "<prometheus_url>"
248238
249239
The parameters you will most likely need to configure in your values file are related to
250240
Ingress for web access to Prometheus UI.
@@ -297,7 +287,13 @@ Grafana is used to visualize metrics collected by Prometheus.
297287
We provide a pre-configured Grafana dashboard which includes many useful metrics,
298288
including latency breakdown, GPU utilization, and more.
299289

300-
Grafana is installed as a subchart with most of the default values pre-configured.
290+
If you have a Grafana instance already installed, you can deploy SuperSONIC dashboards
291+
by copying one of the JSON files from the
292+
`SuperSONIC repository <https://github.com/fastmachinelearning/SuperSONIC/tree/main/helm/supersonic/dashboards>`_.
293+
294+
If you don't have a Grafana instance already installed, you can deploy one as a subchart of SuperSONIC,
295+
in which case the dashboard will be automatically deployed.
296+
301297
You can further customize the Grafana installation by passing parameters from
302298
official Grafana `values.yaml <https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml>`_ file
303299
under the ``grafana`` section of the SuperSONIC values file:
@@ -308,9 +304,12 @@ under the ``grafana`` section of the SuperSONIC values file:
308304
enabled: true
309305
ingress:
310306
enabled: true
311-
hostName: "<grafana_url>"
312307
ingressClassName: "<ingress_class>"
313-
annotations: {}
308+
hosts:
309+
- "<grafana_url>"
310+
tls:
311+
- hosts:
312+
- "<grafana_url>"
314313
315314
The values you will most likely need to configure in your values file are related to
316315
Grafana Ingress for web access, and datasources to connect to Prometheus,
@@ -320,7 +319,7 @@ Grafana Ingress for web access, and datasources to connect to Prometheus,
320319
:height: 200
321320
:alt: SuperSONIC Grafana Dashboard
322321

323-
10. (Optional) Enable KEDA Autoscaler
322+
10. Enable KEDA Autoscaler
324323
==========================================
325324

326325
Autoscaling is implemented via `KEDA (Kubernetes Event-Driven Autoscaler) <https://keda.sh/>`_ and

docs/getting-started.rst

Lines changed: 4 additions & 9 deletions
@@ -33,17 +33,12 @@ Installation
3333
helm install <release-name> fastml/supersonic -n <namespace> -f <your-values.yaml>
3434
3535
Use a unique meaningful lowercase value as <release-name>, for example
36-
``supersonic-cms-run3``.
37-
This value will be used as a prefix for all resources created by the chart,
38-
unless ``nameOverride`` is specified in the values file.
36+
``supersonic-cms``.
37+
This value will be used as a prefix for all resources created by the chart.
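
Because the release name becomes a prefix for resource names, it can help to check it before installing. The sketch below is a conservative check, not Helm's own validation: the function name is hypothetical, and it assumes lowercase alphanumerics and hyphens only, starting and ending with an alphanumeric character, with a maximum length of 53 characters.

```python
import re

# Conservative pattern: lowercase letters, digits and hyphens,
# beginning and ending with a letter or digit (an assumption).
_RELEASE_NAME = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")


def is_valid_release_name(name: str, max_len: int = 53) -> bool:
    """Return True if the name passes the conservative check above."""
    return len(name) <= max_len and bool(_RELEASE_NAME.match(name))


for candidate in ("supersonic-cms", "SuperSONIC", "-bad-name"):
    print(candidate, is_valid_release_name(candidate))
```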
3938

40-
Successfully executed ``helm install`` command will print a link to auto-generated Grafana dashboard
39+
A successful ``helm install`` command will print a link to a Grafana dashboard
4140
and other useful information.
42-
43-
.. figure:: img/grafana.png
44-
:align: center
45-
:height: 250
46-
:alt: Supersonic Grafana dashboard
41+
4742

4843
Uninstall SuperSONIC
4944
~~~~~~~~~~~~~~~~~~~~~~~~~~
