README.md
### Summary
Summaries and Histograms can both be used to monitor latencies (or other things like request sizes).

An overview of when to use Summaries and when to use Histograms can be found on [https://prometheus.io/docs/practices/histograms](https://prometheus.io/docs/practices/histograms).

The following example shows how to measure latencies and request sizes:

```java
class YourClass {

  static final Summary receivedBytes = Summary.build()
      .name("requests_size_bytes").help("Request size in bytes.").register();

  static final Summary requestLatency = Summary.build()
      .name("requests_latency_seconds").help("Request latency in seconds.").register();

  void processRequest(Request req) {
    Summary.Timer requestTimer = requestLatency.startTimer();
    try {
      // Your code here.
    } finally {
      receivedBytes.observe(req.size());
      requestTimer.observeDuration();
    }
  }
}
```
The `Summary` class provides different utility methods for observing values, like `observe(double)`, `startTimer(); timer.observeDuration()`, `time(Callable)`, etc.
By default, `Summary` metrics provide the `count` and the `sum`. For example, if you measure latencies of a REST service, the `count` will tell you how often the REST service was called, and the `sum` will tell you the total aggregated response time. You can calculate the average response time using a Prometheus query dividing `sum / count`.
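The relationship between `count`, `sum`, and the average can be sketched without the Prometheus client at all. The class below is a hypothetical illustration (not part of the library) that mimics the bookkeeping a `Summary` does:

```java
// Illustration only: mimics the count/sum bookkeeping a Summary performs.
// The average of all observations is sum / count.
public class CountSumSketch {

    private long count = 0;   // number of observations (e.g. REST calls)
    private double sum = 0.0; // total of all observed values (e.g. seconds)

    public void observe(double value) {
        count++;
        sum += value;
    }

    public long count() { return count; }

    public double sum() { return sum; }

    public double average() { return sum / count; }

    public static void main(String[] args) {
        CountSumSketch latency = new CountSumSketch();
        latency.observe(0.2);
        latency.observe(0.4);
        latency.observe(0.6);
        // average of the three observations, roughly 0.4
        System.out.println(latency.average());
    }
}
```

In Prometheus itself you would compute the same average with a query dividing the `_sum` series by the `_count` series.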
In addition to `count` and `sum`, you can configure a Summary to provide quantiles:

```java
Summary requestLatency = Summary.build()
    .name("requests_latency_seconds")
    .help("Request latency in seconds.")
    .quantile(0.5, 0.01)   // 0.5 quantile (median) with 0.01 allowed error
    .quantile(0.95, 0.005) // 0.95 quantile with 0.005 allowed error
    // ...
    .register();
```
As an example, a `0.95` quantile of `120ms` tells you that `95%` of the calls were faster than `120ms`, and `5%` of the calls were slower than `120ms`.
Tracking exact quantiles requires a large amount of memory, because all observations need to be stored in a sorted list. Therefore, we allow an error to significantly reduce memory usage.
In the example, the allowed error of `0.005` means that you will not get the exact `0.95` quantile, but anything between the `0.945` quantile and the `0.955` quantile.
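What this acceptance band means can be illustrated by computing exact quantiles over a sorted list. The sketch below is hypothetical and independent of the Prometheus client; it only shows that `.quantile(0.95, 0.005)` permits any answer between the exact `0.945` and `0.955` quantiles:

```java
import java.util.Arrays;

// Illustration only: exact quantiles from a sorted array, to show the
// acceptance band implied by .quantile(0.95, 0.005).
public class QuantileBand {

    // Nearest-rank quantile over a sorted array (one common definition).
    public static double exactQuantile(double[] sorted, double q) {
        int index = (int) Math.ceil(q * sorted.length) - 1;
        return sorted[Math.max(index, 0)];
    }

    public static void main(String[] args) {
        double[] latencies = new double[1000];
        for (int i = 0; i < latencies.length; i++) {
            latencies[i] = (i + 1) / 1000.0; // latencies 0.001s .. 1.000s
        }
        Arrays.sort(latencies);
        double lower = exactQuantile(latencies, 0.945);
        double upper = exactQuantile(latencies, 0.955);
        // A Summary configured with .quantile(0.95, 0.005) may report any
        // value v with lower <= v <= upper.
        System.out.println(lower + " .. " + upper);
    }
}
```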
Experiments show that the `Summary` typically needs to keep less than 100 samples to provide that precision, even if you have hundreds of millions of observations.
There are a few special cases:
* You can set an allowed error of `0`, but then the `Summary` will keep all observations in memory.
* You can track the minimum value with `.quantile(0, 0)`. This special case will not use additional memory even though the allowed error is `0`.
* You can track the maximum value with `.quantile(1, 0)`. This special case will not use additional memory even though the allowed error is `0`.
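The reason the min and max special cases are cheap can be sketched in a few lines (illustration only, not library code): a running minimum or maximum needs just one variable each, so no observations have to be kept in memory:

```java
// Illustration only: tracking the minimum and maximum needs a single
// variable each, which is why .quantile(0, 0) and .quantile(1, 0) are
// cheap even with an allowed error of 0.
public class MinMaxTracker {

    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    public void observe(double value) {
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    public double min() { return min; }

    public double max() { return max; }

    public static void main(String[] args) {
        MinMaxTracker t = new MinMaxTracker();
        for (double v : new double[] {0.3, 0.1, 0.9, 0.5}) {
            t.observe(v);
        }
        System.out.println(t.min() + " / " + t.max()); // prints 0.1 / 0.9
    }
}
```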
Typically, you don't want to have a `Summary` representing the entire runtime of the application, but you want to look at a reasonable time interval. `Summary` metrics implement a configurable sliding time window:
```java
Summary requestLatency = Summary.build()
    .name("requests_latency_seconds")
    .help("Request latency in seconds.")
    .maxAgeSeconds(10 * 60)
    .ageBuckets(5)
    // ...
    .register();
```
The default is a time window of 10 minutes and 5 age buckets, i.e. the time window is 10 minutes wide, and we slide it forward every 2 minutes.
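The sliding window can be sketched as a ring of age buckets. The model below is illustrative only (it is not the library's actual implementation): with `maxAgeSeconds(600)` and `ageBuckets(5)`, one bucket is dropped every `600 / 5 = 120` seconds, so observations age out of the window in steps rather than continuously:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustration only: a simplified model of the age-bucket mechanism
// behind the sliding time window of a Summary.
public class AgeBucketWindow {

    private final long rotationIntervalSeconds;
    private final Deque<List<Double>> buckets = new ArrayDeque<>();

    public AgeBucketWindow(long maxAgeSeconds, int ageBuckets) {
        this.rotationIntervalSeconds = maxAgeSeconds / ageBuckets;
        for (int i = 0; i < ageBuckets; i++) {
            buckets.addLast(new ArrayList<>());
        }
    }

    public long rotationIntervalSeconds() {
        return rotationIntervalSeconds;
    }

    // New observations always go into the newest bucket.
    public void observe(double value) {
        buckets.peekLast().add(value);
    }

    // Called once per rotation interval: the oldest bucket is discarded,
    // so old observations leave the window.
    public void rotate() {
        buckets.removeFirst();
        buckets.addLast(new ArrayList<>());
    }

    public int observationsInWindow() {
        int total = 0;
        for (List<Double> bucket : buckets) {
            total += bucket.size();
        }
        return total;
    }
}
```

With the defaults, an observation stays visible for at most 10 minutes and leaves the window after at most 5 rotations.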
### Histogram
Like Summaries, Histograms can be used to monitor latencies (or other things like request sizes).
An overview of when to use Summaries and when to use Histograms can be found on [https://prometheus.io/docs/practices/histograms](https://prometheus.io/docs/practices/histograms).
simpleclient/src/main/java/io/prometheus/client/Summary.java
import java.util.concurrent.TimeUnit;
/**
 * {@link Summary} metrics and {@link Histogram} metrics can both be used to monitor latencies (or other things like request sizes).
 * <p>
 * An overview of when to use Summaries and when to use Histograms can be found on <a href="https://prometheus.io/docs/practices/histograms">https://prometheus.io/docs/practices/histograms</a>.
 * <p>
 * The following example shows how to measure latencies and request sizes:
 *
 * <pre>
 * class YourClass {
 *   static final Summary receivedBytes = Summary.build()
 *       .name("requests_size_bytes").help("Request size in bytes.").register();
 *   static final Summary requestLatency = Summary.build()
 *       .name("requests_latency_seconds")
 *       .help("Request latency in seconds.")
 *       .quantile(0.5, 0.01)   // 0.5 quantile (median) with 0.01 allowed error
 *       .quantile(0.95, 0.005) // 0.95 quantile with 0.005 allowed error
 *       // ...
 *       .register();
 * }
 * </pre>
 *
 * As an example, a 0.95 quantile of 120ms tells you that 95% of the calls were faster than 120ms, and 5% of the calls were slower than 120ms.
 * <p>
 * Tracking exact quantiles requires a large amount of memory, because all observations need to be stored in a sorted list. Therefore, we allow an error to significantly reduce memory usage.
 * <p>
 * In the example, the allowed error of 0.005 means that you will not get the exact 0.95 quantile, but anything between the 0.945 quantile and the 0.955 quantile.
 * <p>
 * Experiments show that the {@link Summary} typically needs to keep less than 100 samples to provide that precision, even if you have hundreds of millions of observations.
 * <p>
 * There are a few special cases:
 *
* <ul>
 * <li>You can set an allowed error of 0, but then the {@link Summary} will keep all observations in memory.</li>
 * <li>You can track the minimum value with <tt>.quantile(0.0, 0.0)</tt>.
 *     This special case will not use additional memory even though the allowed error is 0.</li>
 * <li>You can track the maximum value with <tt>.quantile(1.0, 0.0)</tt>.
 *     This special case will not use additional memory even though the allowed error is 0.</li>
* </ul>
 *
 * Typically, you don't want to have a {@link Summary} representing the entire runtime of the application,
 * but you want to look at a reasonable time interval. {@link Summary} metrics implement a configurable sliding
 * time window:
 *
 * <pre>
 * Summary requestLatency = Summary.build()
 *     .name("requests_latency_seconds")
 *     .help("Request latency in seconds.")
 *     .maxAgeSeconds(10 * 60)
 *     .ageBuckets(5)
 *     // ...
 *     .register();
 * </pre>
 *
 * The default is a time window of 10 minutes and 5 age buckets, i.e. the time window is 10 minutes wide, and
 * we slide it forward every 2 minutes.