Commit 4933d05

Update metric_types.md for native histograms
Note that I used this opportunity to replace the term "client library" with "instrumentation library". I always thought that "client library" is confusing as it is not implementing a client in any way. (Technically, it implements a _server_, of which the Prometheus "server" is the client… 🤯) Even if we accept that "Prometheus client library" just means "a library to do something that has to do with Prometheus", the title "client library" still doesn't tell us what the library is actually for. (Note that the client_golang repository not only contains an instrumentation library, but also includes an _actual_ client library that helps you to implement clients that talk to the Prometheus HTTP API.)

Signed-off-by: beorn7 <beorn@grafana.com>
1 parent 469f1f0 commit 4933d05

2 files changed

+91
-38
lines changed

docs/concepts/metric_types.md

Lines changed: 88 additions & 35 deletions
@@ -3,11 +3,17 @@ title: Metric types
33
sort_rank: 2
44
---
55

6-
The Prometheus client libraries offer four core metric types. These are
7-
currently only differentiated in the client libraries (to enable APIs tailored
8-
to the usage of the specific types) and in the wire protocol. The Prometheus
9-
server does not yet make use of the type information and flattens all data into
10-
untyped time series. This may change in the future.
6+
The Prometheus instrumentation libraries offer four core metric types. With the
7+
exception of native histograms, these are currently only differentiated in the
8+
API of instrumentation libraries and in the exposition protocols.
9+
The Prometheus server does not yet make
10+
use of the type information and flattens all types except native histograms
11+
into untyped time series of floating point values. Native histograms, however,
12+
are ingested as time series of special composite histogram samples. In the
13+
future, Prometheus might handle other metric types as [composite
14+
types](/blog/2026/02/14/modernizing-prometheus-composite-samples/), too. There
15+
is also ongoing work to persist the type information of the simple float
16+
samples.
1117

1218
## Counter
1319

@@ -20,7 +26,7 @@ errors.
2026
Do not use a counter to expose a value that can decrease. For example, do not
2127
use a counter for the number of currently running processes; instead use a gauge.
2228

23-
Client library usage documentation for counters:
29+
Instrumentation library usage documentation for counters:
2430

2531
* [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Counter)
2632
* [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#counter)
@@ -38,7 +44,7 @@ Gauges are typically used for measured values like temperatures or current
3844
memory usage, but also "counts" that can go up and down, like the number of
3945
concurrent requests.
4046

41-
Client library usage documentation for gauges:
47+
Instrumentation library usage documentation for gauges:
4248

4349
* [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Gauge)
4450
* [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#gauge)
@@ -49,39 +55,86 @@ Client library usage documentation for gauges:
4955

5056
## Histogram
5157

52-
A _histogram_ samples observations (usually things like request durations or
53-
response sizes) and counts them in configurable buckets. It also provides a sum
54-
of all observed values.
55-
56-
A histogram with a base metric name of `<basename>` exposes multiple time series
57-
during a scrape:
58-
59-
* cumulative counters for the observation buckets, exposed as `<basename>_bucket{le="<upper inclusive bound>"}`
58+
A _histogram_ records observations (usually things like request durations or
59+
response sizes) by counting them in configurable buckets. It also provides a sum
60+
of all observed values. As such, a histogram is essentially a bucketed counter.
61+
However, a histogram can also represent the current state of a distribution, in
62+
which case it is called a _gauge histogram_. In contrast to the usual
63+
counter-like histograms, gauge histograms are rarely directly exposed by
64+
instrumented programs and are thus not (yet) usable in instrumentation
65+
libraries, but they are represented in newer versions of the protobuf
66+
exposition format and in [OpenMetrics](https://openmetrics.io/). They are also
67+
created regularly by PromQL expressions. For example, the outcome of applying
68+
the `rate` function to a counter histogram is a gauge histogram, in the same
69+
way as the outcome of applying the `rate` function to a counter is a gauge.
70+
71+
Histograms exist in two fundamentally different versions: the more recent
72+
_native histograms_ and the older _classic histograms_.
73+
74+
A native histogram is exposed and ingested as composite samples, where each
75+
sample represents the count and sum of observations together with a dynamic set
76+
of buckets.
77+
78+
A classic histogram, however, consists of multiple time series of simple float
79+
samples. A classic histogram with a base metric name of `<basename>` results in
80+
the following time series:
81+
82+
* cumulative counters for the observation buckets, exposed as
83+
`<basename>_bucket{le="<upper inclusive bound>"}`
6084
* the **total sum** of all observed values, exposed as `<basename>_sum`
61-
* the **count** of events that have been observed, exposed as `<basename>_count` (identical to `<basename>_bucket{le="+Inf"}` above)
62-
63-
Use the
64-
[`histogram_quantile()` function](/docs/prometheus/latest/querying/functions/#histogram_quantile)
65-
to calculate quantiles from histograms or even aggregations of histograms. A
66-
histogram is also suitable to calculate an
67-
[Apdex score](http://en.wikipedia.org/wiki/Apdex). When operating on buckets,
68-
remember that the histogram is
69-
[cumulative](https://en.wikipedia.org/wiki/Histogram#Cumulative_histogram). See
70-
[histograms and summaries](/docs/practices/histograms) for details of histogram
71-
usage and differences to [summaries](#summary).
72-
73-
NOTE: Beginning with Prometheus v2.40, there is experimental support for native
74-
histograms. A native histogram requires only one time series, which includes a
75-
dynamic number of buckets in addition to the sum and count of
76-
observations. Native histograms allow much higher resolution at a fraction of
77-
the cost. Detailed documentation will follow once native histograms are closer
78-
to becoming a stable feature.
85+
* the **count** of events that have been observed, exposed as
86+
`<basename>_count` (identical to `<basename>_bucket{le="+Inf"}` above)
87+
88+
Native histograms are generally much more efficient than classic histograms,
89+
allow much higher resolution, do not require explicit configuration of bucket
90+
boundaries during instrumentation, and provide atomicity when transferred over
91+
the network (e.g. via the Prometheus remote write protocol, where classic
92+
histograms suffer from possible partial transfer because their constituent time
93+
series are transferred independently). Their bucketing schema ensures that they
94+
are always aggregatable with each other, even if the resolution might have
95+
changed, while classic histograms with different bucket boundaries are not
96+
generally aggregatable. If the instrumentation library you are using supports
97+
native histograms (currently this is the case for Go and Java), you should
98+
probably [prefer native histograms over classic
99+
histograms](/docs/practices/histograms).
100+
101+
If you are stuck with classic histograms for whatever reason, there is a way to
102+
get at least some of the benefits of native histograms: You can configure
103+
Prometheus to ingest classic histograms into a special form of native
104+
histograms, called Native Histograms with Custom Bucket boundaries (NHCB).
105+
NHCBs are stored as the same composite samples as usual native histograms,
106+
providing increased efficiency and atomic network transfers, similar to regular
107+
native histograms. However, the buckets of NHCBs still have the same layout as
108+
in their classic counterparts, statically configured during instrumentation,
109+
with the same limited resolution and range and the same problems of
110+
aggregatability upon changing the bucket boundaries.
111+
112+
Use the [`histogram_quantile()`
113+
function](/docs/prometheus/latest/querying/functions/#histogram_quantile) to
114+
calculate quantiles from histograms or even aggregations of histograms. It
115+
works for both classic and native histograms, using a slightly different
116+
syntax. Histograms are also suitable to calculate an [Apdex
117+
score](http://en.wikipedia.org/wiki/Apdex).
118+
119+
You can operate directly on the buckets of a classic histogram, as they are
120+
represented as individual series (called `<basename>_bucket{le="<upper
121+
inclusive bound>"}` as described above). Remember, however, that these buckets
122+
are [cumulative](https://en.wikipedia.org/wiki/Histogram#Cumulative_histogram),
123+
i.e. every bucket counts all observations less than or equal to the upper
124+
boundary provided as a label. With native histograms, you can look at
125+
observations within given boundaries with the [`histogram_fraction()`
126+
function](/docs/prometheus/latest/querying/functions/#histogram_fraction) (to
127+
calculate fractions of observations) and the [trim operators]() (to filter for
128+
the desired band of observations).
129+
130+
See [histograms and summaries](/docs/practices/histograms) for details of
131+
histogram usage and differences to [summaries](#summary).
79132

80133
NOTE: Beginning with Prometheus v3.0, the values of the `le` label of classic
81134
histograms are normalized during ingestion to follow the format of
82135
[OpenMetrics Canonical Numbers](https://github.com/prometheus/OpenMetrics/blob/main/specification/OpenMetrics.md#considerations-canonical-numbers).
83136

84-
Client library usage documentation for histograms:
137+
Instrumentation library usage documentation for histograms:
85138

86139
* [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Histogram)
87140
* [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#histogram)
@@ -111,7 +164,7 @@ to [histograms](#histogram).
111164
NOTE: Beginning with Prometheus v3.0, the values of the `quantile` label are normalized during
112165
ingestion to follow the format of [OpenMetrics Canonical Numbers](https://github.com/prometheus/OpenMetrics/blob/main/specification/OpenMetrics.md#considerations-canonical-numbers).
113166

114-
Client library usage documentation for summaries:
167+
Instrumentation library usage documentation for summaries:
115168

116169
* [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Summary)
117170
* [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#summary)
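
As an illustration of the cumulative bucket layout that the changed text describes for classic histograms, here is a minimal Python sketch. It assumes a hypothetical scrape result in which each `le` bound maps to the value of the corresponding `<basename>_bucket` series; the function name and the sample numbers are made up for this example and are not part of any Prometheus library.

```python
import math


def per_bucket_counts(cumulative):
    """Convert cumulative bucket counts into per-bucket counts.

    `cumulative` maps each upper inclusive bound (the `le` label) to the
    number of observations less than or equal to that bound.
    """
    result = {}
    previous = 0
    for le in sorted(cumulative):
        count = cumulative[le]
        # Each cumulative bucket includes all lower buckets, so subtract them.
        result[le] = count - previous
        previous = count
    return result


# Hypothetical values of <basename>_bucket{le="..."} from a single scrape.
buckets = {0.1: 5, 0.5: 12, 1.0: 20, math.inf: 23}

counts = per_bucket_counts(buckets)
# The +Inf bucket is identical to <basename>_count.
total = buckets[math.inf]

print(counts)
print(total)
```

The per-bucket counts always add up to the value of the `+Inf` bucket, which is why `<basename>_count` carries no additional information beyond that bucket.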

docs/practices/histograms.md

Lines changed: 3 additions & 3 deletions
@@ -406,9 +406,9 @@ Classic histogram version:
406406
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) // GOOD.
407407

408408
Furthermore, should your SLO change and you now want to plot the 90th
409-
percentile, or you want to take into account the last 10 minutes
410-
instead of the last 5 minutes, you only have to adjust the expressions
411-
above and you do not need to reconfigure the clients.
409+
percentile, or you want to take into account the last 10 minutes instead of the
410+
last 5 minutes, you only have to adjust the expressions above and you do not
411+
need to reconfigure the instrumentation of the monitored programs.
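
To illustrate why a different percentile only requires a different query, the following Python sketch estimates a quantile from cumulative bucket counts. It is a simplified approximation, not the Prometheus implementation: it assumes linear interpolation within a bucket, a lowest bucket that starts at 0, and a `+Inf` bucket whose quantiles are reported as the highest finite bound. The function name and the bucket values are hypothetical.

```python
import math


def estimate_quantile(q, cumulative):
    """Estimate the q-quantile (0 < q <= 1) from cumulative buckets.

    `cumulative` maps each upper inclusive bound (`le`) to the number of
    observations less than or equal to that bound and must include +Inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in the interval (0, 1]")
    bounds = sorted(cumulative)
    total = cumulative[bounds[-1]]
    if total == 0:
        return math.nan
    rank = q * total
    lower, below = 0.0, 0  # assumed lower edge of the first bucket is 0
    for i, upper in enumerate(bounds):
        count = cumulative[upper]
        if count >= rank:
            if math.isinf(upper):
                # No upper edge to interpolate towards: fall back to the
                # highest finite bound, if there is one.
                return bounds[i - 1] if i > 0 else math.nan
            # Assume observations are spread evenly within the bucket.
            return lower + (upper - lower) * (rank - below) / (count - below)
        lower, below = upper, count
    return math.nan


# Hypothetical cumulative bucket counts (already rated and aggregated).
buckets = {0.1: 5, 0.5: 12, 1.0: 20, math.inf: 23}

median = estimate_quantile(0.5, buckets)
p95 = estimate_quantile(0.95, buckets)
print(median, p95)
```

The same bucket data serves every quantile, so switching from the 95th to the 90th percentile only changes the first argument. The coarse bucket layout also shows its limits here: the 95th percentile falls into the `+Inf` bucket and can only be reported as the highest finite bound.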
412412

413413
### Errors of quantile estimation
414414
