File: docs/specs/om/open_metrics_spec_2_0.md
Lines changed: 40 additions & 29 deletions
@@ -109,6 +109,7 @@ Strings MUST only consist of valid UTF-8 characters and MAY be zero length. NULL
109
109
110
110
Labels are key-value pairs consisting of strings.
111
111
112
+
// TODO: I think one underscore as old is better? Sounds like there wasn't particular reason for "two"
112
113
Label names beginning with two underscores are RESERVED and MUST NOT be used unless specified by this standard. Such Label names MAY be used in place of TYPE and UNIT metadata in cases where MetricFamilies' metadata might otherwise be conflicting, such as metric federation cases.
113
114
114
115
// MAYBE: Link to where we explain "UTF-8 metrics may reduce usability"
@@ -234,9 +235,9 @@ A StateSet is structured as a set of Metrics, one for each state, called a State
234
235
235
236
> NOTE: In OpenMetrics 1.0, Metrics are composed of MetricPoints (e.g. a Histogram metric has a MetricPoint representing each Bucket with a special "le" label), which is no longer the case in OpenMetrics 2.0. An OpenMetrics 1.0 StateSet Metric is equivalent to an OpenMetrics 2.0 StateSet MetricGroup, and an OpenMetrics 1.0 StateSet MetricPoint is equivalent to an OpenMetrics 2.0 StateSet Metric.
236
237
237
-
A StateSet MetricGroup contains one or more states and MUST contain one boolean per state. States have a name which is a String.
238
+
A StateSet MetricGroup contains one or more states and MUST contain one Metric with a boolean value per state. States have a name which is a String.
238
239
239
-
If encoded as a StateSet, ENUMs MUST have exactly one Sample which is `1` (true) within a MetricGroup.
240
+
If encoded as a StateSet, ENUMs MUST have exactly one Sample which is `1` (true) within a MetricGroup, for a single Timestamp.
240
241
241
242
This is suitable where the enum value changes over time, and the number of States isn't much more than a handful.
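A minimal sketch of the ENUM constraint above, assuming the states observed for one MetricGroup at a single Timestamp are available as a plain mapping from state name to boolean. The function name `is_valid_enum_stateset` and the mapping representation are illustrative assumptions, not part of the standard.

```python
def is_valid_enum_stateset(states: dict) -> bool:
    """Check the ENUM rule for one MetricGroup at one Timestamp.

    `states` maps each state name (a string) to its boolean value.
    This helper and its input shape are hypothetical, for illustration only.
    """
    if not states:
        # A StateSet MetricGroup contains one or more states.
        return False
    # An ENUM encoded as a StateSet has exactly one state that is true.
    return sum(1 for value in states.values() if value) == 1
```

For example, `{"starting": False, "running": True, "stopped": False}` passes the check, while a mapping with two true states does not.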
242
243
@@ -246,10 +247,10 @@ MetricFamilies of Type StateSets MUST have an empty Unit string.
246
247
247
248
Info metrics are used to expose textual information which SHOULD NOT change during process lifetime. Common examples are an application's version, revision control commit, and the version of a compiler.
248
249
250
+
// CONSISTENCY: Last pass of Name/name (definition)
249
251
The MetricFamily name for Info metrics MUST end in `_info`.
250
252
251
-
// TODO: adjust as per https://github.com/prometheus/docs/pull/2894/changes#r2940458234
252
-
Info MAY be used to encode ENUMs whose values do not change over time, such as the type of a network interface.
253
+
// Likely to kill or example: Info MAY be used to encode ENUMs whose values do not change over time, such as the type of a network interface.
253
254
254
255
MetricFamilies of Type Info MUST have an empty Unit string.
255
256
@@ -263,19 +264,21 @@ The Count value MUST be equal to the number of measurements taken by the Histogr
263
264
264
265
Float Count is allowed to make it possible to expose results of arithmetic operations on histograms, such as addition that may result in values beyond the range of integers.
265
266
266
-
The Sum value MUST be equal to the sum of all the measured event values. The Sum is only a counter semantically as long as there are no negative event values measured by the Histogram Sample.
267
+
The Sum value MUST be equal to the sum of all the measured event values. The Sum is only a counter semantically as long as there are no negative event values measured by the Histogram.
267
268
268
269
A Histogram MUST measure values that are not NaN in either [Classic Buckets](#classic-buckets) or [Native Buckets](#native-buckets) or both. Measuring NaN is different for Classic and Native Buckets, see in their respective sections.
269
270
271
+
// MAYBE: DRY? common pattern with count
270
272
Every Bucket MUST have well-defined boundaries and a value. The bucket value is called the bucket count colloquially. Boundaries of a Bucket MUST NOT be NaN. Bucket values are counters semantically. Bucket values SHOULD be integers. Bucket values MUST NOT be negative. Bucket values SHOULD NOT be +Inf, NaN.
271
273
272
274
Float bucket values are allowed to make it possible to expose results of arithmetic operations on histograms, such as addition that may result in values beyond the range of integers.
273
275
274
276
A Histogram SHOULD NOT include NaN measurements as including NaN in the Sum will make the Sum equal to NaN and mask the sum of the real measurements for the lifetime of the time series. If a Histogram includes NaN measurements, then NaN measurements MUST be counted in the Count and the Sum MUST be NaN.
275
277
276
-
If a Histogram includes +Inf or -Inf measurement, then +Inf or -Inf MUST be counted in Count and MUST be added to the Sum, potentially resulting in +Inf, -Inf or NaN in the Sum, the later for example in case of adding +Inf to -Inf. Note that in this case the Sum of finite measurements is masked until the next reset of the Histogram.
278
+
If a Histogram includes +Inf or -Inf measurement, then +Inf or -Inf MUST be counted in Count and MUST be added to the Sum, potentially resulting in +Inf, -Inf or NaN in the Sum, the latter for example in case of adding +Inf to -Inf. Note that in this case the Sum of finite measurements is masked until the next reset of the Histogram.
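The masking effect described above can be illustrated with ordinary IEEE 754 float arithmetic. The following is a sketch only; `count_and_sum` is a hypothetical helper, not an API defined by this standard.

```python
import math


def count_and_sum(observations):
    """Return (Count, Sum) for a list of measured values.

    Hypothetical helper: every measurement is counted, and every
    measurement (including +Inf, -Inf and NaN) is added to the Sum.
    """
    count = 0
    total = 0.0
    for value in observations:
        count += 1
        total += value
    return count, total


# +Inf is counted and dominates the Sum of the finite measurements.
print(count_and_sum([1.0, math.inf, 2.0]))
# Adding +Inf to -Inf yields NaN, which then masks all later additions.
print(count_and_sum([math.inf, -math.inf, 5.0]))
```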
277
279
278
-
A Histogram Sample SHOULD have a Timestamp value called Start Timestamp. This can help ingestors discern between new metrics and long-running ones it did not see before.
280
+
// TODO: Define Explicit Timestamp and Start Timestamp semantics (section).
281
+
A Histogram Sample SHOULD have a Start Timestamp. This can help ingestors discern between new metrics and long-running ones it did not see before.
279
282
280
283
If the Histogram Metric has Samples with Classic Buckets, the Histogram's Metric's LabelSet MUST NOT have a "le" label name, because in case the Samples are stored as classic histogram series with the `_bucket` suffix, then the "le" label in the Histogram will conflict with the "le" label generated from the bucket thresholds.
281
284
@@ -287,9 +290,10 @@ A Histogram Sample MAY have exemplars. The values of exemplars in a Histogram Sa
287
290
288
291
Every Classic Bucket MUST have a threshold. Classic Bucket thresholds within a Sample MUST be unique. Classic Bucket thresholds MAY be negative.
289
292
290
-
A Classic Bucket MUST count the number of measured values less than or equal to its threshold, including measured values that are also counted in lower buckets. This allow monitoring systems to drop any non-+Inf bucket for performance/anti-denial-of-service reasons in a way that loses granularity but is still a valid Histogram.
293
+
// CONSISTENCY +-Inf?
294
+
A Classic Bucket MUST count the number of measured values less than or equal to its threshold, including measured values that are also counted in lower buckets. This allows monitoring systems to drop any non-+Inf bucket for performance or anti-denial-of-service reasons in a way that loses granularity but is still a valid Histogram.
291
295
292
-
As an example, for a metric representing request latency in seconds with Classic Buckets and thresholds 1, 2, 3, and +Inf, it follows that value_1 <= value_2 <= value_3 <= value_+Inf. If ten requests took 1 second each, the values of the 1, 2, 3, and +Inf buckets will be all equal to 10.
296
+
As an example, for a metric representing request latency in seconds with Classic Buckets and thresholds 1, 2, 3, and +Inf, it follows that value_1 <= value_2 <= value_3 <= value_+Inf. If ten requests took one second each, the values of the 1, 2, 3, and +Inf buckets will be all equal to 10.
293
297
294
298
Histogram Samples with Classic Buckets MUST have one Classic Bucket with a +Inf threshold. The +Inf bucket counts all measurements. The Count value MUST be equal to the value of the +Inf bucket.
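The cumulative counting rule can be sketched as follows. This assumes the raw measurements are available as a list and that none of them is NaN (NaN handling is specified separately); `classic_bucket_counts` is a hypothetical name used only for illustration.

```python
import math


def classic_bucket_counts(observations, thresholds):
    """Map each threshold to the number of observations <= that threshold.

    Hypothetical helper. A +Inf bucket is always added, so its value
    equals the total number of observations (the Count).
    Assumes no observation is NaN.
    """
    bounds = sorted(set(thresholds) | {math.inf})
    return {le: sum(1 for v in observations if v <= le) for le in bounds}


# Ten requests of one second each: every bucket holds all ten.
counts = classic_bucket_counts([1.0] * 10, [1, 2, 3])
assert counts == {1: 10, 2: 10, 3: 10, math.inf: 10}
```

Because each bucket also includes everything counted by the lower buckets, dropping any bucket other than +Inf leaves the remaining values valid.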
295
299
@@ -299,16 +303,13 @@ If the NaN value is allowed, it MUST be counted in the +Inf bucket, and MUST NOT
299
303
300
304
##### Native Buckets
301
305
302
-
Histogram Samples with Native Buckets MUST have a Schema value. The Schema MUST be an 8bit signed integer between -4 and 8 (inclusive), these are called Standard (exponential) schemas.
306
+
Histogram Samples with Native Buckets MUST have a Schema value. The Schema MUST be an 8-bit signed integer between -4 and 8 (inclusive), these are called Standard (exponential) schemas.
303
307
304
-
Schema values outside the -4 to 8 range are reserved for future use and MUST NOT be used. In particular:
308
+
Schema values outside the -4 to 8 range are reserved for future use and MUST NOT be used.
305
309
306
-
* Schema values between -9 to -5 and 9 to 52 are reserved for use as Standard (exponential) Schemas.
307
-
* Schema value equal to -53 is reserved for use for Custom Buckets Schema.
310
+
For any Standard Schema `n`, the Histogram Sample MAY contain positive and/or negative Native Buckets and MUST contain a zero Native Bucket. Empty positive or negative Native Buckets SHOULD NOT be present.
308
311
309
-
For any Standard Schema n, the Histogram Sample MAY contain positive and/or negative Native Buckets and MUST contain a zero Native Bucket. Empty positive or negative Native Buckets SHOULD NOT be present.
310
-
311
-
In case of Standard Schemas, the boundaries of a positive or negative Native Bucket with index i MUST be calculated as follows (using Python syntax):
312
+
In case of Standard Schemas, the boundaries of a positive or negative Native Bucket with index `i` MUST be calculated as follows (using Python syntax):
312
313
313
314
The upper inclusive limit of a positive Native Bucket: `(2**2**-n)**i`
314
315
@@ -318,17 +319,18 @@ The lower inclusive limit of a negative Native Bucket: `-((2**2**-n)**i)`
318
319
319
320
The upper exclusive limit of a negative Native Bucket: `-((2**2**-n)**(i-1))`
320
321
321
-
i is an integer number that MAY be negative.
322
+
`i` is an integer number that MAY be negative.
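The boundary formulas can be evaluated with a small helper. This is a sketch under stated assumptions: the function name is hypothetical, the lower exclusive limit of a positive bucket is assumed to mirror the negative-bucket formulas (that formula is not visible in this part of the diff), and the MaxFloat64, MinFloat64 and infinity exceptions are not handled, so very large indices raise `OverflowError`.

```python
def native_bucket_bounds(schema: int, index: int, positive: bool = True):
    """Return (lower, upper) limits of a Native Bucket for a Standard Schema.

    Hypothetical helper; does not apply the MaxFloat64/MinFloat64 exceptions.
    Positive buckets: lower is exclusive (assumed by symmetry), upper inclusive.
    Negative buckets: lower is inclusive, upper is exclusive.
    """
    if not -4 <= schema <= 8:
        raise ValueError("schema must be between -4 and 8 (inclusive)")
    base = 2.0 ** (2.0 ** -schema)   # (2**2**-n) in the notation above
    near = base ** (index - 1)       # limit closer to zero
    far = base ** index              # limit farther from zero
    if positive:
        return near, far
    return -far, -near


# Schema 0 doubles the boundary with every index step.
assert native_bucket_bounds(0, 1) == (1.0, 2.0)
assert native_bucket_bounds(0, 3, positive=False) == (-8.0, -4.0)
```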
322
323
323
-
There are exceptions to the rules above concerning the largest and smallest finite values representable as a float64 (called MaxFloat64 and MinFloat64 in the following) and the positive and negative infinity values (+Inf and -Inf):
324
+
There are exceptions to the rules above concerning the largest and smallest finite values representable as a float64 (called MaxFloat64 and MinFloat64) and the positive and negative infinity values (+Inf and -Inf):
324
325
325
326
The positive Native Bucket that contains MaxFloat64 (according to the boundary formulas above) has an upper inclusive limit of MaxFloat64 (rather than the limit calculated by the formulas above, which would overflow float64).
326
327
327
-
The next positive Native Bucket (index i+1 relative to the bucket from the previous item) has a lower exclusive limit of MaxFloat64 and an upper inclusive limit of +Inf. (It could be called a positive Native overflow Bucket.)
328
+
The next positive Native Bucket (index `i+1` relative to the bucket from the previous item) has a lower exclusive limit of MaxFloat64 and an upper inclusive limit of +Inf. (It could be called a positive Native overflow Bucket.)
328
329
329
330
The negative Native Bucket that contains MinFloat64 (according to the boundary formulas above) has a lower inclusive limit of MinFloat64 (rather than the limit calculated by the formulas above, which would underflow float64).
330
331
331
-
The next negative Native Bucket (index i+1 relative to the bucket from the previous item) has an upper exclusive limit of MinFloat64 and an lower inclusive limit of -Inf. (It could be called a negative Native overflow Bucket.)
332
+
// MAYBE: kind of underflow?
333
+
The next negative Native Bucket (index `i+1` relative to the bucket from the previous item) has an upper exclusive limit of MinFloat64 and a lower inclusive limit of -Inf. (It could be called a negative Native overflow Bucket.)
332
334
333
335
Native Buckets beyond the +Inf and -Inf buckets described above MUST NOT be used.
334
336
@@ -342,6 +344,7 @@ If the NaN value is allowed, it MUST NOT be counted in any Native Bucket, and MU
342
344
343
345
#### GaugeHistogram
344
346
347
+
// NOTE: To re-read
345
348
GaugeHistograms measure current distributions. Common examples are how long items have been waiting in a queue, or size of the requests in a queue.
346
349
347
350
A GaugeHistogram Sample MUST contain Gcount, Gsum values.
@@ -362,7 +365,7 @@ Float and negative bucket values are allowed to make it possible to expose resul
362
365
363
366
A GaugeHistogram SHOULD NOT include NaN measurements. If a GaugeHistogram includes NaN measurements, then NaN measurements MUST be counted in the Gcount and the Gsum MUST be NaN.
364
367
365
-
If a GaugeHistogram includes +Inf or -Inf measurement, then +Inf or -Inf MUST be counted in Gcount and MUST be added to the Gsum, potentially resulting in +Inf, -Inf or NaN in the Gsum, the later for example in case of adding +Inf to -Inf.
368
+
If a GaugeHistogram includes +Inf or -Inf measurement, then +Inf or -Inf MUST be counted in Gcount and MUST be added to the Gsum, potentially resulting in +Inf, -Inf or NaN in the Gsum, the latter for example in case of adding +Inf to -Inf.
366
369
367
370
If the GaugeHistogram Metric has Samples with Classic Buckets, the GaugeHistogram's Metric's LabelSet MUST NOT have a "le" label name, because in case the Samples are stored as classic histogram series with the `_bucket` suffix, then the "le" label in the GaugeHistogram will conflict with the "le" label generated from the bucket thresholds.
368
371
@@ -372,17 +375,21 @@ The exemplars for a GaugeHistogram follow all the same rules as for a Histogram.
372
375
373
376
#### Summary
374
377
375
-
Summaries also measure distributions of discrete events and MAY be used when Histograms are too expensive and/or an average event size is sufficient.
378
+
Summaries also measure distributions of discrete events and MAY be used when Histograms are too expensive and a small number of precomputed quantiles is sufficient.
376
379
377
-
They MAY also be used for backwards compatibility, because some existing instrumentation libraries expose precomputed quantiles and do not support Histograms. Precomputed quantiles SHOULD NOT be used, because quantiles are not aggregatable and the user often can not deduce what timeframe they cover.
380
+
// DISCUSSION: Main reason is hard to migrate
381
+
Summaries SHOULD NOT be used, because quantiles are not aggregatable and the user often can not deduce what timeframe they cover. They MAY be used for backwards compatibility, because some existing instrumentation libraries expose precomputed quantiles and do not support Histograms.
378
382
379
383
A Summary Sample MUST contain a Count, Sum and a set of quantiles.
380
384
381
385
Semantically, Count and Sum values are counters so MUST NOT be NaN or negative. Count MUST be an integer.
382
386
383
-
A Summary SHOULD have a Timestamp value called Start Timestamp. This can help ingestors discern between new metrics and long-running ones it did not see before. Start Timestamp MUST NOT relate to the collection period of quantile values.
387
+
// TODO: ST section/fix
388
+
A Summary SHOULD have a Timestamp value called Start Timestamp. This can help ingestors discern between new metrics and long-running ones it did not see before.
389
+
390
+
Start Timestamp MUST NOT be based on the collection period of quantile values.
384
391
385
-
Quantiles are a map from a quantile to a value. An example is a quantile 0.95 with value 0.2 in a metric called myapp_http_request_duration_seconds which means that the 95th percentile latency is 200ms over an unknown timeframe. If there are no events in the relevant timeframe, the value for a quantile MUST be NaN. A Quantile's Metric's LabelSet MUST NOT have "quantile" label name. Quantiles MUST be between 0 and 1 inclusive. Quantile values MUST NOT be negative. Quantile values SHOULD represent the recent values. Commonly this would be over the last 5-10 minutes.
392
+
Quantiles are a map from a quantile to a value. An example is a quantile 0.95 with value 0.2 in a metric called `myapp_http_request_duration_seconds` which means that the 95th percentile latency is 200ms over an unknown timeframe. If there are no events in the relevant timeframe, the value for a quantile MUST be NaN. A Quantile's Metric's LabelSet MUST NOT have "quantile" label name. Quantiles MUST be between 0 and 1 inclusive. Quantile values MUST NOT be negative. Quantile values SHOULD represent the recent values. Commonly this would be over the last 5-10 minutes.
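The numeric constraints on quantiles can be sketched as a check over a plain mapping from quantile to value. The function name `validate_quantiles` and the mapping representation are assumptions for illustration; the label-name rule is not covered here.

```python
import math


def validate_quantiles(quantiles):
    """Return a list of problems found in a {quantile: value} mapping.

    Hypothetical helper. NaN values are accepted because a quantile
    with no events in the relevant timeframe is NaN.
    """
    errors = []
    for q, value in quantiles.items():
        if not 0 <= q <= 1:
            errors.append(f"quantile {q} is outside [0, 1]")
        if not math.isnan(value) and value < 0:
            errors.append(f"value {value} for quantile {q} is negative")
    return errors


# The example above: quantile 0.95 with value 0.2 is valid.
assert validate_quantiles({0.95: 0.2}) == []
```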
386
393
387
394
#### Unknown
388
395
@@ -394,25 +401,27 @@ A Sample in a metric with the Unknown Type MUST have a Number or CompositeValue
394
401
395
402
The OpenMetrics formats are Regular Chomsky Grammars, making writing quick and small parsers possible.
396
403
404
+
// MAYBE: Be clear on failure modes.
397
405
Partial or invalid expositions MUST be considered erroneous in their entirety.
398
406
399
407
> NOTE: Previous versions of [OpenMetrics](https://prometheus.io/docs/specs/om/open_metrics_spec/#protobuf-format) used to specify an [OpenMetric protobuf format](https://github.com/prometheus/OpenMetrics/blob/3bb328ab04d26b25ac548d851619f90d15090e5d/proto/openmetrics_data_model.proto). OpenMetrics 2.0 does not include the protobuf representation. For available formats, including the official [Prometheus protobuf wire format](https://prometheus.io/docs/instrumenting/exposition_formats/#protobuf-format), see [exposition formats documentation](https://prometheus.io/docs/instrumenting/exposition_formats).
400
408
401
409
### Protocol Negotiation
402
410
403
-
All ingestor implementations MUST be able to ingest data secured with TLS 1.2 or later. All exposers SHOULD be able to emit data secured with TLS 1.2 or later. ingestor implementations SHOULD be able to ingest data from HTTP without TLS. All implementations SHOULD use TLS to transmit data.
411
+
// MAYBE: Require encryption? 1.2 is safe?
412
+
All ingestor implementations MUST be able to ingest data secured with TLS 1.2 or later. All exposers SHOULD be able to emit data secured with TLS 1.2 or later. Ingestor implementations SHOULD be able to ingest data from HTTP without TLS. All implementations SHOULD use TLS to transmit data.
404
413
414
+
// TODO: Fix the sentence
405
415
Negotiation of what version of the OpenMetrics format to use is out-of-band. For example for pull-based exposition over HTTP standard HTTP content type negotiation is used, and MUST default to the oldest version of the standard (i.e. 1.0.0) if no newer version is requested.
406
416
417
+
// MAYBE: Exposer? Also fallback to text format?
407
418
Push-based negotiation is inherently more complex, as the exposer typically initiates the connection. Producers MUST use the oldest version of the standard (i.e. 1.0.0) unless requested otherwise by the ingestor.
408
419
409
420
### ABNF
410
421
411
422
ABNF as per RFC 5234
412
423
413
-
<!---
414
-
# EDITOR’S NOTE: Should we update to RFC 7405, in particular the case insensitive bits?
415
-
-->
424
+
// MAYBE: Should we update to RFC 7405, in particular the case insensitive bits?
416
425
417
426
"exposition" is the top level token of the ABNF.
418
427
@@ -701,6 +710,7 @@ It is also valid to have:
701
710
702
711
If the unit is known it SHOULD be provided.
703
712
713
+
// WHAT: value of the line???
704
714
The value of a UNIT or HELP line MAY be empty. This MUST be treated as if no metadata line for the MetricFamily existed.
705
715
706
716
Full example:
@@ -1133,6 +1143,7 @@ It is intended to transport snapshots of state at the time of data transmission
1133
1143
1134
1144
How ingestors discover which exposers exist, and vice-versa, is out of scope for and thus not defined in this standard.
1135
1145
1146
+
// MINOR: on top of https://github.com/prometheus/docs/pull/2905/changes#r2963439248 (UTF8 and MetricName == MF name)
1136
1147
### Extensions and Improvements
1137
1148
1138
1149
This first version of OpenMetrics is based upon well established and de facto standard Prometheus text format 0.0.4, deliberately without adding major syntactic or semantic extensions, or optimisations on top of it. For example no attempt has been made to make the text representation of Histogram buckets more compact, relying on compression in the underlying stack to deal with their repetitive nature.