Metrics SDK

MeterProvider

A MeterProvider MUST provide a way to allow a Resource to be specified. If a Resource is specified, it SHOULD be associated with all the metrics produced by any Meter from the MeterProvider. The tracing SDK specification has provided some suggestions regarding how to implement this efficiently.

Meter Creation

New Meter instances are always created through a MeterProvider (see API). The name, version (optional), and schema_url (optional) arguments supplied to the MeterProvider MUST be used to create an InstrumentationLibrary instance which is stored on the created Meter.

Configuration (i.e., MetricExporters, MetricReaders and Views) MUST be managed solely by the MeterProvider and the SDK MUST provide a way to configure all options that are implemented by the SDK. This MAY be done at the time of MeterProvider creation if appropriate.

The MeterProvider MAY provide methods to update the configuration. If configuration is updated (e.g., adding a MetricReader), the updated configuration MUST also apply to all already returned Meters (i.e. it MUST NOT matter whether a Meter was obtained from the MeterProvider before or after the configuration change). Note: Implementation-wise, this could mean that Meter instances have a reference to their MeterProvider and access configuration only via this reference.
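The note above can be illustrated with a minimal sketch. The names used here (SketchMeterProvider, SketchMeter, add_metric_reader, readers) are hypothetical and do not belong to any real SDK; the only point is that a Meter keeps a reference to its MeterProvider instead of a copy of the configuration, so a later change is visible to Meters that were handed out earlier.

```python
class SketchMeterProvider:
    """Hypothetical provider that solely owns the configuration."""

    def __init__(self):
        self._readers = []  # configuration lives only on the provider

    def add_metric_reader(self, reader):
        self._readers.append(reader)
        return self

    def get_meter(self, name, version=None, schema_url=None):
        return SketchMeter(self, name, version, schema_url)


class SketchMeter:
    """Hypothetical Meter that reads configuration through its provider."""

    def __init__(self, provider, name, version, schema_url):
        self._provider = provider
        # Stand-in for the InstrumentationLibrary stored on the Meter.
        self.instrumentation_library = (name, version, schema_url)

    @property
    def readers(self):
        # Always look up the provider's current configuration.
        return tuple(self._provider._readers)


provider = SketchMeterProvider()
early_meter = provider.get_meter("library.a")  # obtained before the change
provider.add_metric_reader("reader-1")         # configuration update
late_meter = provider.get_meter("library.b")   # obtained after the change
```

In this sketch both early_meter and late_meter observe the same reader list, regardless of when they were obtained.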

Shutdown

This method provides a way for the provider to do any cleanup required.

Shutdown MUST be called only once for each MeterProvider instance. After the call to Shutdown, subsequent attempts to get a Meter are not allowed. SDKs SHOULD return a valid no-op Meter for these calls, if possible.

Shutdown SHOULD provide a way to let the caller know whether it succeeded, failed or timed out.

Shutdown SHOULD complete or abort within some timeout. Shutdown MAY be implemented as a blocking API or an asynchronous API which notifies the caller via a callback or an event. OpenTelemetry SDK authors MAY decide if they want to make the shutdown timeout configurable.

Shutdown MUST be implemented at least by invoking Shutdown on all registered MetricReader and MetricExporter instances.
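The Shutdown requirements can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: all class names are hypothetical, the result is modeled as a plain boolean, a repeated call is reported as a failure (one possible choice), and timeout handling is left out for brevity.

```python
class NoOpMeter:
    """Hypothetical Meter that ignores everything it is asked to do."""


class SketchReader:
    """Hypothetical MetricReader; fail=True simulates a shutdown error."""

    def __init__(self, fail=False):
        self.shut_down = False
        self._fail = fail

    def shutdown(self):
        if self._fail:
            raise RuntimeError("reader failed to shut down")
        self.shut_down = True


class SketchMeterProvider:
    def __init__(self, readers):
        self._readers = list(readers)
        self._shut_down = False

    def get_meter(self, name):
        if self._shut_down:
            return NoOpMeter()  # valid no-op Meter after Shutdown
        return object()         # stand-in for a real Meter

    def shutdown(self):
        """Return True on success, False on failure or on a repeated call."""
        if self._shut_down:
            return False
        self._shut_down = True
        ok = True
        for reader in self._readers:
            try:
                reader.shutdown()
            except Exception:
                ok = False  # keep going so every reader gets a chance to clean up
        return ok


readers = [SketchReader(), SketchReader()]
provider = SketchMeterProvider(readers)
first = provider.shutdown()
second = provider.shutdown()
meter_after = provider.get_meter("late")
```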

ForceFlush

This method provides a way for the provider to notify the registered MetricReader and MetricExporter instances, so they can do as much as they can to consume or send the metrics. Note: unlike a Push Metric Exporter, which can send data on its own schedule, a Pull Metric Exporter can only send data when the scraper asks for it, so ForceFlush would not make much sense for it.

ForceFlush SHOULD provide a way to let the caller know whether it succeeded, failed or timed out. ForceFlush SHOULD return some ERROR status if there is an error condition; if there is no error condition, it should return some NO ERROR status. Language implementations MAY decide how to model ERROR and NO ERROR.

ForceFlush SHOULD complete or abort within some timeout. ForceFlush MAY be implemented as a blocking API or an asynchronous API which notifies the caller via a callback or an event. OpenTelemetry SDK authors MAY decide if they want to make the flush timeout configurable.

ForceFlush MUST invoke ForceFlush on all registered MetricReader and Push Metric Exporter instances.
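One way to model the three outcomes (succeeded, failed, timed out) is sketched below. The FlushResult enum, the SketchReader class, the force_flush_all function and the 30 second default are all assumptions made for illustration. The sketch is blocking, shares a single deadline across all readers, and aborts once the deadline has passed.

```python
import enum
import time


class FlushResult(enum.Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    TIMEOUT = "timeout"


class SketchReader:
    """Hypothetical reader; force_flush returns True when the flush worked."""

    def __init__(self, ok=True):
        self._ok = ok
        self.flush_calls = 0

    def force_flush(self, timeout_s):
        self.flush_calls += 1
        return self._ok


def force_flush_all(readers, timeout_s=30.0):
    """Flush every reader, giving each one the time left until the deadline."""
    deadline = time.monotonic() + timeout_s
    result = FlushResult.SUCCESS
    for reader in readers:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return FlushResult.TIMEOUT  # abort: the overall timeout has expired
        try:
            if not reader.force_flush(remaining):
                result = FlushResult.FAILURE
        except Exception:
            result = FlushResult.FAILURE
    return result


good = [SketchReader(), SketchReader()]
status_ok = force_flush_all(good)
status_bad = force_flush_all([SketchReader(), SketchReader(ok=False)])
status_timeout = force_flush_all(good, timeout_s=0.0)
```

A failing reader does not stop the loop, so the remaining readers are still flushed; only an expired deadline ends the call early.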

View

A View provides SDK users with the flexibility to customize the metrics that are output by the SDK. Here are some examples when a View might be needed:

Customize which Instruments are to be processed/ignored. For example, an instrumented library can provide both temperature and humidity, but the application developer might only want temperature.
Customize the aggregation, if the default aggregation associated with the Instrument does not meet the needs of the user. For example, an HTTP client library might expose HTTP client request duration as Histogram by default, but the application developer might only want the total count of outgoing requests.
Customize which attribute(s) are to be reported as metrics dimension(s). For example, an HTTP server library might expose HTTP verb (e.g. GET, POST) and HTTP status code (e.g. 200, 301, 404). The application developer might only care about HTTP status code (e.g. reporting the total count of HTTP requests for each HTTP status code). There could also be extreme scenarios in which the application developer does not need any dimension (e.g. just get the total count of all incoming requests).
Add additional dimension(s) from the Context. For example, a Baggage value might be available indicating whether an HTTP request is coming from a bot/crawler or not. The application developer might want this to be converted to a dimension for HTTP server metrics (e.g. the request/second from bots vs. real users).
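The dimension example above can be made concrete with a small, self-contained sketch. The attribute names and the count_by helper are hypothetical; the sketch only shows how keeping a subset of attribute keys collapses several measurements into fewer streams of counts.

```python
from collections import Counter

# Hypothetical measurements: one incoming request each, with two attributes.
measurements = [
    {"http.method": "GET", "http.status_code": 200},
    {"http.method": "POST", "http.status_code": 200},
    {"http.method": "GET", "http.status_code": 404},
]


def count_by(measurements, attribute_keys):
    """Count measurements per combination of the kept attribute keys."""
    totals = Counter()
    for attributes in measurements:
        kept = tuple(
            sorted((k, v) for k, v in attributes.items() if k in attribute_keys)
        )
        totals[kept] += 1
    return dict(totals)


# Only the status code is kept as a dimension.
by_status = count_by(measurements, ["http.status_code"])

# The extreme scenario: no dimension at all, just the total count.
no_dimensions = count_by(measurements, [])
```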

The SDK MUST provide the means to register Views with a MeterProvider. Here are the inputs:

The Instrument selection criteria (required), which covers:
- The type of the Instrument(s) (optional).
- The name of the Instrument(s), with wildcard support (optional).
- The name of the Meter (optional).
- The version of the Meter (optional).
- The schema_url of the Meter (optional).
- OpenTelemetry SDK authors MAY choose to support more criteria. For example, a strongly typed language MAY support point type (e.g. allow the users to select Instruments based on whether the underlying type is integer or double).
- The criteria SHOULD be treated as additive, which means the Instrument has to meet all the provided criteria. For example, if the criteria are instrument name == “Foobar” and instrument type is Histogram, it will be treated as (instrument name == “Foobar”) AND (instrument type is Histogram).
- If none of the optional criteria is provided, the SDK SHOULD treat it as an error. It is recommended that SDK implementations fail fast. Please refer to Error handling in OpenTelemetry for the general guidance.
The name of the View (optional). If not provided, the Instrument name would be used by default. This will be used as the name of the metrics stream.
The configuration for the resulting metrics stream:
- The description. If not provided, the Instrument description would be used by default.
- A list of attribute keys (optional). If provided, the attributes that are not in the list will be ignored. If not provided, all the attribute keys will be used by default (TODO: once the Hint API is available, the default behavior should respect the Hint if it is available).
- The extra dimensions which come from Baggage/Context (optional). If not provided, no extra dimension will be used. Please note that this only applies to synchronous Instruments.
- The aggregation (optional) to be used. If not provided, the SDK SHOULD apply a default aggregation. If the aggregation has temporality, the SDK SHOULD use the temporality override rules to determine the aggregation temporality.
- The exemplar_reservoir (optional) to use for storing exemplars. This should be a factory or callback similar to aggregation which allows different reservoirs to be chosen by the aggregation.
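The Instrument selection criteria can be sketched as a small value type. Everything below is an assumption made for illustration: the InstrumentInfo and InstrumentSelector names, the choice of three criteria, and the use of Python's fnmatch syntax for wildcards (this document does not define a wildcard syntax). The sketch treats the criteria as additive and fails fast when none is given.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class InstrumentInfo:
    """Hypothetical description of an Instrument."""
    name: str
    kind: str
    meter_name: str


@dataclass(frozen=True)
class InstrumentSelector:
    """Hypothetical selection criteria; every field is optional."""
    instrument_type: Optional[str] = None
    instrument_name: Optional[str] = None  # may contain wildcards
    meter_name: Optional[str] = None

    def __post_init__(self):
        criteria = (self.instrument_type, self.instrument_name, self.meter_name)
        if all(c is None for c in criteria):
            # Fail fast: a selector without any criterion is an error.
            raise ValueError("at least one selection criterion is required")

    def matches(self, instrument):
        # Additive semantics: every provided criterion has to match.
        if self.instrument_type is not None and instrument.kind != self.instrument_type:
            return False
        if self.instrument_name is not None and not fnmatchcase(
            instrument.name, self.instrument_name
        ):
            return False
        if self.meter_name is not None and instrument.meter_name != self.meter_name:
            return False
        return True


foobar_histogram = InstrumentInfo("Foobar", "Histogram", "meter.a")
foobar_counter = InstrumentInfo("Foobar", "Counter", "meter.a")
selector = InstrumentSelector(instrument_type="Histogram", instrument_name="Foobar")
wildcard = InstrumentSelector(instrument_name="Foo*")
```

Here selector matches only foobar_histogram, while wildcard matches both Instruments because it constrains the name alone.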

The SDK SHOULD use the following logic to determine how to process Measurements made with an Instrument:

Determine the MeterProvider which “owns” the Instrument.
If the MeterProvider has no View registered, take the Instrument and apply the default configuration.
If the MeterProvider has one or more View(s) registered:
- For each View, if the Instrument could match the instrument selection criteria:
  - Try to apply the View configuration. If there is an error (e.g. the View asks for extra dimensions from the Baggage, but the Instrument is asynchronous and therefore has no Context) or a conflict (e.g. the View requires the metrics to be exported under a certain name, but the name is already used by another View), provide a way to let the user know (e.g. expose self-diagnostics logs).
- If the Instrument could not match with any of the registered View(s), the SDK SHOULD provide a default behavior. The SDK SHOULD also provide a way for the user to turn off the default behavior via MeterProvider (which means the Instrument will be ignored when there is no match). Individual implementations can decide what the default behavior is, and how to turn the default behavior off.
END.
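The logic above can be sketched as a single function. This is a simplified illustration, not the SDK's algorithm: a View is reduced to a hypothetical (name_pattern, stream_name) pair, matching uses Python's fnmatch syntax as an assumed wildcard syntax, conflicts are reported by appending to a plain list in place of self-diagnostics logs, and the default behavior for an unmatched Instrument is assumed to be "export under the Instrument name".

```python
from fnmatch import fnmatchcase


def resolve_streams(instrument_name, views, used_names, warnings, default_enabled=True):
    """Return the metric stream names produced for one Instrument.

    views is a list of hypothetical (name_pattern, stream_name) pairs;
    a stream_name of None means "use the Instrument name".
    """
    if not views:
        return [instrument_name]  # no View registered: default configuration

    streams = []
    matched = False
    for pattern, stream_name in views:
        if not fnmatchcase(instrument_name, pattern):
            continue
        matched = True
        name = stream_name or instrument_name
        if name in used_names:
            # Conflict: let the user know instead of failing silently.
            warnings.append(f"stream name {name!r} is already in use")
            continue
        used_names.add(name)
        streams.append(name)

    if not matched and default_enabled:
        return [instrument_name]  # assumed default behavior for unmatched Instruments
    return streams


views = [("X", None), ("Y", "Foo"), ("Y", "Bar")]
used, warnings = set(), []
x = resolve_streams("X", views, used, warnings)
y = resolve_streams("Y", views, used, warnings)
z_ignored = resolve_streams("Z", views, used, warnings, default_enabled=False)
z_default = resolve_streams("Z", views, used, warnings)
conflict = resolve_streams("W", [("W", "Foo")], used, warnings)
```

With these inputs, X keeps its own name, Y yields the two streams Foo and Bar, Z is dropped or exported depending on the default switch, and W is rejected with one warning because the name Foo is already taken.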

Here are some examples:

# Python
'''
+------------------+
| MeterProvider    |
|   Meter A        |
|     Counter X    |
|     Histogram Y  |
|   Meter B        |
|     Gauge Z      |
+------------------+
'''

# metrics from X and Y (reported as Foo and Bar) will be exported
(
    meter_provider
    .add_view("X")
    .add_view("Foo", instrument_name="Y")
    .add_view(
        "Bar",
        instrument_name="Y",
        aggregation=HistogramAggregation(buckets=[5.0, 10.0, 25.0, 50.0, 100.0]))
    .add_metric_reader(PeriodicExportingMetricReader(ConsoleExporter()))
)

# all the metrics will be exported using the default configuration
meter_provider.add_metric_reader(PeriodicExportingMetricReader(ConsoleExporter()))

# all the metrics will be exported using the default configuration
(
    meter_provider
    .add_view("*")  # a wildcard view that matches everything
    .add_metric_reader(PeriodicExportingMetricReader(ConsoleExporter()))
)

# Counter X will be exported as cumulative sum
(
    meter_provider
    .add_view("X", aggregation=SumAggregation(CUMULATIVE))
    .add_metric_reader(PeriodicExportingMetricReader(ConsoleExporter()))
)

# Counter X will be exported as delta sum
# Histogram Y and Gauge Z will be exported with 2 dimensions (a and b)
(
    meter_provider
    .add_view("X", aggregation=SumAggregation(DELTA))
    .add_view("*", attribute_keys=["a", "b"])
    .add_metric_reader(PeriodicExportingMetricReader(ConsoleExporter()))
)

Aggregation

An Aggregation, as configured via the View, informs the SDK on the ways and means to compute Aggregated Metrics from incoming Instrument Measurements.

An Aggregation specifies an operation (i.e. decomposable aggregate function like Sum, Histogram, Min, Max, Count) and optional configuration parameter overrides. The operation’s default configuration parameter values will be used unless overridden by optional configuration parameter overrides.
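The relationship between default parameter values and overrides can be sketched as follows. The default boundaries are placeholder values chosen for illustration, not defaults defined by this document, and the helper names are hypothetical. The sketch also assumes that each bucket includes its upper boundary.

```python
from bisect import bisect_left

# Placeholder defaults for illustration only.
HYPOTHETICAL_DEFAULTS = {"Boundaries": [0.0, 50.0, 100.0]}


def resolve_params(defaults, overrides=None):
    """Start from the operation's defaults and apply any overrides on top."""
    return {**defaults, **(overrides or {})}


def bucket_counts(values, boundaries):
    """Count values per bucket; each bucket includes its upper boundary."""
    counts = [0] * (len(boundaries) + 1)
    for value in values:
        counts[bisect_left(boundaries, value)] += 1
    return counts


default_params = resolve_params(HYPOTHETICAL_DEFAULTS)
params = resolve_params(HYPOTHETICAL_DEFAULTS, {"Boundaries": [0, 10, 100]})
counts = bucket_counts([5, 10, 50, 500], params["Boundaries"])
```

With the overridden boundaries [0, 10, 100], the values 5 and 10 fall into the second bucket, 50 into the third and 500 into the overflow bucket.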

Note: Implementors MAY choose the best idiomatic practice for their language to represent the semantics of an Aggregation and its optional configuration parameters.

For example, the View specifies an Aggregation by string name (e.g. “ExplicitBucketHistogram”).

# Use Histogram with custom boundaries
(
    meter_provider
    .add_view(
        "X",
        aggregation="ExplicitBucketHistogram",
        aggregation_params={"Boundaries": [0, 10, 100]})
)

e.g. The View specifies an Aggregation by class/type instance.

// Use Histogram with custom boundaries
meterProviderBuilder
  .AddView(
    instrumentName: "X",
    aggregation: new ExplicitBucketHistogramAggregation(
      boundaries: new double[] { 0.0, 10.0, 100.0 }
    )
  );

TODO: after we release the initial Stable version of Metrics SDK specification, we will explore how to allow configuring custom ExemplarReservoirs with the View API.

The SDK MUST provide the following Aggregation to support the Metric Points in the Metrics Data Model.

None Aggregation

The None Aggregation informs the SDK to ignore/drop all Instrument Measurements for this Aggregation.
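A minimal sketch of the difference between dropping and aggregating Measurements is shown below. Both classes and their record/collect methods are hypothetical and serve only to contrast the None Aggregation with an aggregation that keeps state.

```python
class SketchSumAggregation:
    """Hypothetical aggregation that keeps a running total."""

    def __init__(self):
        self._total = 0

    def record(self, value):
        self._total += value

    def collect(self):
        return self._total


class SketchNoneAggregation:
    """Hypothetical None Aggregation: every Measurement is dropped."""

    def record(self, value):
        pass  # intentionally ignore the Measurement

    def collect(self):
        return None  # nothing was kept, so there is nothing to export


summed = SketchSumAggregation()
dropped = SketchNoneAggregation()
for value in (1, 2, 3):
    summed.record(value)
    dropped.record(value)
```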