6.2. Working with Metrics #
This section describes the steps required to manage metrics.
6.2.1. General Setup #
6.2.1.1. Adding and Configuring the postgrespro Receiver #
To collect metrics from the database instance, add the postgrespro receiver to the receivers section and specify its configuration.
Required configuration:

- Specify the database instance connection parameters.
- Specify the list of plugins for data collection.

Additional configuration:

- Collection parameters: parallelism, delay, interval.
receivers:
  postgrespro:
    max_threads: 3
    collection_interval: 60s
    initial_delay: 1s
    transport: tcp
    endpoint: localhost:5432
    database: postgres
    username: postgres
    password: ${env:POSTGRESQL_PASSWORD}
    metrics: null
    plugins:
      activity:
        enabled: true
      bgwriter:
        enabled: true
      locks:
        enabled: true
      version:
        enabled: true
      wal:
        enabled: true
      cache:
        enabled: true
Some plugins have additional configuration parameters. For example, the plugins that collect metrics from DBMS objects (tablespaces, databases, tables, indexes) can be configured so that collection only takes place for a specified number of objects. This allows you to control the load on the database instance and the amount of data sent through the pipeline to an exporter. A detailed description of the configuration parameters for each plugin can be found in the /usr/share/doc/pgpro-otel-collector/examples directory.
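To illustrate the general shape of such per-plugin settings, the sketch below enables the databases plugin with a hypothetical limit option restricting how many objects are collected; the actual option names are listed in the shipped examples:

receivers:
  postgrespro:
    plugins:
      databases:
        enabled: true
        # Hypothetical option restricting collection to a fixed number of objects;
        # consult /usr/share/doc/pgpro-otel-collector/examples for the real names.
        limit: 100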
The receiver can also use Unix sockets when the endpoint is defined as shown below.
receivers:
  postgrespro:
    ...
    transport: unix
    endpoint: /tmp:5432 # Or 'tmp:5432'
    ...
6.2.1.2. Adding and Configuring the hostmetrics Receiver #
The hostmetrics receiver is an open-source component of the OpenTelemetry Collector and is used for collecting metrics from the operating system. For detailed information about this receiver, refer to the OpenTelemetry documentation.
To configure the hostmetrics receiver, it is sufficient to list the plugins (scrapers) for data collection. Collection parameters are also available: delay and interval.
Some plugins also have additional configuration parameters.
receivers:
  hostmetrics:
    collection_interval: 60s
    initial_delay: 1s
    scrapers:
      cpu:
        metrics:
          system.cpu.utilization:
            enabled: true
      disk: null
      load: null
      memory: null
      network: null
6.2.1.3. Adding and Configuring the SQL Query Receiver #
The sqlquery receiver is an open-source component of the OpenTelemetry Collector for gathering metrics and/or logs from custom SQL queries.
Warning
The sqlquery receiver is currently experimental and is not recommended for production use.
To set up the sqlquery receiver for collecting metrics, follow the example procedure below.
1. Create the sqlquery.yml configuration file.

2. In the created configuration file, specify database connection parameters in the receivers.sqlquery section:

   receivers:
     sqlquery:
       driver: postgres
       host: localhost
       port: 5432
       database: postgres
       username: postgres
       password: ${env:POSTGRESQL_PASSWORD}
       # Additional driver-specific connection parameters
       additional_params:
         application_name: pgpro-otel-collector
         sslmode: disable
       # The time interval between query executions. Default: 10s
       collection_interval: 60s
       # Defines setup for the component's own telemetry
       telemetry:
         logs:
           # If true, each executed query is logged at debug level
           query: false
       # The maximum number of open connections to the Postgres Pro server. Default: 0 (unlimited)
       max_open_conn: 5

   The password parameter supports environment variable substitution, as shown in the example. Special characters in the credentials are automatically URL-encoded to ensure proper connection string formatting.

   Alternatively, use the datasource parameter to provide a complete connection string:

   datasource: "host=localhost port=5432 user=postgres password=postgres application_name=pgpro-otel-collector sslmode=disable"
3. List queries to collect metrics. Each query consists of an SQL statement and a metrics section. There may be several metrics sections, but at least one such section is required. Each metric in the configuration produces one OpenTelemetry metric per row returned from the SQL query.

   receivers:
     sqlquery:
       ...
       queries:
         - sql: >-
             SELECT p.id, p.name, p.price, sum(s.units) AS sold
             FROM products p
             LEFT JOIN sales s ON (p.id = product_id)
             GROUP BY p.id, p.name, p.price
           metrics:
             # Gauge metric example
             # The name assigned to the OpenTelemetry metric
             - metric_name: postgresql.products.price
               value_column: price
               data_type: gauge
               value_type: double
               description: Price of the product
               unit: rub
               static_attributes:
                 database: postgres
               attribute_columns: ["name"]
             # Sum metric example
             - metric_name: postgresql.products.sold
               value_column: sold
               data_type: sum
               value_type: int
               description: Total count of sold products
               # Whether a cumulative sum value is monotonically increasing (i.e. never rolls over or resets)
               # Default = false
               monotonic: true
               aggregation: cumulative
               static_attributes:
                 database: postgres
               attribute_columns: ["name"]

   This example assumes the following database schema:

   CREATE TABLE products (id INTEGER, name TEXT, price NUMERIC);
   INSERT INTO products VALUES (1, 'Cheese', 9.99), (2, 'Bread', 1.99), (3, 'Milk', 2.99);
   CREATE TABLE sales (id BIGINT, product_id BIGINT, units INT, sold_at TIMESTAMP DEFAULT now());
   INSERT INTO sales VALUES (1, 1, 10), (2, 2, 20), (2, 2, 50), (3, 3, 30);

   The query creates two metrics: a gauge for current product prices and a cumulative sum for total units sold per product.
4. Use data_type to specify the metric type. Set it to gauge for metrics representing a current value, or sum for metrics representing an aggregated total. When using sum, set the aggregation parameter to either cumulative (default) or delta.
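   If a query returns per-interval increments rather than a running total, delta aggregation may be the better fit. A minimal sketch, assuming a hypothetical sales_last_interval view that returns only the rows added since the previous collection:

   queries:
     - sql: >-
         SELECT name, units FROM sales_last_interval
       metrics:
         # Hypothetical metric name, for illustration only
         - metric_name: postgresql.products.sold_delta
           value_column: units
           data_type: sum
           value_type: int
           aggregation: delta
           attribute_columns: ["name"]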
Note

Avoid queries that produce NULL values. If a query returns NULL in a column referenced in the configuration, errors will be logged, but the receiver will continue operating.
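One way to avoid this is to substitute defaults directly in the query. A sketch based on the schema above, using COALESCE so that products without sales report 0 instead of NULL:

   queries:
     - sql: >-
         SELECT p.name, COALESCE(sum(s.units), 0) AS sold
         FROM products p
         LEFT JOIN sales s ON (p.id = s.product_id)
         GROUP BY p.name
       metrics:
         - metric_name: postgresql.products.sold
           value_column: sold
           data_type: sum
           value_type: int
           monotonic: true
           aggregation: cumulative
           attribute_columns: ["name"]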
5. Configure exporters, processors, and the service pipeline:

   ...
   exporters:
     prometheus:
       endpoint: :8889
       send_timestamps: true
     otlphttp/sqlquery/metrics:
       compression: ''
       endpoint: https://metrics.example.org:8080
       headers:
         X-Ppem-Source-Agent-Name: local
         X-Ppem-Source-Instance-Port: '5432'
   processors:
     batch/sqlquery:
       send_batch_size: 2048
       timeout: 10s
   service:
     # Telemetry for the collector itself
     telemetry:
       logs:
         # Sets the minimum enabled logging level
         # Values: debug, info, warn, error
         # Default = info
         level: info
     pipelines:
       metrics/sqlquery:
         receivers: [ sqlquery ]
         processors: [ batch/sqlquery ]
         exporters: [ prometheus, otlphttp/sqlquery/metrics ]

6. Start pgpro-otel-collector with the created configuration file:

   build/pgpro-otel-collector/pgpro-otel-collector --config configs/sqlquery.yml
For the full list of the sqlquery configuration parameters, refer to the OpenTelemetry documentation.
6.2.1.4. Adding and Configuring the metricstransform Processor #
The metricstransform processor is an open-source component of the OpenTelemetry Collector for renaming metrics and managing labels, including scaling and aggregation operations. For more details, refer to the OpenTelemetry documentation.
An example setup of the metricstransform processor is shown in the procedure below.
1. Create the metrics_transform.yml configuration file.

2. Add the configuration sections that match your needs from the examples below:
   - Rename a metric, for example, postgresql.databases.size_bytes to postgresql.db.size_bytes:

     processors:
       metricstransform/rename:
         transforms:
           - include: postgresql.databases.size_bytes
             # Default = strict
             match_type: strict
             action: update
             new_name: postgresql.db.size_bytes

     The action parameter controls the transformation type: update modifies existing metrics, insert creates cloned metrics, and combine merges multiple metrics into a new one.

   - Use regular expressions to rename multiple metrics at once:
     metricstransform/rename_by_regexp:
       transforms:
         - include: ^postgresql\.databases\.(.*)$$
           match_type: regexp
           action: update
           new_name: postgresql.db.$${1}

   - Create a duplicate of the metric with a new name:
     metricstransform/create_new:
       transforms:
         - include: postgresql.databases.size_bytes
           match_type: strict
           action: insert
           new_name: postgresql.db.size_bytes

   - Add a new label to a metric:
     metricstransform/add_label:
       transforms:
         - include: postgresql.databases.size_bytes
           action: update
           operations:
             - action: add_label
               new_label: new_label
               new_value: "value 1"

   - Rename an existing label:
     metricstransform/rename_label:
       transforms:
         - include: postgresql.databases.size_bytes
           action: update
           operations:
             - action: update_label
               label: database
               new_label: db

   - Rename multiple labels using a regular expression:
     metricstransform/rename_label_multiple:
       transforms:
         - include: ^postgresql\.databases\.(.*)$$
           match_type: regexp
           action: update
           operations:
             - action: update_label
               label: database
               new_label: db

   - Rename a specific label value:
     metricstransform/rename_label_value:
       transforms:
         - include: postgresql.databases.size_bytes
           action: update
           operations:
             - action: update_label
               label: database
               value_actions:
                 - value: postgres
                   new_value: default

   - Delete data points that have a certain label value:
     metricstransform/delete_by_label_value:
       transforms:
         - include: postgresql.databases.size_bytes
           action: update
           operations:
             - action: delete_label_value
               label: database
               # Specifies the label value whose data points will be removed
               label_value: db11

   - Convert data type from double to int and vice versa:
     metricstransform/convert_data_type:
       transforms:
         - include: postgresql.databases.size_bytes
           action: update
           operations:
             - action: toggle_scalar_data_type

   - Aggregate data points across the labels that are not listed in label_set:
     metricstransform/aggregate_labels:
       transforms:
         - include: postgresql.activity.connections
           action: update
           operations:
             - action: aggregate_labels
               # Contains a list of labels that will remain after aggregation
               label_set:
                 #- database
                 - user
                 #- backend_process_state
               # Defines how combined data points will be aggregated
               # Possible values: sum, mean, min, max, count, median
               aggregation_type: sum

     Note that only the sum aggregation function is supported for histogram and exponential histogram data types.
   - Aggregate data points that have a specific label value:
     metricstransform/aggregate_label_values:
       transforms:
         - include: postgresql.activity.connections
           action: update
           operations:
             - action: aggregate_label_values
               label: database
               # Contains a list of label values that will be aggregated
               aggregated_values: [ db11, db12, db13 ]
               new_value: all_db1
               aggregation_type: sum

     Note that only the sum aggregation function is supported for histogram and exponential histogram data types.
   - Combine several related metrics into one. For example, combine multiple tuples_* metrics into tuples_total with a type label:

     metricstransform/combine_metrics:
       transforms:
         - include: ^postgresql\.databases\.tuples_(?P<type>.*)$$
           match_type: regexp
           action: combine
           new_name: postgresql.databases.tuples_total
           submatch_case: lower
     The submatch_case parameter controls the case of label values extracted from regular expression submatches during combine operations. Leave it empty to preserve the original case. Possible values: lower or upper.

   - Group metrics from a single resource and report them as multiple resource metrics:
     metricstransform/group_metrics:
       transforms:
         - include: ^postgresql.databases.size_bytes$$
           match_type: regexp
           action: group
           group_resource_labels: {"resource.type": "default", "source": "default"}
         - include: ^postgresql.databases.orphaned_files_size_bytes$$
           match_type: regexp
           action: group
           group_resource_labels: {"resource.type": "orphaned", "source": "orphaned"}

   - Create a new metric filtered by a label value (experimental feature):
     metricstransform/experimental_create_by_label:
       transforms:
         - include: postgresql.databases.size_bytes
           match_type: regexp
           experimental_match_labels: {"database": "db1.*"}
           action: insert
           new_name: postgresql.db1.size
     If experimental_match_labels is specified, transformations apply only to metrics that match the specified label values. This works with both strict and regexp match_type.

   - Scale metric values from bytes to bits (experimental feature):
     metricstransform/experimental_scale_value:
       transforms:
         - include: postgresql.databases.size_bytes
           match_type: strict
           action: insert
           new_name: postgresql.databases.size_bits
           operations:
             - action: experimental_scale_value
               experimental_scale: 8
     The experimental_scale parameter defines the scalar to apply to metric values. Note that scaling exponential histograms inherently involves some loss of accuracy.
3. Configure exporters and the service pipeline in accordance with the chosen sections:
   exporters:
     prometheus:
       endpoint: :8889
       send_timestamps: true
       # translation_strategy: UnderscoreEscapingWithoutSuffixes
       # If true, all the resource attributes will be converted to metric labels by default
       # resource_to_telemetry_conversion:
       #   enabled: true
   service:
     pipelines:
       metrics:
         receivers:
           - postgrespro
         processors:
           - metricstransform/rename
           # - metricstransform/rename_by_regexp
           # - metricstransform/create_new
           # - metricstransform/add_label
           # - metricstransform/rename_label
           # - metricstransform/rename_label_multiple
           # - metricstransform/rename_label_value
           # - metricstransform/delete_by_label_value
           # - metricstransform/convert_data_type
           # - metricstransform/aggregate_labels
           # - metricstransform/aggregate_label_values
           # - metricstransform/combine_metrics
           # - metricstransform/group_metrics
           # - metricstransform/experimental_create_by_label
           # - metricstransform/experimental_scale_value
         exporters:
           - prometheus

4. Start pgpro-otel-collector with both the main configuration file and metrics_transform.yml:
   build/pgpro-otel-collector/pgpro-otel-collector --config configs/basic.yml --config configs/metrics_transform.yml
For a complete list of configuration parameters, refer to the OpenTelemetry documentation.
6.2.1.5. Adding and Configuring the prometheus Exporter #
The prometheus exporter is an open-source component of the OpenTelemetry Collector. For detailed information, refer to the OpenTelemetry documentation.
The prometheus exporter is the easiest one to use: it requires no external components and works with its default settings. To set it up, it is sufficient to specify the address to listen on for incoming requests:
exporters:
  prometheus:
    endpoint: "1.2.3.4:8889"
    send_timestamps: true
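On the Prometheus side, point the server at this endpoint. A minimal scrape configuration sketch for prometheus.yml (the job name is illustrative):

scrape_configs:
  - job_name: pgpro-otel-collector # illustrative job name
    scrape_interval: 60s # match the collector's collection_interval
    static_configs:
      - targets: ["1.2.3.4:8889"]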
6.2.1.6. Adding and Configuring the otlphttp Exporter #
The otlphttp exporter is an open-source component of the OpenTelemetry Collector and is used for exporting collected telemetry to an OTLP-compatible storage or monitoring system, which must be pre-deployed and accessible. For more details, refer to the OpenTelemetry documentation.
To configure the otlphttp exporter, it is sufficient to specify the address of the target system where data should be sent:
exporters:
  otlphttp:
    endpoint: https://otlp.example.org
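Beyond the endpoint, the exporter accepts further OTLP/HTTP options, some of which appear in the sqlquery example earlier in this section. A sketch with a few common optional settings (the values are illustrative):

exporters:
  otlphttp:
    endpoint: https://otlp.example.org
    compression: gzip # set to '' to disable compression
    timeout: 30s
    headers:
      X-Ppem-Source-Agent-Name: local # custom headers, as in the sqlquery example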
6.2.1.7. Adding and Configuring the kafka Exporter #
The kafka exporter is an open-source component of the OpenTelemetry Collector for sending metrics and logs to Apache Kafka. For more details, refer to the OpenTelemetry documentation.
Below is an example of setting it up for sending metrics.
receivers:
  postgrespro:
    max_threads: 3
    collection_interval: 60s
    initial_delay: 1s
    transport: tcp
    endpoint: localhost:5432
    database: postgres
    username: postgres
    password: ${env:POSTGRESQL_PASSWORD}
    metrics: null
    plugins:
      activity:
        enabled: true
      bgwriter:
        enabled: true
      locks:
        enabled: true
      version:
        enabled: true
      wal:
        enabled: true
      cache:
        enabled: true
exporters:
  kafka:
    brokers:
      - localhost:9092
    protocol_version: 2.1.0
    client_id: pgpro-otel-collector
    metrics:
      topic: otlp_metrics
      encoding: otlp_json # proto supported
      include_metadata_keys:
        - service.name
        - service.instance.id
    tls:
      insecure: true
    timeout: 30s
    producer:
      max_message_bytes: 1000000
      required_acks: 1
      compression: none # gzip, snappy, lz4, and zstd
processors:
  batch/kafka:
    send_batch_size: 1024
    timeout: 1s
  resource:
    attributes:
      - key: service.name
        action: upsert
        value: postgresql
      - key: service.instance.id
        action: upsert
        value: address-of-postgres-instance:5432
service:
  pipelines:
    metrics/kafka:
      receivers: [ postgrespro ]
      processors: [ batch/kafka, resource ]
      exporters: [ kafka ]
6.2.1.8. Setting up a Pipeline #
Once receivers and exporters are added and configured, they need to be combined into a pipeline. The pipeline is configured in the service section, and its contents depend entirely on the previously added components (there is no default configuration).
The example below shows how to set up a pipeline for metric management: the data is collected by the postgrespro and hostmetrics receivers, processed by the batch processor, and exported by the prometheus and otlphttp exporters. All the components used in the pipeline must therefore also be declared and configured in the configuration file.
service:
  extensions: []
  pipelines:
    metrics:
      receivers:
        - postgrespro
        - hostmetrics
      processors:
        - batch
      exporters:
        - prometheus
        - otlphttp
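For reference, the overall shape of a configuration file backing this pipeline could look as follows (component settings abridged with '...'; see the previous sections for complete examples):

receivers:
  postgrespro:
    endpoint: localhost:5432
    ... # connection parameters and plugins, see Section 6.2.1.1
  hostmetrics:
    scrapers:
      cpu:
processors:
  batch:
exporters:
  prometheus:
    endpoint: :8889
  otlphttp:
    endpoint: https://otlp.example.org
service:
  pipelines:
    metrics:
      receivers: [ postgrespro, hostmetrics ]
      processors: [ batch ]
      exporters: [ prometheus, otlphttp ]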
6.2.2. Use Cases #
6.2.2.1. Collecting Metrics from a BiHA Cluster #
pgpro-otel-collector can be configured to collect metrics from BiHA clusters using the biha plugin. The example procedure for such setup is as follows:
1. Create a configuration file called biha.yml with the following content:

   receivers:
     postgrespro/biha:
       transport: tcp
       endpoint: &endpoint localhost:5432
       database: biha_db
       username: biha_replication_user
       password: ${env:POSTGRESQL_PASSWORD}
       collection_interval: 60s
       initial_delay: 1s
       max_threads: 3
       plugins:
         biha:
           enabled: true
           cluster_name: cluster_name
   exporters:
     prometheus/biha:
       endpoint: :8889
       send_timestamps: true
       # Defines how long metrics remain exposed without updates.
       # Since some BiHA metrics include 'biha_state' in their labels,
       # it is best to set this parameter equal to the collection_interval,
       # so that outdated states do not appear on the Prometheus page.
       metric_expiration: 60s
   processors:
     batch/metrics:
       send_batch_size: 8192
       timeout: 10s
     memory_limiter/metrics:
       check_interval: 1s
       limit_mib: 2048
   service:
     pipelines:
       metrics/biha:
         receivers:
           - postgrespro/biha
         processors:
           - memory_limiter/metrics
           - batch/metrics
         exporters:
           - prometheus/biha

2. Start pgpro-otel-collector with the created configuration file:

   build/pgpro-otel-collector/pgpro-otel-collector --config configs/biha.yml
To check which metrics are included in the biha plugin, refer to Section 8.18.
Note
Plugins with special access privileges must be configured separately.
If you need to include plugins that require special access privileges, add all of them in a separate postgrespro section with their own usernames. For example:
receivers:
  postgrespro/biha:
    username: biha_replication_user
    ...
    plugins:
      biha:
        enabled: true
        ...
  postgrespro/otel:
    username: otel
    ...
    plugins:
      databases:
        enabled: true
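Both named receivers can then feed the same metrics pipeline. A sketch, reusing the processor and exporter names from the biha.yml example above:

service:
  pipelines:
    metrics/biha:
      receivers:
        - postgrespro/biha
        - postgrespro/otel
      processors:
        - memory_limiter/metrics
        - batch/metrics
      exporters:
        - prometheus/biha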
For more details on the access privileges, refer to Section 3.1.
Note
By default, the postgres database is not copied to a node in the referee or referee_with_wal mode (unless the --referee-with-postgres-db option is used).
When collecting database-level, table-level, index-level, or function-level statistics from these nodes, you must explicitly exclude the postgres database using the acl.databases.deny section, as shown below. This exclusion is not required when collecting per-instance statistics. For more information about allowlists and denylists, refer to Section 6.6.5.
acl:
  databases:
    deny:
      - name: postgres