9.3. The prometheus Exporter Issues
The prometheus exporter is not designed to handle and transmit large volumes of metrics.
Consider a scenario with a Postgres Pro database containing 10,000 tables and 10,000 indexes, using the following extended plugin set:
- hostmetrics: cpu (utilization), disk, filesystem, load, memory, network, paging, processes
- postgrespro: activity, archiver, bgwriter, bloat_indexes, bloat_tables, cache, databases, functions, indexes, io, locks, replication, replication_slots, tables, tablespaces, version, wal
The expected load in this scenario looks as follows:
- collector RAM usage: at least 3 GiB
- time to fully load the metrics page: 8-10 seconds
- CPU load: 30-50% in conducted tests (1 core)
If the server has less than 3 GiB of RAM available, the collector may be terminated by the OOM killer in preference to other processes.
The collector generates over 390,000 metric records in this configuration.
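To check how close your own deployment is to these figures, you can time a scrape and count the exposed metric lines. A minimal sketch, assuming the prometheus exporter listens on 127.0.0.1:8889 (the address that also appears in the log example below):

```shell
# Total time to fetch the metrics page, in seconds
curl -s -o /dev/null -w '%{time_total}\n' http://127.0.0.1:8889/metrics

# Number of exposed metric lines, excluding "# HELP" / "# TYPE" comment lines
curl -s http://127.0.0.1:8889/metrics | grep -vc '^#'
```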
Use the table below to estimate the number of metrics.
| Plugin Name | Number of Metrics Generated per Object |
|---|---|
| tables | 31 per table |
| indexes | 6 per index (+1 if invalid) |
| bloat_tables | 1 per table |
| bloat_indexes | 1 per index |
Thus, for 100,000 tables and 100,000 indexes, the plugins listed above would produce at least (31 + 1) × 100,000 + (6 + 1) × 100,000 = 3,900,000 metrics.
When transmitting hundreds of thousands of metrics through the prometheus exporter using the pull model, you may encounter the following error in the pgpro-otel-collector logs:
```json
{
  "level": "error",
  "ts": "2025-09-05T17:40:25.575+0300",
  "msg": "error encoding and sending metric family: write tcp 127.0.0.1:8889->127.0.0.1:44930: write: broken pipe\n",
  "resource": {
    "service.instance.id": "62cc1e9c-a53f-423e-9c6f-41b1f6a0872a",
    "service.name": "pgpro-otel-collector",
    "service.version": "v0.4.0"
  },
  "otelcol.component.id": "prometheus",
  "otelcol.component.kind": "exporter",
  "otelcol.signal": "metrics"
}
```
This error indicates that prometheus cannot pull the metrics within its scrape timeout. Use the following workarounds to fix the problem:
Increase Timeout
In the prometheus configuration, specify a larger timeout than the default value:

```yaml
global:
  scrape_interval: 15s  # Default = 1m
  scrape_timeout: 15s   # Increase the timeout globally or in a specific scrape_config (default = 10s)
```
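The timeout can also be raised only for the collector's scrape job rather than globally. A minimal sketch, assuming the exporter is scraped at 127.0.0.1:8889; the job name is an assumption:

```yaml
scrape_configs:
  - job_name: pgpro-otel-collector  # assumed job name
    scrape_interval: 1m
    scrape_timeout: 50s             # must not exceed scrape_interval
    static_configs:
      - targets: ["127.0.0.1:8889"]
```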
Reduce Metrics Volume
To reduce the overall volume of transmitted metrics, configure collection from specific objects:
```yaml
receivers:
  postgrespro:
    plugins:
      tables:
        enabled: true
        databases:
          - name: database_name
            schemas:
              - name: schema_name
                tables:
                  - name: table_name
      indexes:
        enabled: true
        databases:
          - name: database_name
            schemas:
              - name: schema_name
                tables:
                  - name: table_name
                    indexes:
                      - name: index_name
      bloat_tables:
        enabled: true
        fetcher:
          batch_size: 10000
        collection_interval: 5m
        databases:
          - name: database_name
            schemas:
              - name: schema_name
                tables:
                  - name: table_name
      bloat_indexes:
        enabled: true
        fetcher:
          batch_size: 10000
        collection_interval: 5m
        databases:
          - name: database_name
            schemas:
              - name: schema_name
                tables:
                  - name: table_name
                    indexes:
                      - name: index_name
```

Use Denylists
If the previous method requires specifying too many objects, use a denylist to exclude specific objects instead. For implementation examples, refer to Section 6.6.5.
Increase Resources
If all collected metrics are required, allocate more CPU resources to pgpro-otel-collector, for example, when the /metrics page loads too slowly due to insufficient server resources.
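A minimal sketch of granting the collector more CPU and memory when it runs as a systemd service; the unit name and limit values are assumptions and should be adapted to your deployment:

```ini
# /etc/systemd/system/pgpro-otel-collector.service.d/override.conf (assumed unit name)
[Service]
# Allow up to two full CPU cores
CPUQuota=200%
# Headroom above the ~3 GiB RAM usage observed above
MemoryMax=4G
```

After adding the drop-in, run systemctl daemon-reload and restart the collector service for the limits to take effect.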