1.2. Architecture #

1.2.1. Description of Components #

The Postgres Pro Enterprise Manager architecture comprises the components below.

Web Application #

Web application provides a graphical user interface accessible through a browser. The web application is installed on the same server as the manager. Users perform various actions using the browser, and the web application transforms these actions into requests and sends them to the manager. The manager processes requests and returns them to the web application. The web application transforms the response into various representations and then displays it in the browser.

Manager #

PPEM manager (hereinafter referred to as the manager) is a service that runs in the background. It accepts requests for DBMS infrastructure maintenance and has its own scheduler for performing regular service tasks. The manager also accepts requests from the web application and transforms them into operations according to the designed logic.

To perform operations, you need the following:

  • various service data, which is usually (but not necessarily) stored in the repository

  • executing instructions on the agent side

The manager reports the operation results to the web application upon completion.

Note

The operations performed by the manager can be synchronous or asynchronous. In case of synchronous operations, the manager is forced to wait until the operation completes to obtain the response. In case of asynchronous operations, the manager immediately receives a response that the operation has been queued for execution. In this case, the manager needs to periodically check the operation status, but, in most cases, the user receives a notification upon the operation completion.

Repository #

PPEM repository (hereinafter referred to as the repository) is a database on a dedicated DBMS instance with a set of schemas and tables where the manager stores service data necessary for operation. When the manager starts, it establishes a connection with the repository and reads and writes data from and to it during operations. Repository availability is critical, since the manager cannot continue operating if the repository is unavailable.

Agents #

PPEM agents (hereinafter referred to as agents) are services that run on managed DBMS servers. They receive control instructions from the manager and, depending on the instruction type, perform various actions both in the operating system and in the DBMS instance, for example:

  • running commands

  • creating directories and files

  • executing queries

During or after execution, the execution result is sent to the manager to be recorded in the repository and/or subsequently processed according to the operation logic.

Agents also have an internal service task scheduler that regularly performs background tasks and sends their results to the manager.

One agent is enough on a standalone server to serve one or more DBMS instances.

DBMS Instance #

DBMS instance is a PPEM-managed object, i.e., the Postgres Pro DBMS in various editions (Postgres Pro Standard, Postgres Pro Enterprise). Several DBMS instances can be combined into clusters, typically streaming replication clusters.

Only a single agent interacts with a single DBMS instance. Avoid scenarios where multiple agents interact with the same DBMS instance, as this can lead to confusion and ambiguous behavior. From the DBMS instance standpoint, the agent is a regular application software that connects to the instance via the SQL interface using a predetermined DBMS account.

External Services #

PPEM can use various external services to extend its features and capabilities. All these services are optional, and interaction with them is configured separately.

Note

PPEM does not include tools for administrative management of external services (e.g., resource and configuration control).

The following external services are supported:

  • LDAP directory is a directory of users and groups for authenticating users in PPEM. PPEM supports OpenLDAP and Microsoft Active Directory.

  • S3-compatible object storage is used by PPEM to store backups made with the pg_probackup utility.

  • OTLP-compatible metrics storage system is a metrics storage system that supports the push model with the ability to record metrics via the OTLP protocol, for example, Victoriametrics, and/or the pull model capable of fetching metrics via the HTTP protocol, for example, Prometheus. PPEM can use such a system to receive metrics that are written there by the pgpro-otel-collector. PPEM supports both Victoriametrics and Prometheus.

  • OTLP-compatible log storage system is a metrics storage system capable of receiving logs via the OTLP protocol. PPEM can use such a system to receive DBMS logs that are written there by pgpro-otel-collector. PPEM supports Elasticsearch.

1.2.2. High-Level Architecture #

The diagram below illustrates an example of a high-level architecture and interaction between components.

Figure 1.1. PPEM architecture


Where:

  • PPEM is a DBMS infrastructure management service comprising web application, manager, repository, and agents.

  • A dedicated DBMS instance is a PPEM-managed object that represents the PostgreSQL, Postgres Pro Standard, or Postgres Pro Enterprise DBMS.

  • The replication cluster is several DBMS instances combined into a cluster, typically a streaming replication cluster.

  • The user and group identification service is a user and group directory that can be used to authenticate users in PPEM.

  • The S3-compatible object storage is a system for storing backups made with pg_probackup.

  • The OTLP-compatible monitoring system is a monitoring system capable of receiving metrics via the OTLP protocol or fetching them via HTTP. PPEM can use this system to receive DBMS performance metrics.

The user and group identification service, S3-compatible object storage, and OTLP-compatible monitoring system are optional external services.

1.2.3. Manager and Agent Architecture #

The diagram below illustrates an example of an architecture and interaction between the manager and agent.

Figure 1.2. Manager and Agent Architecture


Where:

  • Users can work with PPEM both through the web application in a browser and through automation tools (IaC), which can interact with PPEM via the REST API.

  • In PPEM, the main graphical user interface is the web application. The web application is closely associated with the manager, from which it receives data and sends user control instructions. The manager provides an API for obtaining data and managing the DBMS infrastructure. The manager stores the intermediate state of the infrastructure in the repository database and interacts with the agent, which manages the DBMS instance.

  • The operating system is the working environment in which the DBMS instance and the agent are started. The agent interacts with the manager: sends environment data (OS and DBMS instance information) and receives control instructions.

1.2.4. Monitoring Architecture #

To provide monitoring functions related to metrics and logs (hereinafter telemetry), PPEM uses pgpro-otel-collector from Postgres Pro.

Working with telemetry can be organized in two ways:

  • The repository databases are used to store telemetry: metrics are stored in the separate monitoring schema, and logs are stored in the logs schema.

  • External data storages (in relation to PPEM) can be used to store telemetry.

In both cases pgpro-otel-collector is the required component. The collector gathers statistics and logs, generates telemetry data, and, depending on the selected mode, exports data to the manager or to an external storage system.

Note

Using the internal storage is not recommended in production environments, as it has limited scalability and potential performance issues if data volumes grow. Writing and reading large amounts of telemetry can negatively impact PPEM overall performance when accessing the repository database.

1.2.4.1. Internal Storage #

The diagram below illustrates an example of a monitoring architecture that uses an internal storage.

Figure 1.3. Monitoring Using Internal Storage


Where:

  • Users work with PPEM through the web application in a browser where they view graphs of metrics and logs.

  • In PPEM, metrics and logs are stored in the repository. The manager receives metrics and logs from the collector and stores them in repository databases. It also requests metrics and logs from the repository databases and returns them to the consumer via the REST API. The web application queries the manager for metrics and logs and then visualizes the received data.

  • In the operating system, pgpro-otel-collector connects to the DBMS instance to receive metrics, reads the DBMS log files, processes the received data, and sends it to the manager via the REST API. The agent is a standalone component that provides control functionality and does not collect metrics or logs.

1.2.4.2. External Storage #

The diagram below illustrates an example of a monitoring architecture that uses an external storage.

Figure 1.4. Monitoring Using External Storage


Where:

  • Users work with PPEM through the web application in a browser where they view graphs of metrics and logs.

  • In PPEM, the manager requests metrics and logs from external storages and returns them to the consumer via the REST API. The web application queries the manager for metrics and logs and then visualizes the received data.

  • In the operating system, pgpro-otel-collector connects to the DBMS instance to receive metrics, reads the DBMS log files, processes the received data, and sends it to external storages via the REST API. The agent is a standalone component that provides control functionality and does not collect metrics or logs.

  • External storages for metrics and logs:

    • A dedicated monitoring system can act as a metrics storage. Prometheus, Victoriametrics, and OTLP-compatible storages are supported.

    • A separate system with OTLP-compatible storage can serve as a log storage. Elasticsearch is supported.

1.2.5. Backup Architecture #

The diagram below illustrates an example of a backup architecture.

Figure 1.5. Backup Architecture


Where:

  • Users utilize backup and restore functionality via the web application.

  • In PPEM, the web application sends requests to the manager. The manager sends backup and restore requests to agents.

  • In the operating system, agents perform backup or restoration using the pg_probackup utility. A backup can be saved either to a local directory or to an S3-compatible object storage. The backup directory must be preconfigured to work with pg_probackup. In case of recovery, the backup can be obtained from either the local directory or from the object storage.

  • The S3-compatible object storage is an external service. Specific object storages are supported by means of pg_probackup, the support does not depend on PPEM.

1.2.6. Instance Data Collection and Post-Processing #

This section explains how agents collect instance data and send it to the manager for post-processing.

1.2.6.1. Instance Data Collection #

Instance data collection by agents includes the following steps:

  1. Connecting to the default instance database.

  2. Collecting global objects:

    • roles

    • tablespaces

    • databases

  3. Collecting local objects for each database:

    • extensions

    • schemas

    • tables

    • indexes

    • sequences

    • functions

    • languages

  4. Sending the collected objects to the manager.

  5. Sending the DELETE /instances/objects request to the manager with the specified instance_id attribute and data collection time to delete obsolete objects, for example, deleted objects.

  6. Sending the POST /instances/objects/post_processing request to the manager to start post-processing.

    If the manager already performs post-processing, for example, if it was not finished after the previous data collection cycle, 429 Too Many Requests is returned. In this case, the agent finishes the current data collection cycle and postpones post-processing to the next cycle.

Important Considerations #
  • Every instance can potentially contain hundreds of thousands of objects, so agents collect data in batches.

    You can specify the size of batches in the ppem-agent.yml agent configuration file using the collectors.instance_objects.batch_size: number_of_objects_in_each_batch; parameter.

  • Agents send extended requests for collecting the following table and index data:

    • For tables:

      • the table sizes per main, vm, and fsm layers, in bytes

      • the bloat size, in bytes

      • the TOAST size, in bytes

      • the number of tuples

      • the number of pages

      • the total index size, in bytes

      • the storage parameters

      • the table file path

    • For indexes:

      • the index size, in bytes

      • the bloat size, in bytes

      • the storage parameters

      • the index file path

    Extended requests can be resource-intensive and are sent once per 5 data collection cycles to avoid overloading the instance. You can specify how often extended requests are sent in the ppem-agent.yml agent configuration file using the following parameters:

    • collectors.instance_objects.extended.enabled: Specifies whether extended requests are sent.

      Possible values:

      • true

      • false

    • collectors.instance_objects.extended.interval: The time interval for sending extended requests.

      Alternatively, you can specify this time interval in the crontab format using the collectors.instance_objects.extended.schedule parameter. This parameter takes precedence over collectors.instance_objects.extended.interval.

  • For composite objects, i.e., databases, schemas, and tables, agents collect information on all their dependencies. Based on this information, the manager generates summary information after the data collection cycle is completed.

  • Agents automatically reconnect to the Postgres Pro DBMS instance once per 10 requests to avoid bloating the cache of the corresponding backend.

1.2.6.2. Post-Processing #

Post-processing by the manager includes the following steps:

  1. Updating the size_bytes field of the collected objects:

    • For tables, the value is calculated as:

      relation_size + visibility_map_size + free_space_map_size + TOAST_size

    • For indexes, the value equals relation_size.

    • For schemas and databases, the value is the total size of all their dependencies.

    Note

    The size of the pg_toast schema's tables is not considered since it is included in the size of database tables.

  2. Re-generating the following summary information for all composite objects:

    • For databases:

      • the total sum of table sizes, in bytes

      • the total sum of index sizes, in bytes

      • the number of tables

      • the number of indexes

      • the total bloat, in bytes

      • available_xid_total and available_xid_percent

    • For schemas:

      • the number of tables

      • the number of indexes

      • the table size and bloat

      • the index size and bloat

    • For tables:

      • the number of indexes

      • the total sum of index sizes

      • the total sum of bloat sizes

    The summary information is stored in the instance_objects table as a JSONB column. The structure of the summary information depends on the composite object type:

    database:
    tables_count: INT
    tables_all_size_bytes: BIGINT
    tables_all_bloat_size_bytes: BIGINT
    indexes_count: INT
    indexes_all_size_bytes: BIGINT
    indexes_all_bloat_size_bytes: BIGINT
    available_xid_total: BIGINT
    available_xid_percent: BIGINT
    
    schema:
    tables_count: INT
    tables_all_size_bytes: BIGINT
    tables_all_bloat_size_bytes: BIGINT
    indexes_count: INT
    indexes_all_size_bytes: BIGINT
    indexes_all_bloat_size_bytes: BIGINT
    
    table:
    indexes_count: INT
    indexes_all_size_bytes: BIGINT
    indexes_all_bloat_size_bytes: BIGINT