F.70. proxima — a combination of a proxy and a connection pooler #
proxima is a Postgres Pro Enterprise extension that combines the functionality of a proxy server and a connection pooler.
F.70.1. Overview #
The proxima extension can be used on a single-node Postgres Pro Enterprise server, in a standard Postgres Pro Enterprise primary-standby cluster or a BiHA cluster. The extension offers the following features:
Proxy server: proxima becomes a single connection point for client connections and redirects queries to the primary server or the BiHA cluster leader.
Workload distribution: proxima can distribute read-only workloads between cluster replicas.
Connection pooler: proxima manages backend processes to lower consumption of system resources and prevent performance degradation.
SSL support: you can configure proxima to operate with SSL connections.
Monitoring: when using proxima with BiHA, you can view a variety of metrics to track the proxima state.
Dynamic dedicated session: proxima maintains a session between the client and the backend for the entire lifetime of the client connection.
F.70.1.1. Authentication #
The proxima extension uses the same authentication rules that are listed in the pg_hba.conf file. The following authentication methods are supported:
F.70.1.2. Operation in a BiHA Cluster #
The proxima extension offers a special operation mode for a BiHA cluster. When installed in the BiHA cluster, proxima automatically receives all required cluster configuration parameters from biha, such as:
number of cluster nodes
node IDs, addresses, and ports
node roles (leader or follower)
The proxima extension does not operate on the referee node.
You can either install proxima in the existing BiHA cluster, or enable it when setting up the cluster from scratch.
When used in the BiHA cluster, proxima registers a set of callbacks to be informed about events in the cluster. For example, when the cluster leader changes, proxima receives notification about this event and automatically redirects traffic to the new leader.
F.70.1.3. Workload Distribution #
The proxima extension provides the following ports to process client sessions:
The Proxy-to-Leader (P2L) port redirects all workloads to the primary (leader) node. The default is 4545, which can be changed using the proxima.port parameter.
The Proxy-to-Follower (P2F) port distributes read-only workloads across cluster replicas. The default is 4547, which can be changed using the proxima.p2f_port parameter.
Note
In a BiHA cluster, read-only workloads sent to the P2F port can also be redirected to the leader in the LEADER_RO state. For more information about BiHA cluster node states, see biha.status_v.
To distribute workloads across cluster nodes, configure your client application to send write workloads to the P2L port and read-only workloads to the P2F port. For more information, see Configuring Workload Distribution.
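For example, a client could open separate connections for the two workload types. This is a minimal sketch in which the node address 192.168.0.1 and the postgres database are illustrative:
psql "host=192.168.0.1 port=4545 dbname=postgres"    # write workloads via the P2L port
psql "host=192.168.0.1 port=4547 dbname=postgres"    # read-only workloads via the P2F port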
Important
proxima attaches a client session to a certain cluster node to ensure that all transactions within this particular session are executed on the same node. If this node fails during the session, the client connection is aborted.
F.70.1.4. Dynamic Dedicated Session #
When executing queries, a backend may keep temporary state within one session and create objects specific to this session, such as temporary tables, temporary views, and so on. This information and these objects are not available to other backends. If a connection pooler redirects a client to another backend after such an object is created, the client loses access to the object. To avoid this issue, proxima offers the dynamic dedicated session functionality, which retains the session between the client and a specific backend for the lifetime of the client connection.
Note
If you use extensions that keep their state within a session, you must manually enable the forced dynamic dedicated session. For more information, see Managing Dynamic Dedicated Sessions Manually.
The proxima extension establishes the dynamic dedicated session automatically if a query contains any of the following objects or SQL functions:
The session remains dedicated until the objects mentioned above are deleted.
The DISCARD command deletes session objects. After executing DISCARD ALL, a backend exits both the dynamic dedicated session and the forced dynamic dedicated session.
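As an illustration, the following psql sketch assumes that creating a temporary table is one of the triggers described above; the proxima.is_dedicated parameter used here is described in Managing Dynamic Dedicated Sessions Manually:
CREATE TEMPORARY TABLE tmp_orders (id int);   -- a session-local object; proxima dedicates the session
SHOW proxima.is_dedicated;                    -- expected to return t while the session is dedicated
DISCARD ALL;                                  -- deletes session objects
SHOW proxima.is_dedicated;                    -- expected to return f after the session objects are gone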
F.70.1.5. Monitoring #
The proxima extension provides the monitoring functionality with a range of metrics to track the proxima state. Monitoring is currently available for BiHA clusters with proxima enabled.
proxima registers the proxima schema in the biha_db database and creates special views there that are used to query metrics. For more information on viewing metrics and an example of an SQL query, see Viewing Monitoring Metrics.
F.70.1.5.1. Metrics Transfer Across Cluster Nodes #
To make all cluster metrics available on all nodes, every node transmits its metrics to other nodes once per second. If metrics are not received from a node within ten seconds, for example, due to connection loss or node failure, a query will not return counter values from this node.
F.70.1.5.2. Considerations for Queries via P2L and P2F Ports #
Since metrics are queried using standard SQL, the metrics source is always the node that actually executes the query. When executing a query, consider the following:
When connecting via the P2L port, metrics are retrieved from the cluster leader.
When connecting via the P2F port, proxima may distribute consecutive metrics queries across different nodes. Due to metrics delivery latency, sequential queries may therefore show decreasing values for integral counters.
For more information about P2L and P2F ports, see Workload Distribution.
To maintain the monotonicity of integral counters, always take the maximum of the current and previously observed metric values when processing results.
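One way to apply this rule is sketched below: the previously observed values are kept in a client-side table (metrics_last_seen is a hypothetical name) and each new reading is clamped with GREATEST:
SELECT m.name,
       m.node_id,
       GREATEST(m.value, COALESCE(s.value, m.value)) AS monotonic_value
FROM proxima.proxima_metrics AS m
LEFT JOIN metrics_last_seen AS s
       ON s.name = m.name AND s.node_id = m.node_id;   -- refresh metrics_last_seen after each poll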
F.70.2. Considerations and Limitations #
This section provides a list of considerations and limitations that must be taken into account when using proxima.
The proxima extension does not manage replication between cluster nodes. To provide data consistency on all nodes, either use Built-in High Availability (BiHA), or configure replication manually.
IPv6 is not supported.
Elbrus, ppc64le, and s390x CPU architectures are not supported.
When used in the BiHA cluster, proxima registers a set of callbacks to be informed about events in the cluster. The callbacks are located in the biha.callbacks table of the biha_db database on BiHA cluster nodes. Do not delete these callbacks; otherwise, proxima cannot react to cluster events.
When used in a standard primary-standby cluster or a BiHA cluster, proxima may log error messages about connection or synchronization faults right after the launch. This behavior is normal, as nodes launch at different times. The error message logging stops shortly after the launch.
Currently, proxima has no mechanism to automatically establish a dynamic dedicated session when you use extensions that keep their state within a session. For such cases, you must manually enable the forced dynamic dedicated session.
To enable the connection pooler to process a large number of client connections, you must set the limit of open files higher than the target number of connections:
echo '* soft nofile {value}' | sudo tee -a /etc/security/limits.conf
echo '* hard nofile {value}' | sudo tee -a /etc/security/limits.conf
ulimit -n {value}
For example, if the target number of client connections is 10000, set {value} to 11000.
F.70.3. Managing proxima #
This section describes how to install, configure, and use proxima.
F.70.3.1. Installation and Configuration #
The proxima extension is a built-in extension included in Postgres Pro Enterprise. To enable and configure proxima, perform the following steps:
Stop your Postgres Pro Enterprise server using pg_ctl:
pg_ctl stop -D PGDATA_directory
Add proxima to shared_preload_libraries in postgresql.conf:
shared_preload_libraries = 'proxima'
If you have a multi-node cluster, repeat this step on every node.
Depending on your Postgres Pro Enterprise configuration, specify the proxima configuration parameters in postgresql.conf:
For a single-node server, set the following configuration parameter:
proxima.cluster_mode = 'standalone'
For a standard primary-standby cluster, set the following configuration parameters:
proxima.cluster_mode = 'guc'
proxima.cluster_config = 'node0_id,node0_address,node0_port,node0_role;node1_id,node1_address,node1_port,node1_role;'
proxima.cluster_node_id = node_id
For example:
proxima.cluster_mode = 'guc'
proxima.cluster_config = '0,192.168.0.57,4590,P;1,192.168.0.77,4590,S;'
proxima.cluster_node_id = 0
Ensure that the proxima.cluster_config values are the same on all nodes and proxima.cluster_node_id is unique for each node.
For a BiHA cluster, set the following configuration parameter:
proxima.cluster_mode = 'biha'
When using proxima with biha, specifying cluster configuration parameters is not required as proxima receives this information directly from biha.
(Optional) Set other configuration parameters as required.
If you do not specify configuration parameters, proxima uses the default values. For more information about configuration parameters and their default values, see Section F.70.5.
Start the node using pg_ctl:
pg_ctl start -D PGDATA_directory
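For reference, a minimal postgresql.conf excerpt for the BiHA mode described above could look as follows; the port values shown are simply the documented defaults:
shared_preload_libraries = 'proxima'
proxima.cluster_mode = 'biha'
proxima.port = 4545        # P2L port (default)
proxima.p2f_port = 4547    # P2F port (default)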
F.70.3.2. SSL Configuration #
F.70.3.2.1. Enabling SSL #
You can enable SSL both for incoming client connections and for internal communication between cluster nodes.
Stop the cluster nodes using pg_ctl.
Using the OpenSSL utility, generate a certificate and a private key:
openssl req -x509 -newkey rsa:4096 -keyout path_to_key -out path_to_certificate -sha256 -days period_of_validity -nodes -subj "/CN=certificate_domain"
For example:
openssl req -x509 -newkey rsa:4096 -keyout ./key.pem -out ./cert.pem -sha256 -days 256 -nodes -subj "/CN=localhost"
In postgresql.conf, specify paths to the generated certificate and private key:
proxima.ssl_key = '/path/to/key.pem'
proxima.ssl_cert = '/path/to/cert.pem'
If you do not specify the proxima.ssl_cert and proxima.ssl_key parameters, proxima uses a certificate and a key configured for Postgres Pro in ssl_cert_file and ssl_key_file.
Set the following configuration parameters:
To enable SSL for incoming connections, set proxima.ssl_enabled to on:
proxima.ssl_enabled = on
To enable SSL for communication between cluster nodes, set proxima.p2p_ssl_enabled to on:
proxima.p2p_ssl_enabled = on
Start the nodes using pg_ctl.
All client connections to the proxima port or internal connections between cluster nodes are now secured with SSL.
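Once SSL is enabled, clients can require an encrypted connection in the usual libpq way, for example (the node address is illustrative):
psql "host=192.168.0.1 port=4545 dbname=postgres sslmode=require"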
F.70.3.2.2. Enabling SSL Authentication for Internal Connections #
Configure SSL authentication for internal connections between cluster nodes. In this procedure, you will configure SSL authentication for a two-node cluster.
Stop the cluster nodes using pg_ctl.
Using the OpenSSL utility, generate two public certificate and private key pairs:
openssl req -x509 -newkey rsa:4096 -keyout path_to_key -out path_to_certificate -sha256 -days period_of_validity -nodes -subj "/CN=certificate_domain"
For example:
openssl req -x509 -newkey rsa:4096 -keyout ./key1.pem -out ./cert1.pem -sha256 -days 256 -nodes -subj "/CN=localhost"
and
openssl req -x509 -newkey rsa:4096 -keyout ./key2.pem -out ./cert2.pem -sha256 -days 256 -nodes -subj "/CN=localhost"
Create a directory and put the generated certificates there.
Execute the following command:
openssl rehash path_to_the_directory_with_certificates
Set the proxima.p2p_auth_methods and proxima.ssl_trusted_certs_dir configuration parameters as follows:
proxima.p2p_auth_methods = 'ssl'
proxima.ssl_trusted_certs_dir = path_to_the_directory_with_certificates
Start the nodes using pg_ctl.
Check the log to ensure that SSL authentication is successful.
The following log entries indicate that the SSL authentication method is used and that authentication has passed successfully:
[auth]: using SSL authentication
[auth]: authentication success
F.70.3.3. Managing Dynamic Dedicated Sessions Manually #
Some extensions keep their state within a session, which requires retaining the session between a client and a backend. If you use such extensions, you must manually enable the forced dynamic dedicated session for every connection by using the proxima.force_dedicated parameter.
You can manage forced dynamic dedicated sessions as follows:
To enable the forced dynamic dedicated session, execute the following command:
SET proxima.force_dedicated = true;
To disable the forced dynamic dedicated session, execute the following command:
SET proxima.force_dedicated = false;
To check whether the current session is dedicated, use proxima.is_dedicated:
SHOW proxima.is_dedicated;
The returned t value means that the session is dedicated, while f means that it is not dedicated.
F.70.3.4. Configuring Workload Distribution #
You can configure proxima to distribute workloads across cluster nodes. In the example below, you will configure workload distribution for a three-node cluster with the 192.168.0.1, 192.168.0.2, and 192.168.0.3 node addresses using the default 4545 (P2L) and 4547 (P2F) proxima ports.
Configure your client application to send write workloads to the P2L port and read-only workloads to the P2F port.
You can use the target_session_attrs parameter on the client side to specify the node type where the client must execute transactions and queries when connecting to the database.
For example, you can specify that queries must be executed on replicas only, so that within this connection the client executes read-only queries sent to the P2F port.
(Optional) Specify the distribution algorithm for read-only workloads using the proxima.load_balancer_algorithm parameter.
In the connection string, list the following node addresses in accordance with the format required by your library:
192.168.0.1:4545
192.168.0.1:4547
192.168.0.2:4545
192.168.0.2:4547
192.168.0.3:4545
192.168.0.3:4547
If you use libpq, see Section 35.1.1.3 for the requirements and syntax of specifying multiple addresses in a connection string.
As a result, when connected to port 4545 (P2L), write workloads are redirected to the primary (leader) node. When connected to port 4547 (P2F), read-only workloads are redirected to one of the replicas according to the selected algorithm.
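For example, with libpq the two workload types could use multi-host connection URIs similar to the following sketch; the database name is illustrative, and the available target_session_attrs values depend on your client library version:
# write workloads via the P2L ports
postgresql://192.168.0.1:4545,192.168.0.2:4545,192.168.0.3:4545/postgres?target_session_attrs=read-write
# read-only workloads via the P2F ports
postgresql://192.168.0.1:4547,192.168.0.2:4547,192.168.0.3:4547/postgres?target_session_attrs=any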
F.70.3.5. Viewing Monitoring Metrics #
You can view monitoring metrics in a BiHA cluster using standard SQL queries on metrics views:
To view all cluster metrics:
SELECT * FROM proxima.proxima_metrics;
To view all metrics of a particular node:
SELECT * FROM proxima.proxima_metrics WHERE node_id = node_id;
To view metrics of a particular class, execute a query on a corresponding class view. For example:
SELECT * FROM proxima.proxima_metrics_client;
For more information about the monitoring functionality, see Monitoring.
For more information about available metrics, see Monitoring Metrics.
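For example, to compare a single counter across all nodes, you can filter by the metric name; the counter shown here is taken from the client connection metrics below:
SELECT node_id, value
FROM proxima.proxima_metrics
WHERE name = 'client.lfe.now_connected'
ORDER BY node_id;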
See below for an example of a query that views all metrics of the node with node ID 0, together with its output (truncated, as there are too many metrics to show in full):
postgres=# SELECT * FROM proxima.proxima_metrics WHERE node_id = 0;
name | class | node_id | value
-------------------------------------------+--------------+---------+----------------------
thread.1.active_time_ns | thread-load | 0 | 12590006201
thread.1.purged_coroutines | thread-load | 0 | 6
thread.1.transferred_coroutines_accepted | thread-load | 0 | 2
thread.1.wakeup_requests_accepted | thread-load | 0 | 0
thread.1.futex_wakeup_requests_accepted | thread-load | 0 | 22752
thread.1.active_coroutines_called | thread-load | 0 | 72905
thread.1.evrun_once | thread-load | 0 | 38456
thread.1.evrun_nowait | thread-load | 0 | 8700
thread.1.scheduler.coroctx_in_use | thread-load | 0 | 19
thread.1.scheduler.coroctx_cached | thread-load | 0 | 6
thread.1.scheduler.cs_active | thread-load | 0 | 1
thread.1.scheduler.cs_inactive | thread-load | 0 | 1
thread.1.scheduler.cs_waiting_futex | thread-load | 0 | 5
thread.1.scheduler.cs_wakeup_futex | thread-load | 0 | 0
thread.1.scheduler.cs_waiting_io | thread-load | 0 | 12
memory.lgist-0 | memory | 0 | 0
memory.lgist-1 | memory | 0 | 8
memory.lgist-2 | memory | 0 | 15
memory.lgist-3 | memory | 0 | 162
memory.lgist-4 | memory | 0 | 737
memory.lgist-5 | memory | 0 | 1490
memory.lgist-6 | memory | 0 | 174
memory.lgist-7 | memory | 0 | 295
memory.lgist-8 | memory | 0 | 70
memory.lgist-9 | memory | 0 | 32
memory.lgist-10 | memory | 0 | 20
memory.lgist-11 | memory | 0 | 9
memory.lgist-12 | memory | 0 | 1
memory.lgist-13 | memory | 0 | 71
memory.lgist-14 | memory | 0 | 61
memory.lgist-15 | memory | 0 | 0
memory.lgist-16 | memory | 0 | 6
memory.lgist-17 | memory | 0 | 1
memory.lgist-18 | memory | 0 | 0
memory.lgist-19 | memory | 0 | 0
memory.lgist-20 | memory | 0 | 0
memory.lgist-21 | memory | 0 | 0
memory.lgist-22 | memory | 0 | 1
memory.lgist-23 | memory | 0 | 0
memory.lgist-24 | memory | 0 | 0
memory.usage | memory | 0 | 6581078
memory.traffic | memory | 0 | 7983836
client.lfe.now_connected | client | 0 | 1
client.lfe.accepted | client | 0 | 1
client.lfe.rejected | client | 0 | 0
client.lfe.disconnected | client | 0 | 0
client.lfe.auth_password | client | 0 | 0
client.lfe.auth_md5 | client | 0 | 0
client.lfe.auth_scram_sha256 | client | 0 | 0
client.lfe.auth_trust | client | 0 | 1
client.lfe.auth_tls_accepted | client | 0 | 0
client.lfe.auth_tls_rejected | client | 0 | 0
client.rfe.now_connected | client | 0 | 0
client.rfe.connected | client | 0 | 0
client.rfe.disconnected | client | 0 | 0
client.lbe.enter_dedicated | client | 0 | 0
F.70.3.6. Disabling and Enabling proxima #
You can temporarily disable the proxima extension and then re-enable it using the proxima.enabled configuration parameter:
In the postgresql.conf file, set the proxima.enabled parameter as required:
To disable proxima, set proxima.enabled to off:
proxima.enabled = off
To enable proxima, set proxima.enabled to on:
proxima.enabled = on
Send the SIGHUP signal to the proxima process:
kill -SIGHUP proxima_pid
Here proxima_pid is the process ID of the proxima process.
F.70.3.7. Removing proxima #
To stop using proxima:
On all nodes of the cluster, remove the extension from shared_preload_libraries in postgresql.conf.
Restart the nodes using pg_ctl.
F.70.4. Troubleshooting #
This section contains information about typical issues that you might encounter when using proxima and the ways you can fix them.
F.70.4.1. Database or User Not Found #
The proxima extension performs authentication on a node locally and does not redirect access requests to the primary server or the leader. If a database or a user you are trying to connect to is not found, do the following:
Ensure that replication is configured between cluster nodes by means of the standard replication mechanism or using BiHA.
If replication is configured and turned on, wait for some time as the data may not have reached the replica yet.
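For instance, on the primary (leader) node you can check that the standbys are connected using the standard replication view:
SELECT application_name, state, sync_state FROM pg_stat_replication;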
F.70.4.2. Failed to Connect to proxima #
If you cannot connect to proxima, do the following:
Ensure that you use proxima.port for connection.
Ensure that the pg_hba.conf file is configured correctly.
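As a reference, a typical pg_hba.conf entry that allows password-authenticated connections from a client subnet might look like this; the subnet shown is illustrative:
host    all    all    192.168.0.0/24    scram-sha-256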
F.70.4.3. Log Message: connection cannot be established #
If you see this message from proxima in the log, do the following:
Ensure that all nodes are on and there are no network issues.
If you use the guc cluster operation mode, ensure that the proxima.cluster_node_id values are set up correctly for each node and match the IDs specified in proxima.cluster_config.
F.70.4.4. Log Message: failed to create cluster manager #
If you use the guc cluster operation mode, ensure that proxima.cluster_config is configured correctly:
all required values are listed
a set of values for each node ends with a semicolon (;)
F.70.4.5. Log Message: postgres: cannot accept new connection, error: Too many open files #
Increase the maximum number of open files by means of the ulimit -S -n command.
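For example, using the value from Considerations and Limitations:
ulimit -S -n 11000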
F.70.5. Configuration Parameters #
F.70.5.1. Common Parameters #
proxima.address (text) #
Specifies the IP address of the network interface that accepts incoming connections. If set to 0.0.0.0, incoming connections are accepted through all network interfaces of the host. The default value is 0.0.0.0. The value of this parameter must be the same on all nodes of the cluster.
proxima.enabled (boolean) #
Temporarily enables and disables proxima. The available values are the following:
off: temporarily stops the proxima extension.
on: starts the proxima extension after it was temporarily stopped.
proxima.force_dedicated (boolean) #
Enables and disables forced dynamic dedicated sessions. The available values are the following:
true: enables the forced dynamic dedicated session.
false: disables the forced dynamic dedicated session.
For more information, see Managing Dynamic Dedicated Sessions Manually.
proxima.is_dedicated (boolean) #
Allows you to check whether the current session is dedicated. Returns one of the following values:
t: the session is dedicated.
f: the session is not dedicated.
For more information, see Managing Dynamic Dedicated Sessions Manually.
proxima.load_balancer_algorithm (text) #
Specifies the algorithm for distributing read-only workloads between cluster replicas. The following values are supported:
round-robin: workloads are redirected to each replica sequentially.
random: workloads are redirected to a random cluster replica.
The default value is round-robin.
proxima.port (integer) #
Specifies the Proxy-to-Leader (P2L) port — a TCP port where proxima accepts incoming client connections and transfers all queries to the primary (leader) node.
In a BiHA cluster, queries are only transferred to the leader in the LEADER_RW state. For more information about BiHA cluster node states, see biha.status_v.
The value can be set in the range from 0 to 65535. The default value is 4545. It is recommended to set the same proxima.port on all nodes of the cluster.
proxima.p2f_port (integer) #
Specifies the Proxy-to-Follower (P2F) port — a TCP port where proxima accepts incoming client connections and distributes read-only workloads between cluster replicas.
In a BiHA cluster, read-only workloads can also be transferred to the leader in the LEADER_RO state. For more information about BiHA cluster node states, see biha.status_v.
The value can be set in the range from 0 to 65535. The default value is 4547.
proxima.p2p_auth_methods (text) #
Specifies authentication methods for internal communication of cluster nodes. The following methods are supported: trust and ssl.
The default value is trust. You can list multiple values separated by a comma, for example:
proxima.p2p_auth_methods = 'trust,ssl'
If you list multiple methods, they are applied in descending order of priority, as follows: ssl, trust.
proxima.p2p_ssl_enabled (boolean) #
Enables or disables SSL for internal connections between cluster nodes. The default value is off.
proxima.ssl_enabled (boolean) #
Enables or disables SSL for incoming client connections on proxima.port. The default value is off. If you enable SSL without specifying proxima.ssl_key and proxima.ssl_cert, proxima uses the SSL certificate configured for Postgres Pro.
proxima.ssl_cert (text) #
Specifies the path to the certificate that is used for SSL connections via proxima.port. If not set, proxima uses the certificate specified in ssl_cert_file. If you set this parameter, you must also set proxima.ssl_key.
proxima.ssl_key (text) #
Specifies the path to the private key file. If not set, proxima uses the key specified in ssl_key_file. If you set this parameter, you must also set proxima.ssl_cert.
proxima.ssl_trusted_certs_dir (text) #
Specifies the path to a directory with trusted certificates when you configure SSL authentication with proxima.p2p_auth_methods.
proxima.log_level (enum) #
Sets the log level for proxima messages. The following values are available: error, warning, info, verbose, debug. The default value is info.
proxima.workers (integer) #
Sets the number of threads that are launched for processing queries by the proxima background process. The value can be set in the range from 1 to 65535. The higher the value, the more queries can be processed. The default value is 4.
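For illustration, a postgresql.conf excerpt that overrides several of the common parameters above might look as follows; the values are examples, not recommendations:
proxima.address = '0.0.0.0'
proxima.port = 4545
proxima.p2f_port = 4547
proxima.load_balancer_algorithm = 'random'
proxima.log_level = 'verbose'
proxima.workers = 8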
F.70.5.2. Cluster Configuration #
proxima.cluster_mode (text) #
Sets the cluster operation mode. The following modes are available:
standalone: mode for a single-node Postgres Pro database.
guc: mode for a standard Postgres Pro primary-standby cluster. The cluster configuration is set by the proxima.cluster_config and proxima.cluster_node_id parameters.
biha: mode for a BiHA cluster. In this mode, cluster configuration is received from biha.
The default value is standalone. The value of this parameter must be the same on all nodes of the cluster.
proxima.cluster_config (text) #
Sets cluster node configuration for the guc cluster operation mode. The value of this parameter must be the same on all nodes of the cluster. The parameters for connection and identification are specified for each node as follows:
'node_id,node_address,node_port,node_role'
where:
node_id is the ID of the node specified as a value from 0 to 255. The node ID must be unique for each node of the cluster.
node_address is the IP address of the node for internal connection between cluster nodes.
node_port is the node port for internal connection between cluster nodes. This port must be different from proxima.port.
node_role is the role of the node in the cluster. The available values are P (Primary) and S (Standby).
Here is an example of the proxima.cluster_config value for a three-node cluster:
'0,127.0.0.1,4090,P;1,127.0.0.1,4091,S;2,127.0.0.1,4092,S;'
proxima.cluster_node_id (integer) #
Sets the unique node ID for the guc cluster operation mode. This parameter must not be changed once the cluster has started and until it is stopped. The node ID must be unique for each node of the cluster. The value can be set in the range from 0 to 255. The default value is 0.
The list of node IDs must always start with 0 and end with n-1, where n is the total number of nodes. For example, for a three-node cluster, the node IDs must be 0, 1, 2.
F.70.5.3. Configuration of the Local Backend Pool #
The following parameters set limits for the pool of local backends — Postgres Pro processes responsible for processing client queries.
proxima.backend_pool_local_total_limit (integer) #
Sets the maximum number of backends that can be created on the node. The value can be set in the range from 1 to max_connections. The default value is 100.
proxima.backend_pool_local_bucket_limit (integer) #
Sets the maximum number of backends that can be created within connections identified by the user + database binding. The value can be set in the range from 1 to the value set in proxima.backend_pool_local_total_limit. The default value is 100.
proxima.backend_pool_local_overdue_interval (float) #
Sets the local backend idle time, in seconds. If a local backend is not used by clients for longer than the time set in this parameter, the backend process is stopped. The value can be set in the range from 1.0 to 86400.0. The default value is 10.0.
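For example, a node that should allow at most 200 pooled backends (assuming max_connections is at least 200), with no more than 50 backends per user + database pair and a 30-second idle timeout, could be configured as follows; the values are illustrative:
proxima.backend_pool_local_total_limit = 200
proxima.backend_pool_local_bucket_limit = 50
proxima.backend_pool_local_overdue_interval = 30.0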
F.70.5.4. Configuration of the Remote Backend Pool #
The following parameters set limits for the pool of remote backends — logical channels that can be established through a multiplexed connection between cluster nodes to proxy queries from a standby (follower) node to the primary (leader) node. Configuring these parameters is reasonable only for multi-node clusters. The default values are optimal, and it is not recommended to change them.
proxima.backend_pool_remote_total_limit (integer) #
Sets the maximum number of logical connection channels between cluster nodes to proxy client queries. The value can be set in the range from 1 to 2^32-1. The default value is 100000.
proxima.backend_pool_remote_bucket_limit (integer) #
Sets the maximum number of logical connection channels between cluster nodes to proxy client queries within connections identified by the user + database binding. The value can be set in the range from 1 to the value set in proxima.backend_pool_remote_total_limit. The default value is 1000.
proxima.backend_pool_remote_overdue_interval (float) #
Sets the remote backend idle time, in seconds. If a remote backend is not used by clients for longer than the time set in this parameter, the logical connection channel between cluster nodes is closed. The value can be set in the range from 1.0 to 86400.0. The default value is 60.0.
F.70.6. Monitoring Metrics #
All proxima monitoring metrics are divided into several classes for convenient querying via metrics views.
The following classes are available:
F.70.6.1. Views for Monitoring Metrics #
When used in a BiHA cluster, proxima registers the proxima schema in the biha_db database and creates views there that are used to query metrics.
The following views are available:
proxima.proxima_metrics displays all cluster metrics.
proxima.proxima_metrics_thread_load displays thread load metrics.
proxima.proxima_metrics_traffic displays traffic metrics.
proxima.proxima_metrics_backend_pool displays backend pool metrics.
proxima.proxima_metrics_client displays client connection metrics.
proxima.proxima_metrics_rpc displays RPC server metrics.
F.70.6.2. Thread Load Metrics #
Thread load metrics display CPU utilization by proxima threads, as well as metrics of the coroutine engine that proxima is based on. The number of counters in this class depends on the number of running threads.
Names of thread load metrics have the thread.ID.counter_name structure, where ID is the internal thread identifier of the coroutine engine and counter_name is the name of the counter.
The following counters are available:
Table F.52. Thread Load Counters
Name | Type | Description |
---|---|---|
thread.ID.active_time_ns | Integral | Number of nanoseconds during which the thread executed any useful workload. |
thread.ID.purged_coroutines | Integral | Number of coroutine contexts destructed due to prolonged inactivity. |
thread.ID.transferred_coroutines_accepted | Integral | Number of coroutine contexts transferred to this thread for execution. |
thread.ID.wakeup_requests_accepted | Integral | Number of requests for coroutine wakeup accepted. |
thread.ID.futex_wakeup_requests_accepted | Integral | Number of accepted wakeup requests for a coroutine blocked on futex. |
thread.ID.active_coroutines_called | Integral | Number of active coroutine calls. Doubling this value gives the number of coroutine context switches in the current thread. |
thread.ID.evrun_once | Integral | Number of calls to the libev library with thread locking to wait for events related to I/O or active timers. |
thread.ID.evrun_nowait | Integral | Number of calls to the libev library without thread locking to wait for events related to I/O or active timers. |
thread.ID.scheduler.coroctx_in_use | Actual value | Number of coroutine contexts currently in use. |
thread.ID.scheduler.coroctx_cached | Actual value | Number of coroutine contexts currently cached and unused, but available for quick allocation when creating new coroutines. |
thread.ID.scheduler.cs_active | Actual value | Number of active coroutines. |
thread.ID.scheduler.cs_inactive | Actual value | Number of inactive coroutines. |
thread.ID.scheduler.cs_waiting_futex | Actual value | Number of coroutines waiting for futex. |
thread.ID.scheduler.cs_wakeup_futex | Actual value | Number of coroutines in the wakeup queue when locked on futex. |
thread.ID.scheduler.cs_waiting_io | Actual value | Number of coroutines waiting for read or write availability on an I/O device or timer activation (for example, when locked in the sleep operation). |
An example of the thread load metrics query output looks as follows:
name | class | node_id | value
------------------------------------------+-------------+---------+------------
thread.1.active_time_ns | thread-load | 0 | 6319387757
thread.1.purged_coroutines | thread-load | 0 | 1
thread.1.transferred_coroutines_accepted | thread-load | 0 | 2
thread.1.wakeup_requests_accepted | thread-load | 0 | 0
thread.1.futex_wakeup_requests_accepted | thread-load | 0 | 10214
thread.1.active_coroutines_called | thread-load | 0 | 32385
thread.1.evrun_once | thread-load | 0 | 17162
thread.1.evrun_nowait | thread-load | 0 | 4567
thread.1.scheduler.coroctx_in_use | thread-load | 0 | 19
thread.1.scheduler.coroctx_cached | thread-load | 0 | 1
thread.1.scheduler.cs_active | thread-load | 0 | 1
thread.1.scheduler.cs_inactive | thread-load | 0 | 1
thread.1.scheduler.cs_waiting_futex | thread-load | 0 | 5
thread.1.scheduler.cs_wakeup_futex | thread-load | 0 | 0
thread.1.scheduler.cs_waiting_io | thread-load | 0 | 12
thread.2.active_time_ns | thread-load | 0 | 974064
thread.2.purged_coroutines | thread-load | 0 | 0
thread.2.transferred_coroutines_accepted | thread-load | 0 | 0
thread.2.wakeup_requests_accepted | thread-load | 0 | 0
thread.2.futex_wakeup_requests_accepted | thread-load | 0 | 6
thread.2.active_coroutines_called | thread-load | 0 | 7
thread.2.evrun_once | thread-load | 0 | 109
thread.2.evrun_nowait | thread-load | 0 | 0
thread.2.scheduler.coroctx_in_use | thread-load | 0 | 1
thread.2.scheduler.coroctx_cached | thread-load | 0 | 0
thread.2.scheduler.cs_active | thread-load | 0 | 0
thread.2.scheduler.cs_inactive | thread-load | 0 | 0
thread.2.scheduler.cs_waiting_futex | thread-load | 0 | 1
thread.2.scheduler.cs_wakeup_futex | thread-load | 0 | 0
thread.2.scheduler.cs_waiting_io | thread-load | 0 | 0
...
F.70.6.3. Traffic Metrics #
The traffic metrics display counters of bytes and/or messages that are transferred via the communication channel, as well as the number of dropped messages and reconnections if supported by this communication channel.
Names of traffic metrics have the traffic.CHANNEL.COUNTER structure, where:
CHANNEL is the channel whose data is displayed in the query output. The following channels are supported:
fe (frontend): data transfer between all clients and proxima.
be (backend): data transfer between proxima and backends of proxima.
rpc (RPC, Remote Procedure Call): data transfer between proxima and other processes of the current database instance.
nodeID.client: data transfer via the client connection between the current node and the node with the ID identifier.
nodeID.server: data transfer via the server connection between the current node and the node with the ID identifier.
COUNTER is the name of the counter.
The following counters are available:
Table F.53. Traffic Counters
Name | Type | Description |
---|---|---|
traffic.CHANNEL.rx_bytes | Integral | Number of bytes received via the communication channel. |
traffic.CHANNEL.tx_bytes | Integral | Number of bytes transmitted through the communication channel. |
traffic.CHANNEL.rx_msgs | Integral | Number of messages received. This counter is only present if a channel supports tracking of separate messages. |
traffic.CHANNEL.tx_msgs | Integral | Number of messages transmitted. This counter is only present if a channel supports tracking of separate messages. |
traffic.CHANNEL.rx_msgs_dropped | Integral | Number of dropped messages. This counter is only present if a channel supports message dropping. |
traffic.CHANNEL.reconnects | Integral | Number of reconnections after failures. This counter is only present if a channel supports reconnections. |
An example of the traffic metrics query output looks as follows:
name | class | node_id | value
--------------------------------------+---------+---------+----------
traffic.fe.rx_bytes | traffic | 0 | 943
traffic.fe.tx_bytes | traffic | 0 | 10632
traffic.be.rx_bytes | traffic | 0 | 13233
traffic.be.tx_bytes | traffic | 0 | 2099
traffic.node1.client.rx_bytes | traffic | 0 | 32
traffic.node1.client.tx_bytes | traffic | 0 | 64641815
traffic.node1.client.rx_msgs | traffic | 0 | 124
traffic.node1.client.tx_msgs | traffic | 0 | 7868
traffic.node1.client.rx_msgs_dropped | traffic | 0 | 0
traffic.node1.client.reconnects | traffic | 0 | 1
traffic.node2.client.rx_bytes | traffic | 0 | 32
traffic.node2.client.tx_bytes | traffic | 0 | 64609591
traffic.node2.client.rx_msgs | traffic | 0 | 124
traffic.node2.client.tx_msgs | traffic | 0 | 7864
traffic.node2.client.rx_msgs_dropped | traffic | 0 | 0
traffic.node2.client.reconnects | traffic | 0 | 1
traffic.rpc.rx_bytes | traffic | 0 | 100
traffic.rpc.tx_bytes | traffic | 0 | 12416
traffic.rpc.rx_msgs | traffic | 0 | 3
traffic.rpc.tx_msgs | traffic | 0 | 2
traffic.node2.server.rx_bytes | traffic | 0 | 56532348
traffic.node2.server.tx_bytes | traffic | 0 | 32
traffic.node2.server.rx_msgs | traffic | 0 | 7868
traffic.node2.server.tx_msgs | traffic | 0 | 124
traffic.node2.server.rx_msgs_dropped | traffic | 0 | 0
traffic.node2.server.reconnects | traffic | 0 | 1
traffic.node1.server.rx_bytes | traffic | 0 | 56504900
traffic.node1.server.tx_bytes | traffic | 0 | 32
traffic.node1.server.rx_msgs | traffic | 0 | 7864
traffic.node1.server.tx_msgs | traffic | 0 | 124
traffic.node1.server.rx_msgs_dropped | traffic | 0 | 0
traffic.node1.server.reconnects | traffic | 0 | 1
F.70.6.4. Backend Pool Metrics #
Backend pool metrics display characteristics of backend allocation requests from clients to the backend pool.
Names of backend pool metrics have the backend_pool.SCOPE.COUNTER structure, where:
SCOPE defines the backend type: local or remote.
COUNTER is the name of the counter.
The following counters are available:
Table F.54. Backend Pool Counters
Name | Type | Description |
---|---|---|
backend_pool.SCOPE.requests | Integral | Number of backend allocation requests to the backend pool. |
backend_pool.SCOPE.creations | Integral | Number of new backends created when requested from the pool. |
backend_pool.SCOPE.destructions | Integral | Number of backend destructions (closures). Destruction may occur when a backend is evicted from the pool bucket or has not been in use for a long time. |
backend_pool.SCOPE.unlinks | Integral | Number of backends unbound from the pool. |
backend_pool.SCOPE.acquisitions | Integral | Number of backends allocated to clients. |
backend_pool.SCOPE.releases | Integral | Number of backends released by clients and returned to the pool. Due to certain failures, a backend may be destroyed by the client instead of being returned to the pool; this increases the backend_pool.SCOPE.unlinks value, but not backend_pool.SCOPE.releases. |
backend_pool.SCOPE.steals | Integral | Number of backends "stolen" from other pool buckets. Rapid growth of this value means that the system experiences high load on many different databases and/or from many different users. This indicates that the pool size is insufficient for the current workload, which leads to performance degradation. |
backend_pool.SCOPE.errors | Integral | Number of errors that occurred when requesting backends from the pool. This is a generalized counter. In some cases, errors may cause termination of client connections. In other cases, an error may lead to a repeated backend request and be transparent for the client. For example, if the pool cannot allocate a backend because the max_connections limit has been reached, the client will wait in a queue. |
backend_pool.SCOPE.request_duration.p* | Percentile (window: T = 60 seconds, C = 1000 events) | Percentiles of time between a backend request and the actual allocation of a backend to a client. This is a cumulative counter and can have significantly different values. For example, allocation of a free backend from the pool may take microseconds, while creation of a new backend may take milliseconds. However, if the pool is overloaded and clients have to wait for previous requests to the database to finish, allocation may take seconds, minutes, or even hours. In other words, this counter indicates the distribution of waiting time in the pool queue before executing the request. |
An example of the backend pool metrics query output looks as follows:
name | class | node_id | value
-------------------------------------------+--------------+---------+------------------------
backend_pool.local.requests | backend-pool | 0 | 13
backend_pool.local.creations | backend-pool | 0 | 5
backend_pool.local.destructions | backend-pool | 0 | 3
backend_pool.local.unlinks | backend-pool | 0 | 4
backend_pool.local.acquisitions | backend-pool | 0 | 12
backend_pool.local.releases | backend-pool | 0 | 10
backend_pool.local.steals | backend-pool | 0 | 0
backend_pool.local.errors | backend-pool | 0 | 1
backend_pool.local.request_duration.p0 | backend-pool | 0 | 8.74983775227453e-06
backend_pool.local.request_duration.p5 | backend-pool | 0 | 8.765000496725975e-06
backend_pool.local.request_duration.p25 | backend-pool | 0 | 9.654992595293691e-06
backend_pool.local.request_duration.p50 | backend-pool | 0 | 1.0727600464262677e-05
backend_pool.local.request_duration.p75 | backend-pool | 0 | 1.1514681787272259e-05
backend_pool.local.request_duration.p95 | backend-pool | 0 | 0.008438241305000952
backend_pool.local.request_duration.p100 | backend-pool | 0 | 0.008452788451603185
backend_pool.remote.requests | backend-pool | 0 | 0
backend_pool.remote.creations | backend-pool | 0 | 0
backend_pool.remote.destructions | backend-pool | 0 | 0
backend_pool.remote.unlinks | backend-pool | 0 | 0
backend_pool.remote.acquisitions | backend-pool | 0 | 0
backend_pool.remote.releases | backend-pool | 0 | 0
backend_pool.remote.steals | backend-pool | 0 | 0
backend_pool.remote.errors | backend-pool | 0 | 0
backend_pool.remote.request_duration.p0 | backend-pool | 0 | 0
backend_pool.remote.request_duration.p5 | backend-pool | 0 | 0
backend_pool.remote.request_duration.p25 | backend-pool | 0 | 0
backend_pool.remote.request_duration.p50 | backend-pool | 0 | 0
backend_pool.remote.request_duration.p75 | backend-pool | 0 | 0
backend_pool.remote.request_duration.p95 | backend-pool | 0 | 0
backend_pool.remote.request_duration.p100 | backend-pool | 0 | 0
F.70.6.5. Client Connection Metrics #
Client connection metrics include counters of connections and their characteristics for different communication channels.
Names of client connection metrics have the client.CHANNEL.COUNTER structure, where:
CHANNEL is the communication channel whose data is displayed in the query output. The following channels are supported:
lfe (local frontend): clients that connect to proxima via one of the ports to execute SQL queries.
lbe (local backend): backend processes of Postgres Pro Enterprise created by proxima.
rfe (remote frontend): clients redirected to the current node from another node.
rbe (remote backend): internal communication channels created by proxima to redirect queries to a remote node.
rpc (RPC, Remote Procedure Call): data transfer between proxima and other processes of the current database instance.
COUNTER is the name of the counter.
The following counters are available:
Table F.55. Client Connection Counters
Name | Type | Description |
---|---|---|
client.lfe.now_connected | Actual value | Number of clients actually connected to the current node. |
client.lfe.accepted | Integral | Number of clients with successfully authenticated connections. |
client.lfe.rejected | Integral | Number of clients with rejected connections, mainly due to authentication errors. |
client.lfe.disconnected | Integral | Number of clients with closed connections. |
client.lfe.auth_password | Integral | Number of clients authenticated using the password method. |
client.lfe.auth_md5 | Integral | Number of clients authenticated using the MD5 method. |
client.lfe.auth_scram_sha256 | Integral | Number of clients with the SCRAM-SHA-256 authentication method applied. |
client.lfe.auth_trust | Integral | Number of clients with trusted connections according to HBA rules. |
client.lfe.auth_tls_accepted | Integral | Number of accepted TLS connections. |
client.lfe.auth_tls_rejected | Integral | Number of rejected TLS connections. |
client.lfe.auth_duration.p* | Percentile (window: T = 60 seconds, C = 1000 events) | Time distribution for client authentication procedure. |
client.lbe.enter_dedicated | Integral | Number of backend connections that entered the dedicated session. |
client.lbe.leave_dedicated | Integral | Number of backend connections that left the dedicated session. |
client.lbe.dedicated_duration.p* | Percentile (window: T = 60 seconds, C = 1000 events) | Time distribution for connections that remained in the dedicated session. |
client.rfe.now_connected | Actual value | Actual number of clients redirected to the current node for query execution. |
client.rfe.connected | Integral | Number of clients redirected to the current node for query execution. |
client.rfe.disconnected | Integral | Number of clients with closed connections. |
client.rbe.enter_dedicated | Integral | Number of backend connections that entered the dedicated session. |
client.rbe.leave_dedicated | Integral | Number of backend connections that left the dedicated session. |
client.rbe.dedicated_duration.p* | Percentile (window: T = 60 seconds, C = 1000 events) | Time distribution for connections that remained in the dedicated session. |
client.rpc.now_connected | Actual value | Number of currently connected clients. |
client.rpc.accepted | Integral | Number of clients with successfully accepted connections, including passed authentication. |
client.rpc.rejected | Integral | Number of clients with rejected connections, mainly due to authentication errors. |
client.rpc.disconnected | Integral | Number of clients with closed connections. |
An example of the client connection metrics query output looks as follows:
name | class | node_id | value
------------------------------------+--------+---------+-------
client.lfe.now_connected | client | 0 | 1
client.lfe.accepted | client | 0 | 1
client.lfe.rejected | client | 0 | 0
client.lfe.disconnected | client | 0 | 0
client.lfe.auth_password | client | 0 | 0
client.lfe.auth_md5 | client | 0 | 0
client.lfe.auth_scram_sha256 | client | 0 | 0
client.lfe.auth_trust | client | 0 | 1
client.lfe.auth_tls_accepted | client | 0 | 0
client.lfe.auth_tls_rejected | client | 0 | 0
client.rfe.now_connected | client | 0 | 0
client.rfe.connected | client | 0 | 0
client.rfe.disconnected | client | 0 | 0
client.lbe.enter_dedicated | client | 0 | 0
client.lbe.leave_dedicated | client | 0 | 0
client.rbe.enter_dedicated | client | 0 | 0
client.rbe.leave_dedicated | client | 0 | 0
client.rpc.now_connected | client | 0 | 1
client.rpc.accepted | client | 0 | 5
client.rpc.rejected | client | 0 | 0
client.rpc.disconnected | client | 0 | 4
client.lfe.auth_duration.p0 | client | 0 | 0
client.lfe.auth_duration.p5 | client | 0 | 0
client.lfe.auth_duration.p25 | client | 0 | 0
client.lfe.auth_duration.p50 | client | 0 | 0
client.lfe.auth_duration.p75 | client | 0 | 0
client.lfe.auth_duration.p95 | client | 0 | 0
client.lfe.auth_duration.p100 | client | 0 | 0
client.lbe.dedicated_duration.p0 | client | 0 | 0
client.lbe.dedicated_duration.p5 | client | 0 | 0
client.lbe.dedicated_duration.p25 | client | 0 | 0
client.lbe.dedicated_duration.p50 | client | 0 | 0
client.lbe.dedicated_duration.p75 | client | 0 | 0
client.lbe.dedicated_duration.p95 | client | 0 | 0
client.lbe.dedicated_duration.p100 | client | 0 | 0
client.rbe.dedicated_duration.p0 | client | 0 | 0
client.rbe.dedicated_duration.p5 | client | 0 | 0
client.rbe.dedicated_duration.p25 | client | 0 | 0
client.rbe.dedicated_duration.p50 | client | 0 | 0
client.rbe.dedicated_duration.p75 | client | 0 | 0
client.rbe.dedicated_duration.p95 | client | 0 | 0
client.rbe.dedicated_duration.p100 | client | 0 | 0
F.70.6.6. RPC Server Metrics #
The following counters are available:
Table F.56. RPC Server Counters
Name | Type | Description |
---|---|---|
rpc.call_duration.p* | Percentile (window: T = 60 seconds, C = 1000 events) | Time distribution for command execution. |
rpc.err_not_found | Integral | Number of calls to non-existent functions. |
An example of the RPC server metrics query output looks as follows:
name | class | node_id | value
------------------------+-------+---------+------------------------
rpc.call_duration.p0 | rpc | 0 | 4.315190768277686e-05
rpc.call_duration.p5 | rpc | 0 | 4.331642078668621e-05
rpc.call_duration.p25 | rpc | 0 | 6.386338595749277e-05
rpc.call_duration.p50 | rpc | 0 | 7.37059571607683e-05
rpc.call_duration.p75 | rpc | 0 | 8.217731456416661e-05
rpc.call_duration.p95 | rpc | 0 | 0.00011075225182228674
rpc.call_duration.p100 | rpc | 0 | 0.00011117317272816024
rpc.err_not_found | rpc | 0 | 0