Postgres Pro Enterprise : Documentation: 17: 27.3. Administration : Postgres Professional

27.3. Administration
Prev	Up	Chapter 27. Built-in High Availability (BiHA)	Home	Next

27.3. Administration #

27.3.1. Changing Cluster Composition
27.3.2. Changing Configuration Parameters
27.3.3. Switchover
27.3.4. Managing SSL
27.3.5. Roles
27.3.6. Restoring the Node from the NODE_ERROR State
27.3.7. Monitoring the Rewind Results
27.3.8. Configuration of Quorum-Based Synchronous and Asynchronous Replication
27.3.9. Logging
27.3.10. Callbacks
27.3.11. Managing proxima
27.3.12. Recovering from a Backup
27.3.13. Migration
27.3.14. Removing biha

27.3.1. Changing Cluster Composition #

You can change the cluster composition as follows:

To add a node, use the bihactl add command with the relevant options.
To remove a node, use the biha.remove_node function.
To change the leader manually, use the biha.set_leader function. For more information, see Switchover.

27.3.2. Changing Configuration Parameters #

You can change cluster configuration parameters as follows:

Some BiHA configuration parameters are common for all cluster nodes. They must have the same value on all cluster nodes and can only be set up on the leader node by special functions. For example, use the biha.set_heartbeat_max_lost function to set the biha.heartbeat_max_lost parameter value:
```
SELECT biha.set_heartbeat_max_lost(7);
```
All available functions for setting up common cluster configuration parameters are listed in Common Cluster Configuration.
The BiHA parameters that can vary on different cluster nodes can be set with the ALTER SYSTEM command. For example, you can change the biha.can_vote parameter with ALTER SYSTEM:
```
ALTER SYSTEM SET biha.can_vote = true;
SELECT pg_reload_conf();
```

For more information about available BiHA configuration parameters and the ways they can be set up, see Section 27.4.1.

For more information about decreasing Postgres Pro configuration parameter values, see Postgres Pro Configuration.

27.3.3. Switchover #

In addition to the built-in failover capabilities, the high-availability cluster in Postgres Pro allows for the switchover. The difference between failover and switchover is that the former is performed automatically when the leader node fails and the latter is done manually by the system administrator. To switch over the leader node, use the biha.set_leader function. When you set the new leader, the following happens:

All attempts to perform elections are blocked and the timeout is set.
The current leader node becomes the follower node.
The newly selected node becomes the new leader.
If the switchover process does not end within the established timeout, the selected node becomes the follower and new elections are performed to choose the new cluster leader.

27.3.4. Managing SSL #

You can enable or disable SSL to secure service connections in the initialized BiHA cluster by means of the biha.use_ssl configuration parameter.

Prepare a certificate and key pair.
To enable SSL, on all cluster nodes, set biha.use_ssl to true:
```
ALTER SYSTEM SET biha.use_ssl = true;
```
To disable SSL, set the value to false.
Stop and start the nodes using pg_ctl.
Important
The leader node must be the last to stop and first to start. Since enabling SSL in a BiHA cluster involves using the standard TLS handshake process, it is recommended to minimize the time lag between stopping and starting nodes.

27.3.5. Roles #

When you initialize the high-availability cluster, the biha_db database is created as well as the biha extension is created in the biha scheme of the biha_db database. Besides, the following roles are created and used:

BIHA_CLUSTER_MANAGEMENT_ROLE allows execution of all biha extension functions.
BIHA_REPLICATION_ROLE is used when running pg_rewind and pg_probackup.
biha_replication_user is a member of the BIHA_REPLICATION_ROLE and BIHA_CLUSTER_MANAGEMENT_ROLE roles and can request both client and replication connections. It is used by the bihactl utility as well as when the follower node is connected to the leader node. This role owns the biha_db database. The password for the biha_replication_user role in the password file must be the same on all nodes of the BiHA cluster. The password for the role is prompted upon the BiHA cluster creation.
biha_callbacks_user is the default user for callback execution. This user can connect to a database but has no privileges.
The predefined pg_monitor is used to monitor the state of the BiHA cluster.

27.3.6. Restoring the Node from the `NODE_ERROR` State #

Errors occurred in biha or server instance processes listed below may cause the node failure, i.e. it will not be able to restart and will damage the WAL:

A rewind automatically performed by biha using pg_rewind if node timelines diverge.
The walreceiver process in case of timeline divergence. The follower node WAL may be partially rewritten by the WAL received from the leader node.

Note

If the NODE_ERROR state is caused by other issues, for example, replication slot overflow, you can fix the error root cause and call the biha.reset_node_error function to reset the NODE_ERROR state.

When the node goes into the NODE_ERROR state, the WAL recovery is paused and the walreceiver process is stopped. Nodes in the NODE_ERROR state are unable to vote during elections. Reading from such nodes is prohibited. Besides, the error details are saved to the biha.state file and checked upon the node restart, so the node will go into the same state when the biha-background-worker process is launched.

To restore the node from the NODE_ERROR state, take the following steps:

Save the most recent files from the pg_wal directory, since some of the files unique to this node will be rewritten by pg_rewind.
To save biha configuration files, run pg_rewind with the --biha option, for example:
```
pg_rewind --biha --target-pgdata=path_to_PGDATA_of_the_NODE_ERROR_node --source-server='user=biha_replication_user host=leader_host port=leader_port dbname=postgres'
```
Important
The --biha option is essential for saving biha configuration files. Using pg_rewind in a BiHA cluster without --biha option may cause cluster configuration inconsistency.
If the rewind has been successful, information about the NODE_ERROR state is deleted from the biha.state file. Besides, when you specify the connection string in the --source-server option of pg_rewind, it also automatically saves this string for the primary_conninfo configuration parameter in the postgresql.auto.conf file. This is important for the node to continue restoring after the restart and reach the consistency point, which is the number of the last record in the source server WAL at the time of the rewind.
(Optional) If the node was offline for a long time, to prevent the risk of data corruption and obsolete data reads, set the biha.flw_ro parameter of the restored node to off.

27.3.7. Monitoring the Rewind Results #

You can check the results of the rewind, i.e. the rewind state of cluster nodes, in the rewind_state field of the biha.state file. The field contains the enum values, which are interpreted as follows:

Table 27.1. The Rewind State

Value	Interpretation
`0`	The rewind is not required.
`1`	The server is stopped, the biha.autorewind configuration parameter was enabled, the rewind will be performed after the server restart.
`2`	The rewind failed.
`3`	The rewind was performed successfully.
`4`	The biha.autorewind configuration parameter was not enabled, and the rewind must be performed manually as described in Restoring the Node from the NODE_ERROR State.

27.3.8. Configuration of Quorum-Based Synchronous and Asynchronous Replication #

BiHA allows you to create a cluster with quorum-based synchronous replication.

When you create a BiHA cluster from scratch or convert the existing cluster, you can configure quorum-based synchronous replication using the --sync-standbys option of bihactl init. This will set the synchronous_standby_names parameter by specifying the number of quorum-based synchronous nodes with the ANY method, i.e. the number of any synchronous nodes that transaction commits will wait for replies from. For more information, see Section 26.2.8.2. For example, when setting the --sync-standbys value to 2 for a three-node cluster, synchronous_standby_names looks as follows:

ANY 2 (biha_node_1,biha_node_2,biha_node_3)

Also, the synchronous_commit parameter is used with the default value on. Note that you cannot select specific nodes for synchronous replication, but only set the number of quorum-based synchronous nodes by specifying this number in the --sync-standbys option. Other nodes in the cluster will be replicated asynchronously, since streaming replication is asynchronous by default.

27.3.8.1. Configuring Quorum-Based Synchronous Replication on the Existing BiHA Cluster #

If your BiHA cluster is asynchronous, i.e. you have not specified the --sync-standbys option in the bihactl init command, use the biha.set_sync_standbys function to enable quorum-based synchronous replication on your existing cluster. The function sets the synchronous_standby_names parameter and specifies the number of quorum-based synchronous nodes with the ANY method.

You can also use the biha.set_sync_standbys function to modify the number of quorum-based synchronous nodes of the synchronous_standby_names parameter after you change BiHA cluster composition by adding nodes with the bihactl add command or removing nodes with the biha.remove_node function.

27.3.8.2. Relaxing Synchronous Replication Restrictions #

You can relax synchronous replication restrictions to allow the leader node to continue operation while some of the quorum-based synchronous nodes are temporary unavailable.

To enable relaxed synchronous replication, specify the minimum number of quorum-based synchronous nodes in the --sync-standbys-min option. If set, the option also changes the MIN field value of the synchronous_standby_names parameter accordingly. The synchronous_standby_gap parameter remains unchanged and keeps its default value — 0. For example, when setting the --sync-standbys-min value to 0 for a three-node cluster, synchronous_standby_names looks as follows:

ANY 2 MIN 0 (biha_node_1,biha_node_2,biha_node_3)

You can set the --sync-standbys-min value when initializing the BiHA cluster by means of the bihactl init command, and modify it later with the biha.set_sync_standbys_min function.

27.3.8.3. Synchronous Replication and the Referee #

There are a few points that need to be taken into account with regard to synchronous replication and the referee node. If the --mode option is set to referee, the referee does not participate in synchronous replication. When set to referee_with_wal, the node can synchronously receive data. This mode allows the cluster to continue to be available in the 2+1 configuration with --sync-standbys=1 if the follower node is down and the referee node starts confirming transactions for the leader node. The referee behavior depends on the synchronous_commit parameter value. Note that with this parameter set to remote_apply the referee does not confirm transactions.

27.3.9. Logging #

biha logs messages sent by its components, i.e. the control channel and the node controller. The control channel is used to exchange service information between the nodes and is marked as BCP in the log. The node controller is the biha core component, which is responsible for the node operation logic and is marked as NC in the log. You can determine the types of messages to be logged by setting the appropriate logging level. biha supports both standard Postgres Pro message severity levels and the extension logging levels.

Postgres Pro message severity levels are used by biha in the following cases:

A biha process ends with an error (ERROR and FATAL levels).
A biha component is not covered by any logging level of the extension.
A message should be logged when component logging is disabled. For example, LOG level messages sent by the control channel, which are displayed only when the component is successfully initialized.

biha logging levels are mapped to the Postgres Pro message severity levels. If you want messages of the required level to appear in the log, the value set for this level should correspond to the value in log_min_messages. You can configure biha logging levels by specifying the corresponding configuration parameters in the postgresql.biha.conf file.

The recommended configuration of logging levels looks as follows:

biha.BcpTransportWarn_log_level = WARNING
biha.BcpTransportLog_log_level = LOG
biha.BcpTransportDetails_log_level = DEBUG1
biha.BcpTransportDebug_log_level = DEBUG1
biha.BcpTransportSSLDebug_log_level = DEBUG2
biha.NodeControllerLog_log_level = LOG
biha.NodeControllerWarn_log_level = WARNING
biha.NodeControllerDetails_log_level = DEBUG1
biha.NodeControllerDebug_log_level = DEBUG1
log_min_messages = DEBUG1

27.3.10. Callbacks #

A callback is an SQL function that notifies users or external services about events in the BiHA cluster, for example, about election of a new leader or change of cluster configuration. As a user, you create an SQL function and register it as a callback. Under certain conditions, the biha extension calls this function.

Timely alerting helps external services to provide proper reaction to events in the BiHA cluster. For example, after receiving information about the leader change, the proxy server redirects traffic to the new leader.

The following callback types are available:

Table 27.2. Callback Types

Name	Description
`CANDIDATE_TO_LEADER`	Called on the node elected as the new leader. Signature: my_callback(void) RETURNS void
`LEADER_TO_FOLLOWER`	Called on the old leader returned to the cluster after demotion. Signature: my_callback(void) RETURNS void
`LEADER_CHANGED`	Called on every node when the BiHA cluster leader changes. Signature: my_callback(id integer, host text, port integer) RETURNS void In this signature, biha passes the new leader details: ID, host name, and node port number.
`LEADER_CHANGE_STARTED`	Called on all nodes when the old leader is not available, but the new leader is not yet elected. You can use this callback to fence the old leader. The callback is activated when the number of nodes reaches the biha.nquorum value, and it becomes possible to hold elections. This callback is also called on all cluster nodes when the new leader is set manually using the biha.set_leader function. Important The callback may fail to be called on the old leader because after setting the new leader by biha.set_leader, the old leader immediately restarts. Signature: my_callback(id integer, host text, port integer) RETURNS void In this signature, biha passes the lost leader details: ID, host name, and node port number.
`LEADER_STATE_IS_RW`	Called on the leader and other nodes when the leader changes its state to `LEADER_RW`. Signature: my_callback(id integer, host text, port integer) RETURNS void In this signature, biha passes to the callback the leader details: ID, host name, and node port number.
`TERM_CHANGED`	Called on the node when its `term` value increases, for example, when failover or switchover happens, nodes are added or removed, as well as when parameters are changed by the `biha.set_*` functions. Signature: my_callback(old_term integer, new_term integer) RETURNS void In this signature, biha passes the old and the new `term` values.
`OFFERED_TO_LEADER`	Called on the node manually set to leader when it is about to become the leader. Signature: my_callback(void) RETURNS void
`NODE_ADDED`	Called on every node when a new node is added to the BiHA cluster. Signature: my_callback(id integer, host text, port integer, mode integer) RETURNS void In this signature, biha passes to the callback the new node details, such as its host name, node port number, and operation mode: `regular`, `referee`, or `referee_with_wal`. For more information, see The Referee Node in the BiHA Cluster.
`NODE_REMOVED`	Called on every node when a node is removed from the BiHA cluster. Signature: my_callback(id integer) RETURNS void In this signature, biha passes to the callback the ID of the removed node.

27.3.10.1. Considerations and Limitations #

When an event happens, callbacks are executed in sequence. At this time, biha cannot change its state, for example, initiate elections.
If callback execution takes longer than the biha.callbacks_timeout value, biha stops callback execution and continues normal operation.
On clusters with asynchronous replication, the biha.register_callback function does not wait for all nodes to receive callbacks. This may lead to a situation where callbacks are present on the leader, but not present on the follower as it lags behind.
Normally, callbacks are not executed on the referee in the referee mode. However, if you have registered callbacks on the leader before adding the referee, callbacks may be executed on the referee and cannot be removed from it.

Note

Do not call biha functions inside of a callback as this can cause unexpected behavior.

27.3.10.2. Managing Callbacks #

To manage callbacks, you can perform the following actions:

register one or several callbacks for a single event
view the list of registered callbacks
unregister callbacks

Registering Callbacks

Write an SQL function and then register it as a callback using biha.register_callback.

In this example, you create several SQL functions in PL/pgSQL and register them as different callback types.

Ensure that the leader node of your BiHA cluster is in the LEADER_RW state.
On the leader node, use psql and connect to the biha_db database:
```
postgres=# \c biha_db
```

Create the following callback functions:

-- log the node term change
CREATE FUNCTION log_term_changed(old_term integer, new_term integer)
RETURNS void AS $$
BEGIN
    RAISE LOG 'Callback: Term changed from % to %', old_term, new_term;
END;
$$ LANGUAGE plpgsql;

-- log the election of the new leader
CREATE FUNCTION log_leader_changed(id integer, host text, port integer)
RETURNS void AS $$
BEGIN
    RAISE LOG 'Callback: New leader is % %:%', id, host, port;
END;
$$ LANGUAGE plpgsql;

-- log that the leader was demoted
CREATE FUNCTION log_leader_to_follower()
RETURNS void AS $$
BEGIN
    RAISE LOG 'Callback: demote';
END;
$$ LANGUAGE plpgsql;

SELECT biha.register_callback('TERM_CHANGED', 'log_term_changed', 'biha_db');

SELECT biha.register_callback('LEADER_CHANGED', 'log_leader_changed', 'biha_db');

SELECT biha.register_callback('LEADER_TO_FOLLOWER', 'log_leader_to_follower', 'biha_db');

You can also specify the user on which behalf the callbacks are executed or determine the order of callbacks execution. For more information, see the description of the biha.register_callback function.

Viewing Callbacks

Registered callbacks are added to the biha.callbacks table located in the biha_db database.

To view all registered callbacks:

On the leader node, use psql and connect to the biha_db database:
```
postgres=# \c biha_db
```

Display the content of the biha.callbacks table:

SELECT * FROM biha.callbacks;

1 | log_term_changed
2 | log_leader_changed
3 | log_leader_to_follower
(3 rows)

Unregistering Callbacks

An unregistered callback is deleted from the biha.callbacks table.

Ensure that the leader node of your BiHA cluster is in the LEADER_RW state.
On the leader node, use psql and connect to the biha_db database:
```
postgres=# \c biha_db
```
Get an ID of the callback that you want to unregister, for example, log_leader_changed:
```
SELECT id FROM biha.callbacks WHERE func = 'log_leader_changed';
```
The callback ID is returned, for example, 2.
Unregister the callback:
```
SELECT biha.unregister_callback(2);
```
The callback is now unregistered.

27.3.11. Managing proxima #

You can enable or disable the proxima extension in your existing BiHA cluster, as well as check its current state.

Enabling and Disabling proxima

On the leader node, from the biha_db database, call one of the following functions:
- To enable proxima, call the biha.enable_proxima function:
```
SELECT biha.enable_proxima();
```
- To disable proxima, call the biha.disable_proxima function:
```
SELECT biha.disable_proxima();
```
Stop and start all cluster nodes, except the referee, using pg_ctl.
The leader node must be the last to stop and first to start. Ensure that you minimize the time lag between stopping and starting nodes.

Checking the proxima Status

On any node of your BiHA cluster, execute the following command:
```
SELECT biha.proxima_status;
```
The result displays the current status of the proxima extension on this node. For more information about available status values, see biha.proxima_status.

27.3.12. Recovering from a Backup #

If your database instance was restored from one of the nodes of the BiHA cluster to a separate node and/or using Point-in-Time Recovery (PITR), there must be no connection between the restored node and the operating BiHA cluster nodes. To prevent the connection, take the following steps on the restored node before you start it:

Remove the include 'postgresql.biha.conf' include directive from the postgresql.conf configuration file.
Ensure that biha is not present in the shared_preload_libraries of the postgresql.conf file and, if applicable, of the postgresql.auto.conf file.

If you want to add the restored node to the cluster, take the following steps:

On the restored node, manually configure streaming replication from the leader.
Synchronize the restored node with the leader.

Stop the restored node using pg_ctl:

pg_ctl stop -D restored_node_PGDATA_directory

Add the restored node to the cluster using the bihactl add command with the --convert-standby option.

Start the restored node using pg_ctl:

pg_ctl start -D restored_node_PGDATA_directory

27.3.13. Migration #

Depending on your current Postgres Pro Enterprise version, use one of the following procedures to upgrade your BiHA cluster:

To migrate from Postgres Pro Enterprise version 16.X to version 17.X, see Migrating the BiHA Cluster from Version 16.X to Version 17.X.
To migrate between minor versions of Postgres Pro Enterprise 17, see Migrating the BiHA Cluster in the LEADER_RW State.

27.3.13.1. Migrating the BiHA Cluster from Version 16.X to Version 17.X #

To upgrade your BiHA cluster from Postgres Pro Enterprise version 16.X to version 17.X, perform the following steps:

Completely remove biha.
Stop all nodes of the cluster using the pg_ctl command.
Upgrade the nodes using the instructions described in the pg_upgrade utility documentation.
Convert the primary node into the leader.
Add followers using one of the following ways:
- If you have used rsync to upgrade your standby nodes, convert standbys into followers.
- If you have not used rsync, add followers from scratch.
(Optional) If required, recreate the referee.

27.3.13.2. Migrating the BiHA Cluster in the `LEADER_RW` State #

Use this procedure for the following migrations:

from versions 17.2 and 17.4 to version 17.5
from version 17.2 to version 17.4

Stop one of the followers using the pg_ctl command.
Upgrade the follower.
Start the follower using the pg_ctl command.
Promote the upgraded follower using the biha.set_leader function.
Stop, upgrade, and start the remaining followers and the old leader.
Promote the old leader node using the biha.set_leader function.

27.3.14. Removing biha #

You can either temporarily turn off, or completely remove the biha extension.

Turning off biha

To turn off the extension:

Remove the include 'postgresql.biha.conf' include directive from the postgresql.conf configuration file.
Ensure that biha is not present in the shared_preload_libraries of the postgresql.conf file and, if applicable, of the postgresql.auto.conf file.

Removing biha completely

To remove the extension completely, do the following:

Turn off biha.
Execute the DROP EXTENSION command. It must be executed on the leader in the LEADER_RW state and from the biha_db database:
```
biha_db=# DROP EXTENSION biha;
```

Prev	Up	Next
27.2. Setting Up a BiHA Cluster	Home	27.4. Reference for the biha Extension

epub pdf

27.3. Administration #

27.3.1. Changing Cluster Composition #

27.3.2. Changing Configuration Parameters #

27.3.3. Switchover #

27.3.4. Managing SSL #

Important

27.3.5. Roles #

27.3.6. Restoring the Node from the NODE_ERROR State #

Note

Important

27.3.7. Monitoring the Rewind Results #

27.3.8. Configuration of Quorum-Based Synchronous and Asynchronous Replication #

27.3.8.1. Configuring Quorum-Based Synchronous Replication on the Existing BiHA Cluster #

27.3.8.2. Relaxing Synchronous Replication Restrictions #

27.3.8.3. Synchronous Replication and the Referee #

27.3.9. Logging #

27.3.10. Callbacks #

Important

27.3.10.1. Considerations and Limitations #

Note

27.3.10.2. Managing Callbacks #

27.3.11. Managing proxima #

27.3.12. Recovering from a Backup #

27.3.13. Migration #

27.3.13.1. Migrating the BiHA Cluster from Version 16.X to Version 17.X #

27.3.13.2. Migrating the BiHA Cluster in the LEADER_RW State #

27.3.14. Removing biha #

27.3.6. Restoring the Node from the `NODE_ERROR` State #

27.3.13.2. Migrating the BiHA Cluster in the `LEADER_RW` State #