BUG #13770: Extending recovery_min_apply_delay on Standby causes it to be unavailable for a while - Mailing list pgsql-bugs
From | postgresql.org@gclough.com |
---|---|
Subject | BUG #13770: Extending recovery_min_apply_delay on Standby causes it to be unavailable for a while |
Date | |
Msg-id | 20151111155006.2644.84564@wrigleys.postgresql.org Whole thread Raw |
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 13770 Logged by: Greg Clough Email address: postgresql.org@gclough.com PostgreSQL version: 9.4.5 Operating system: Oracle Linux v6.7 Description: If you extend the recovery_min_apply_delay on a standby and then restart it, you cannot connect to it for read-only transactions until the most recent transaction time >= delay time. This is problematic if you extend from say 1 hour to 24 hours, as you have to wait almost a day before you can connect to your Standby again. I can't find anything in the documentation to describe this as intended behaviour, and my expectation was that when I extended the delay that the Standby database would come up and be available for read-only transactions, but would simply delay any new transactions until they were older than recovery_min_apply_delay. I'm curious if this was a design decision to stop people getting confused when they access a database before recovery_min_apply_delay has expired and get data that's too recent, or if it was an unintentional consequence of the way it was implemented. Environment: Oracle Linux v6.7 PostgreSQL v9.4.5 (Installed via yum) postgres@gclough[5432] ~$ uname -a Linux gclough 3.8.13-98.4.1.el6uek.x86_64 #2 SMP Wed Sep 23 18:46:01 PDT 2015 x86_64 x86_64 x86_64 GNU/Linux postgres@gclough[5432] ~$ psql --version psql (PostgreSQL) 9.4.5 Test Case: 1. Create a Primary database on port 5432 export PGDATA=/u01/postgres/9.4/5432/data export PGPORT=5432 initdb wal_level = hot_standby # minimal, archive, hot_standby, or logical max_wal_senders = 1 # max number of walsender processes 2. Replicate it to a Standby database on port 5433 export PGDATA=/u01/postgres/9.4/5433/data export PGPORT=5433 port = 5433 # (change requires restart) wal_level = hot_standby # minimal, archive, hot_standby, or logical max_wal_senders = 1 # max number of walsender processes hot_standby = on # "on" allows queries during recovery 3. On the Standby (Port 5433), introduce a 2min time delay with "recovery_min_apply_delay" in the recovery.conf: standby_mode = 'on' primary_conninfo = 'port=5432 host=localhost user=postgres application_name=port_5433' recovery_min_apply_delay = 2min 4. Check the current lag on the Standby (Port 5433): postgres@gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select pg_last_xlog_receive_location() "receive_xlog", pg_last_xlog_replay_location() "replay_xlog", pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"' Wed Nov 11 15:32:32 GMT 2015 receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering --------------+-------------+-------------------------------+------------ 0/7000000 | 0/7000000 | | t (1 row) 5. Create a small test table on the Primary (Port 5432): CREATE TABLE delay_test (id NUMERIC); INSERT INTO delay_test VALUES (1); 6. Check the application of WAL on the Standby (Port 5433), and note the delay: postgres@gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select pg_last_xlog_receive_location() "receive_xlog", pg_last_xlog_replay_location() "replay_xlog", pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"' Wed Nov 11 15:33:06 GMT 2015 receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering --------------+-------------+-------------------------------+------------ 0/7015DC8 | 0/7015890 | | t (1 row) postgres@gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select pg_last_xlog_receive_location() "receive_xlog", pg_last_xlog_replay_location() "replay_xlog", pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"' Wed Nov 11 15:34:00 GMT 2015 receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering --------------+-------------+-------------------------------+------------ 0/7015E00 | 0/7015890 | | t (1 row) postgres@gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select pg_last_xlog_receive_location() "receive_xlog", pg_last_xlog_replay_location() "replay_xlog", pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"' Wed Nov 11 15:35:01 GMT 2015 receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering --------------+-------------+-------------------------------+------------ 0/7015E00 | 0/7015E00 | 2015-11-11 15:32:53.950376+00 | t (1 row) 7. Reset the time delay on the Standby from 2 to 5 minutes recovery_min_apply_delay = 5min pg_ctl restart -m fast 8. Try to connect to the Standby, and get a failure: postgres@gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select pg_last_xlog_receive_location() "receive_xlog", pg_last_xlog_replay_location() "replay_xlog", pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"' Wed Nov 11 15:35:26 GMT 2015 psql: FATAL: the database system is starting up postgres@gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select pg_last_xlog_receive_location() "receive_xlog", pg_last_xlog_replay_location() "replay_xlog", pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"' Wed Nov 11 15:36:01 GMT 2015 psql: FATAL: the database system is starting up postgres@gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select pg_last_xlog_receive_location() "receive_xlog", pg_last_xlog_replay_location() "replay_xlog", pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"' Wed Nov 11 15:37:00 GMT 2015 psql: FATAL: the database system is starting up 9. Wait 3 minutes... then it works: postgres@gclough[5433] /u01/postgres/9.4/5433/data$ date; psql -c 'select pg_last_xlog_receive_location() "receive_xlog", pg_last_xlog_replay_location() "replay_xlog", pg_last_xact_replay_timestamp(), pg_is_in_recovery() "recovering"' Wed Nov 11 15:38:00 GMT 2015 receive_xlog | replay_xlog | pg_last_xact_replay_timestamp | recovering --------------+-------------+-------------------------------+------------ 0/7015ED8 | 0/7015ED8 | 2015-11-11 15:32:53.950376+00 | t (1 row)
pgsql-bugs by date: