Thread: A question about possible recovery inconsistency
Hello. We had PostgreSQL v11 and used PITR with our database. But PITR behavior was changed in v13 >https://www.postgresql.org/docs/release/13.0/ >Generate an error if recovery does not reach the specified recovery target (Leif Gunnar Erlandsen, Peter Eisentraut) >Previously, a standby would promote itself upon reaching the end of WAL, even if the target was not reached. I tried to change our scripts to include the time where base backup is started (also I tried the time when the base backup was finished). But all my recoveries failed with: `XX000: recovery ended before configured recovery target was reached` So we decided to use `recovery_target_name` instead of `recovery_target_time`. But this attempt also failed with the error above. Backup procedure looks like this: 1. Insert data 2. t3=$(date +"%Y-%m-%d %T") 3. psql -c "select pg_create_restore_point( '$t3' )" 4. wal-g backup-push $PGDATA. This tool lever low level API https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP. 't3' value was used for `recoverty_target_name`. Issue here is that `t3` name was not included into base backup and following WAL files. Logs looks like this: xargs -n1 '-I{}' docker exec test-userdb-restore psql -d userdb -U user_api -c 'INSERT INTO test (num) VALUES ({});' INSERT 0 1 docker exec test-userdb-restore psql -c 'select pg_create_restore_point( '\''2023-10-10 14:48:46'\'' )' pg_create_restore_point ------------------------- 0/5000640 (1 row) wal-g wal-push pg_wal/000000010000000000000006.00000028.backup /dbdata/postgres/pg_wal: total 65540 -rw------- 1 postgres postgres 16777216 Oct 10 14:48 000000010000000000000005 -rw------- 1 postgres postgres 16777216 Oct 10 14:48 000000010000000000000006 -rw------- 1 postgres postgres 364 Oct 10 14:48 000000010000000000000006.00000028.backup -rw------- 1 postgres postgres 16777216 Oct 10 14:48 000000010000000000000007 -rw------- 1 postgres postgres 16777216 Oct 10 14:48 000000010000000000000008 drwx------ 5 postgres postgres 160 Oct 10 14:48 archive_status drwx------ 3 root root 96 Oct 10 14:48 walg_data /dbdata/postgres/pg_wal/archive_status: total 0 -rw------- 1 postgres postgres 0 Oct 10 14:48 000000010000000000000005.done -rw------- 1 root root 0 Oct 10 14:48 000000010000000000000006.00000028.backup.done -rw------- 1 root root 0 Oct 10 14:48 000000010000000000000006.done /dbdata/postgres/pg_wal/walg_data: total 0 drwx------ 2 postgres postgres 64 Oct 10 14:48 walg_archive_status /dbdata/postgres/pg_wal/walg_data/walg_archive_status: total 0 + docker exec test-userdb-restore bash -c 'cat $PGDATA/pg_wal/$(ls -R $PGDATA/pg_wal | grep backup | head -n 1)' START WAL LOCATION: 0/6000028 (file 000000010000000000000006) STOP WAL LOCATION: 0/6000100 (file 000000010000000000000006) CHECKPOINT LOCATION: 0/6000060 BACKUP METHOD: streamed BACKUP FROM: primary START TIME: 2023-10-10 14:48:46 UTC LABEL: 2023-10-10 14:48:46.482532 +0000 UTC m=+0.058620814 START TIMELINE: 1 STOP TIME: 2023-10-10 14:48:47 UTC STOP TIMELINE: 1 When server was stopped couple of WAL files are created: + cd /dbdata/postgres/pg_wal + ls -R .: 000000010000000000000006.00000028.backup 000000010000000000000007 000000010000000000000008 000000010000000000000009 00000001000000000000000A archive_status walg_data ./archive_status: 000000010000000000000006.00000028.backup.done **Created `recovery.signal` and started server** + echo 'recovery_target_name= '\''2023-10-10 14:48:46'\''' + echo 'recovery_target_action= '\''promote'\''' + touch /dbdata/postgres/recovery.signal 2023-10-10 14:48:51.825 UTC [127] database system was interrupted; last known up at 2023-10-10 14:48:46 UTC 2023-10-10 14:48:51.985 UTC [127] LOG: starting point-in-time recovery to "2023-10-10 14:48:46" 2023-10-10 14:48:52.065 UTC [127] LOG: restored log file "000000010000000000000006" from archive 2023-10-10 14:48:52.082 UTC [127] LOG: redo starts at 0/6000028 2023-10-10 14:48:52.141 UTC [127] LOG: consistent recovery state reached at 0/6000100 2023-10-10 14:48:52.141 UTC [124] LOG: database system is ready to accept read-only connections 2023-10-10 14:48:52.188 UTC [127] LOG: redo done at 0/6000100 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.10 s 2023-10-10 14:48:52.188 UTC [127] FATAL: recovery ended before configured recovery target was reached See full logs in `experiment1.txt` attachment. Ok. Some name in the past is not included into the base backup which was done just now. This is expected, so I created the named restore point after base backup. Now logs are: + docker exec test-userdb-restore psql -c 'select pg_create_restore_point( '\''2023-10-10 15:07:37'\'' )' pg_create_restore_point ------------------------- 0/7000090 (1 row) + cat 000000010000000000000006.00000028.backup START WAL LOCATION: 0/6000028 (file 000000010000000000000006) STOP WAL LOCATION: 0/6000100 (file 000000010000000000000006) CHECKPOINT LOCATION: 0/6000060 BACKUP METHOD: streamed BACKUP FROM: primary START TIME: 2023-10-10 15:07:37 UTC LABEL: 2023-10-10 15:07:37.844015 +0000 UTC m=+0.057006433 START TIMELINE: 1 STOP TIME: 2023-10-10 15:07:38 UTC STOP TIMELINE: 1 2023-10-10 15:07:43.451 UTC [128] LOG: starting point-in-time recovery to "2023-10-10 15:07:37" 2023-10-10 15:07:43.525 UTC [128] LOG: restored log file "000000010000000000000006" from archive 2023-10-10 15:07:43.540 UTC [128] LOG: redo starts at 0/6000028 2023-10-10 15:07:43.677 UTC [128] LOG: consistent recovery state reached at 0/6000100 2023-10-10 15:07:43.677 UTC [128] LOG: recovery stopping at restore point "2023-10-10 15:07:37", time 2023-10-10 15:07:38.846288+00 2023-10-10 15:07:43.677 UTC [128] LOG: redo done at 0/7000028 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.13 s 2023-10-10 15:07:43.677 UTC [125] LOG: database system is ready to accept read-only connections 1. Why here (in experiment2.txt) redo done at 0/7000028 when recovery target name "2023-10-10 15:07:37" is at 0/7000090? I suppose 0/7000090 should be included, because `recovery_target_inclusive=true` by default. 2. Is there any way to include a label into a base backup which I can use as `recoverty_target_name`? This is not clear from documentation https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP Is 'label' the name for `recovery_target_name` / `pg_create_restore_point` is called by `pg_backup_start`? 3. Is there any way to get the latest time from a base backup which is reachable and could be used as the value for `recovery_target_time`? As a workaround for XX000 error I inserted one additional record into the database, so a new WAL file is generated. Then I can use the t3 value for `recovery_target_time`. This only works when archive_command/restore_command was configured. But without them it seems I can not use the `recovery_target_time` option. Is this true? Thank you. -- Eugen Konkov DevOps Engineer, Planitar Inc.
Attachment
On Tue, 2023-10-10 at 11:46 -0400, Eugen Konkov wrote: > [wants to avoid > FATAL: recovery ended before configured recovery target was reached > that is issued in v13 and later] > > 1. Why here (in experiment2.txt) redo done at 0/7000028 when recovery > target name "2023-10-10 15:07:37" is at 0/7000090? > I suppose 0/7000090 should be included, because > `recovery_target_inclusive=true` by default. Because there was no transaction at 0/7000090. > 2. Is there any way to include a label into a base backup which I can > use as `recoverty_target_name`? > This is not clear from documentation > https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP > Is 'label' the name for `recovery_target_name` / > `pg_create_restore_point` is called by `pg_backup_start`? No. > 3. Is there any way to get the latest time from a base backup which is > reachable and could be used as the value for `recovery_target_time`? > As a workaround for XX000 error I inserted one additional record into > the database, so a new WAL file is generated. Then I can use the t3 > value for `recovery_target_time`. > This only works when archive_command/restore_command was configured. > But without them it seems I can not use the `recovery_target_time` > option. Is this true? Perhaps you could use the time from the "backup" file in the WAL archive, not sure. But why do you want to do that, if all that you have to do is specify "recovery_target = 'immediate'" to recover to the end of the backup? Yours, Laurenz Albe
>But why do you want to do that, if all that you have to do is specify "recovery_target = 'immediate'" to recover to the end of the backup? Because automation scripts do not know if transactions are available after some point in time or not. But automation scripts know that backup was completed successfully at that point. For example: We want to provide time to recover the database. 1. Base backup restored, wal files are applied successfully if there is a transaction. 2. Base backup restored, wal files are not applied successfully, even if we have the correct wal file after target time. eg. this wal file was created with help: pg_create_restore_point / pg_switch_wal It looks inconsistent, because we can restore save archive by name, but we can not restore it by time, even if this time is less when the named point was created. As workaround we just insert a fake record into database, but this looks very questionable: Why do we need to insert more records into database after successful backup?? On Wed, Oct 11, 2023 at 3:45 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Tue, 2023-10-10 at 11:46 -0400, Eugen Konkov wrote: > > [wants to avoid > > FATAL: recovery ended before configured recovery target was reached > > that is issued in v13 and later] > > > > 1. Why here (in experiment2.txt) redo done at 0/7000028 when recovery > > target name "2023-10-10 15:07:37" is at 0/7000090? > > I suppose 0/7000090 should be included, because > > `recovery_target_inclusive=true` by default. > > Because there was no transaction at 0/7000090. > > > 2. Is there any way to include a label into a base backup which I can > > use as `recoverty_target_name`? > > This is not clear from documentation > > https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP > > Is 'label' the name for `recovery_target_name` / > > `pg_create_restore_point` is called by `pg_backup_start`? > > No. > > > 3. Is there any way to get the latest time from a base backup which is > > reachable and could be used as the value for `recovery_target_time`? > > As a workaround for XX000 error I inserted one additional record into > > the database, so a new WAL file is generated. Then I can use the t3 > > value for `recovery_target_time`. > > This only works when archive_command/restore_command was configured. > > But without them it seems I can not use the `recovery_target_time` > > option. Is this true? > > Perhaps you could use the time from the "backup" file in the WAL archive, > not sure. > > But why do you want to do that, if all that you have to do is specify > "recovery_target = 'immediate'" to recover to the end of the backup? > > Yours, > Laurenz Albe -- Eugen Konkov DevOps Engineer, Planitar Inc. M. 416-276-1715 ekonkove@planitar.com | goiguide.com 560 Parkside Drive, Unit 401 Waterloo, ON, Canada N2L 5Z4
On 10/11/23 09:52, Eugen Konkov wrote:
Doesn't "pg_basebackup --wal-method=stream" already do that?
Since the commands below automagically starts physical streaming, there must be a variation where all the wal data is applied but doesn't start replication.
pg_basebackup --dbname=service=basebackup -D $PGDATA --progress --checkpoint=fast -v \
--write-recovery-conf --wal-method=stream --create-slot --slot=pgstandby1 --compress=server-zstd
pg_ctl start -w
Maybe just remove the two "slot" options:
pg_basebackup --dbname=service=basebackup -D $PGDATA --progress --checkpoint=fast -v \
--write-recovery-conf --wal-method=stream --compress=server-zstd
But why do you want to do that, if all that you have to do is specify"recovery_target = 'immediate'" to recover to the end of the backup? Because automation scripts do not know if transactions are available after some point in time or not. But automation scripts know that backup was completed successfully at that point. For example: We want to provide time to recover the database. 1. Base backup restored, wal files are applied successfully if there is a transaction.
Doesn't "pg_basebackup --wal-method=stream" already do that?
Since the commands below automagically starts physical streaming, there must be a variation where all the wal data is applied but doesn't start replication.
pg_basebackup --dbname=service=basebackup -D $PGDATA --progress --checkpoint=fast -v \
--write-recovery-conf --wal-method=stream --create-slot --slot=pgstandby1 --compress=server-zstd
pg_ctl start -w
Maybe just remove the two "slot" options:
pg_basebackup --dbname=service=basebackup -D $PGDATA --progress --checkpoint=fast -v \
--write-recovery-conf --wal-method=stream --compress=server-zstd
--
Born in Arizona, moved to Babylonia.
Born in Arizona, moved to Babylonia.
On 10/11/23 12:07, Ron wrote:
After that pg_basebackup, do this to get R/W copy of the database:
pg_ctl start
pg_ctl promote
On 10/11/23 09:52, Eugen Konkov wrote:But why do you want to do that, if all that you have to do is specify"recovery_target = 'immediate'" to recover to the end of the backup? Because automation scripts do not know if transactions are available after some point in time or not. But automation scripts know that backup was completed successfully at that point. For example: We want to provide time to recover the database. 1. Base backup restored, wal files are applied successfully if there is a transaction.
Doesn't "pg_basebackup --wal-method=stream" already do that?
Since the commands below automagically starts physical streaming, there must be a variation where all the wal data is applied but doesn't start replication.
pg_basebackup --dbname=service=basebackup -D $PGDATA --progress --checkpoint=fast -v \
--write-recovery-conf --wal-method=stream --create-slot --slot=pgstandby1 --compress=server-zstd
pg_ctl start -w
Maybe just remove the two "slot" options:
pg_basebackup --dbname=service=basebackup -D $PGDATA --progress --checkpoint=fast -v \
--write-recovery-conf --wal-method=stream --compress=server-zstd
After that pg_basebackup, do this to get R/W copy of the database:
pg_ctl start
pg_ctl promote
--
Born in Arizona, moved to Babylonia.
Born in Arizona, moved to Babylonia.