Thread: BUG #18119: Failed assert while recoverying from pg_basebackup

BUG #18119: Failed assert while recoverying from pg_basebackup

From: PG Bug reporting form
Date: Wed, 20 Sep 2023 13:12:52 +0000

The following bug has been logged on the website:

Bug reference:      18119
Logged by:          Bowen Shi
Email address:      zxwsbg@qq.com
PostgreSQL version: 16.0
Operating system:   centos
Description:

Dear all,

There may be a problem in recovery. The following steps reliably
reproduce it:

First, run the following on the master, to make sure that we have at
least 20GB of data:
1. create table t(a int);
2. echo "insert into t select generate_series(1,5000);">script.sql
3. pgbench --no-vacuum --client=25 -U postgres --transactions=10000 --file script.sql
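As a rough sanity check on the "at least 20GB" target, the load above can be sized with shell arithmetic. This is only a sketch: the figure of 36 bytes per row is an assumption (a 24-byte tuple header plus a 4-byte int, padded to 8-byte alignment, plus a 4-byte line pointer), and it ignores page headers and free space. It relies on pgbench's --transactions being counted per client.

```shell
# Sizing sketch for the load step; values are taken from the pgbench command.
clients=25
tx_per_client=10000     # --transactions is per client
rows_per_tx=5000        # generate_series(1,5000)
bytes_per_row=36        # ASSUMPTION: approximate on-disk cost of one int row

total_rows=$((clients * tx_per_client * rows_per_tx))
approx_gib=$((total_rows * bytes_per_row / 1024 / 1024 / 1024))

echo "rows inserted: $total_rows"
echo "approximate heap size: ${approx_gib} GiB"
```

Under that assumption the table should end up well above the 20GB mark.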

Second, take a base backup with pg_basebackup in stream mode:
1. pg_basebackup --checkpoint=fast -h localhost -U postgres -p 5432 -Xs -Ft -v -P -D /data2/sqpg/inst/data_b
2. pgbench --no-vacuum --client=25 -U postgres --transactions=3000 --file script.sql
   (Run this concurrently with pg_basebackup.)
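The two commands in this step have to overlap in time. One way to script that, as a minimal sketch, is a small helper that starts both in the background and waits for each. The name run_concurrently is hypothetical and not part of any PostgreSQL tool.

```shell
# Hypothetical helper: run two command strings at the same time and
# return non-zero if either of them fails.
run_concurrently() {
    backup_cmd=$1
    load_cmd=$2

    sh -c "$backup_cmd" &
    backup_pid=$!
    sh -c "$load_cmd" &
    load_pid=$!

    status=0
    wait "$backup_pid" || status=1
    wait "$load_pid" || status=1
    return "$status"
}

# Usage with the commands from this report (needs a running server):
# run_concurrently \
#   "pg_basebackup --checkpoint=fast -h localhost -U postgres -p 5432 -Xs -Ft -v -P -D /data2/sqpg/inst/data_b" \
#   "pgbench --no-vacuum --client=25 -U postgres --transactions=3000 --file script.sql"
```

Starting both from one script removes the need to switch terminals while the backup is still running.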

Third, start the backup instance:
cd /data2/sqpg/inst/data_b
tar xvf base.tar
mv pg_wal.tar pg_wal/
cd pg_wal
tar xvf pg_wal.tar
cd ../
echo "port=5433">>postgresql.conf
echo "log_min_messages=debug1">>postgresql.conf
echo "checkpoint_timeout=30s">>postgresql.conf
cd /data2/sqpg/inst/bin
./pg_ctl start -D ../data_b -l logfile_b
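The unpacking part of these commands can be wrapped in one function, sketched below. It extracts pg_wal.tar directly into pg_wal/ with tar's -C option instead of moving the archive first, which should give the same layout. The function name unpack_backup is made up for illustration. Note that, like the commands above, it creates no recovery.signal or standby.signal.

```shell
# Hypothetical helper: unpack a tar-format base backup in place.
# Expects base.tar and pg_wal.tar in the given directory.
unpack_backup() {
    backup_dir=$1
    (
        cd "$backup_dir" &&
        tar xf base.tar &&
        mkdir -p pg_wal &&              # harmless if base.tar already created it
        tar xf pg_wal.tar -C pg_wal     # WAL segments go under pg_wal/
    )
}

# Usage with the path from this report:
# unpack_backup /data2/sqpg/inst/data_b
```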

This problem was first found in PG 15; see the following link for details:

https://www.postgresql.org/message-id/flat/20230227.120101.1600358770821352577.horikyota.ntt%40gmail.com#65fe5aea5862c8196a4ade348c71fde9

And it still exists in PG 16.

Bowen Shi


Re: BUG #18119: Failed assert while recoverying from pg_basebackup

From: Michael Paquier

On Wed, Sep 20, 2023 at 01:12:52PM +0000, PG Bug reporting form wrote:
> This problem was first found in PG 15, see following link for details:
>
> https://www.postgresql.org/message-id/flat/20230227.120101.1600358770821352577.horikyota.ntt%40gmail.com#65fe5aea5862c8196a4ade348c71fde9
>
> And it still exists in PG 16.

Missing a recovery.signal prevents the startup process from
initializing some structures that may be required once consistency is
reached, in this case some of the facility for standby snapshots.
I've sent a patch about that, as well:
https://commitfest.postgresql.org/44/4244/
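To make the condition concrete, here is a small sketch (not part of the patch) that checks whether an unpacked backup directory carries a signal file before it is started. The function name check_signal_files is hypothetical. standby.signal is checked as well, on the assumption that it also puts the startup process into archive recovery.

```shell
# Hypothetical pre-start check: report whether a data directory contains
# recovery.signal or standby.signal.
check_signal_files() {
    datadir=$1
    if [ -f "$datadir/recovery.signal" ] || [ -f "$datadir/standby.signal" ]; then
        echo "signal file present in $datadir"
        return 0
    fi
    echo "no recovery.signal or standby.signal in $datadir"
    return 1
}

# Usage with the directory from the report:
# check_signal_files /data2/sqpg/inst/data_b
```

With the reproduction steps as reported, this check would fail, which is the situation described above.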

The problem is much older than the addition of checkpoints run during
crash recovery.
--
Michael
