Re: could not link file in wal restore lines - Mailing list pgsql-bugs

From Kyotaro Horiguchi
Subject Re: could not link file in wal restore lines
Msg-id 20220725.171132.2272594383346737093.horikyota.ntt@gmail.com
In response to Re: could not link file in wal restore lines  (Michael Paquier <michael@paquier.xyz>)
Responses Re: could not link file in wal restore lines  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-bugs
At Sat, 23 Jul 2022 12:36:47 +0900, Michael Paquier <michael@paquier.xyz> wrote in 
> FWIW, the backend code has protections to prevent *exactly* this kind
> of problems when recycling WAL segment files at checkpoints with a set
> of LWLocks taken on the control file, for one.  Perhaps you have
> messed up things and you have finished in such a state that backrest
> writes to pg_wal/ concurrently with a cluster running and running a
> checkpoint, which would explain those link() calls to be failing?

That lock doesn't seem to exclude recovery.

I can reproduce this with the following script (see below) when a sleep
is added before (or after) the durable_link_or_rename() call in
InstallXLogFileSegment() (attached).  Some adjustment might be required
to reproduce it in other environments.

=====
2022-07-25 17:05:57.730 JST [151758] LOG:  restored log file "000000010000000000000057" from archive
2022-07-25 17:05:57.760 JST [151758] LOG:  restored log file "000000010000000000000058" from archive
2022-07-25 17:05:57.782 JST [151758] LOG:  restored log file "000000010000000000000059" from archive
2022-07-25 17:05:57.790 JST [151762] LOG:  could not link file "pg_wal/000000010000000000000002" to "pg_wal/000000010000000000000059": File exists
2022-07-25 17:05:57.802 JST [151758] LOG:  restored log file "00000001000000000000005A" from archive
2022-07-25 17:05:58.294 JST [151762] LOG:  could not link file "pg_wal/000000010000000000000003" to "pg_wal/00000001000000000000005A": File exists

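For reference, "File exists" is what link(2) reports when the
destination name is already present; it never overwrites.  A minimal
sketch of that behaviour outside PostgreSQL, assuming a scratch
directory from mktemp and a plain ln without -f (the file names only
mimic segment names and are purely illustrative):

```shell
#!/bin/bash
# Illustration only: a hard link cannot be created over an existing name.
# The scratch directory and file names are hypothetical stand-ins.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# An old segment waiting to be recycled under a future name.
echo old > 000000010000000000000002
# Meanwhile, another process has already created that future name.
echo restored > 000000010000000000000059

# Without -f, ln refuses to replace the existing destination.
if ln 000000010000000000000002 000000010000000000000059 2>ln.err; then
    link_result=linked
else
    link_result=failed
fi
echo "link attempt: $link_result"
cat ln.err
```

The destination keeps its original contents, so in this sketch the
failed link is harmless by itself; it only shows where the message
comes from.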


========
#! /bin/bash

# create a backup-source
PGDATA=~/test/data
PGARC=~/test/arc
BKDIR=~/test/bk
CPDATA=~/test/dt

rm -f /tmp/hoge
rm -rf $PGDATA $PGARC $BKDIR $CPDATA
mkdir -p $PGARC
killall -9 postgres

initdb -D $PGDATA
echo "archive_mode=on" >> $PGDATA/postgresql.conf
echo "archive_command = 'cp %p $PGARC/%f'" >> $PGDATA/postgresql.conf

# start the source
pg_ctl -D $PGDATA start

# take a backup
pg_basebackup -D $BKDIR
echo "archive_mode=off" >> $BKDIR/postgresql.conf
echo "restore_command='cp $PGARC/%f %p'" >> $BKDIR/postgresql.conf
touch $BKDIR/recovery.signal

# create archived segments
psql -c 'create table t (a int)'
for i in $(seq 1 100); do psql -c 'insert into t values(0); select pg_switch_wal()'; done

# stop the source
pg_ctl -D $PGDATA stop

# start recovery
rm -rf $CPDATA
cp -r $BKDIR $CPDATA
touch /tmp/hoge
postgres -D $CPDATA 2>&1 | tee recovery.log
======
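
To check whether a run hit the problem, counting the failures in
recovery.log (the file written by tee in the script) should be enough.
A small sketch; the function name is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: count "could not link file" messages in a log file.
# grep -c prints the number of matching lines (0 when there are none).
count_link_failures() {
    grep -c 'could not link file' "$1"
}

# Example use after a run of the script:
#   count_link_failures recovery.log
```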

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


Attachment
