Re: pg15b3: recovery fails with wal prefetch enabled - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: pg15b3: recovery fails with wal prefetch enabled
Msg-id CA+hUKGJXfQJHs2jmVOoOo2J12-6m36E0ytDiyrqp-EvFwupvew@mail.gmail.com
In response to Re: pg15b3: recovery fails with wal prefetch enabled  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: pg15b3: recovery fails with wal prefetch enabled
List pgsql-hackers
On Mon, Sep 5, 2022 at 1:28 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> I had this more or less figured out on Friday when I wrote last, but I
> got stuck on a weird problem with 026_overwrite_contrecord.pl.  I
> think that failure case should report an error, no?  I find it strange
> that we end recovery in silence.  That was a problem for the new
> coding in this patch, because it is confused by XLREAD_FAIL without
> queuing an error, and then retries, which clobbers the aborted recptr
> state.  I'm still looking into that.

On reflection, it'd be better not to clobber any pre-existing error
there, but report one only if there isn't one already queued.  I've
done that in this version, which I'm planning to do a bit more testing
on and commit soonish if there are no comments/objections, especially
for that part.
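
For anyone skimming, here is a minimal sketch of the "report one only if
there isn't one already queued" rule. It is not the patch: ReaderSketch
and queue_error_if_none() are hypothetical names made up for
illustration, standing in for the reader's real error state.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a WAL reader's error state.  This is an
 * illustration only, not PostgreSQL's XLogReaderState.
 */
typedef struct ReaderSketch
{
    bool        error_queued;   /* is a message already pending? */
    char        errormsg[256];  /* the pending message, if any */
} ReaderSketch;

/*
 * Queue an error message only if none is pending, so that a more
 * specific earlier message is never clobbered by a later generic one.
 * Returns true if the new message was stored, false if an existing
 * message was kept.
 */
static bool
queue_error_if_none(ReaderSketch *reader, const char *fmt, ...)
{
    va_list     args;

    if (reader->error_queued)
        return false;           /* keep the pre-existing error */

    va_start(args, fmt);
    vsnprintf(reader->errormsg, sizeof(reader->errormsg), fmt, args);
    va_end(args);
    reader->error_queued = true;
    return true;
}
```

With this shape, a failure path that reaches the caller without a
message can still queue a generic one, while a path that already queued
something specific keeps it.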

I'll have to check whether a doc change is necessary somewhere to
advertise that maintenance_io_concurrency=0 turns off prefetching, but
IIRC that's kinda already implied.

I've tested quite a lot of scenarios, including make check-world with
maintenance_io_concurrency = 0, 1, 10, 1000, and ALTER SYSTEM for all
relevant GUCs on a standby running a large pgbench, to check the
expected effect on the pg_stat_recovery_prefetch view and on the
generated system calls.

Attachment
