Home > mailing lists

Re: WIP: WAL prefetch (another approach) - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Re: WIP: WAL prefetch (another approach)
Date	April 26, 2022 09:11:20
Msg-id	CA+hUKGJE6K88=ifPd-2Lt8QJrnS4H+u-btoVCP+i=0Qv=9D2Eg@mail.gmail.com Whole thread Raw
In response to	Re: WIP: WAL prefetch (another approach) (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: WIP: WAL prefetch (another approach) (Thomas Munro <thomas.munro@gmail.com>)
List	pgsql-hackers

Tree view

On Tue, Apr 26, 2022 at 6:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I believe that the WAL prefetch patch probably accounts for the
> intermittent errors that buildfarm member topminnow has shown
> since it went in, eg [1]:
>
> diff -U3 /home/nm/ext4/HEAD/pgsql/contrib/pg_walinspect/expected/pg_walinspect.out
/home/nm/ext4/HEAD/pgsql.build/contrib/pg_walinspect/results/pg_walinspect.out

Hmm, maybe but I suspect not.  I think I might see what's happening here.

> +ERROR:  could not read WAL at 0/1903E40

> I've reproduced this manually on that machine, and confirmed that the
> proximate cause is that XLogNextRecord() is returning NULL because
> state->decode_queue_head == NULL, without bothering to provide an errormsg
> (which doesn't seem very well thought out in itself).  I obtained the

Thanks for doing that.  After several hours of trying I also managed
to reproduce it on that gcc23 system (not at all sure why it doesn't
show up elsewhere; MIPS 32 bit layout may be a factor), and added some
trace to get some more clues.  Still looking into it, but here is the
current hypothesis I'm testing:

1.  The reason there's a messageless ERROR in this case is because
there is new read_page callback logic introduced for pg_walinspect,
called via read_local_xlog_page_no_wait(), which is like the old
read_local_xlog_page() except that it returns -1 if you try to read
past the current "flushed" LSN, and we have no queued message.  An
error is then reported by XLogReadRecord(), and appears to the user.

2.  The reason pg_walinspect tries to read WAL data past the flushed
LSN is because its GetWALRecordsInfo() function keeps calling
XLogReadRecord() until EndRecPtr >= end_lsn, where end_lsn is taken
from a snapshot of the flushed LSN, but I don't see where it takes
into account that the flushed LSN might momentarily fall in the middle
of a record.  In that case, xlogreader.c will try to read the next
page, which fails because it's past the flushed LSN (see point 1).

I will poke some more tomorrow to try to confirm this and try to come
up with a fix.

pgsql-hackers by date:

From: "osumi.takamichi@fujitsu.com"
Date: 26 April 2022, 09:02:46
Subject: RE: Skipping schema changes in publication

From: Laurenz Albe
Date: 26 April 2022, 09:26:59
Subject: Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication

Re: WIP: WAL prefetch (another approach) - Mailing list pgsql-hackers

Previous

Next