Re: prevent immature WAL streaming - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: prevent immature WAL streaming
Date
Msg-id 202111232040.fzyfnrdwtxu6@alvherre.pgsql
In response to Re: prevent immature WAL streaming  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: prevent immature WAL streaming
List pgsql-hackers
On 2021-Nov-23, Tom Lane wrote:

> We're *still* not out of the woods with 026_overwrite_contrecord.pl,
> as we are continuing to see occasional "mismatching overwritten LSN"
> failures, further down in the test where it tries to start up the
> standby:

Augh.

> Looking at adjacent successful runs, it seems that the exact point
> where the "missing contrecord" starts varies substantially, even after
> our previous fix to disable autovacuum in this test.  How could that be?

Well, some of that variability is intentional.  Maybe not as much as one
would wish, but I expect it explains why that point is not always the
same.

> It's probably for the best though, because I think this is exposing
> an actual bug that we would not have seen if the start point were
> completely consistent.  I have not dug into the code, but it looks to
> me like if the "consistent recovery state" is reached exactly at a
> page boundary (0/1FFE000 in all these cases), then the standby expects
> that to be what the OVERWRITE_CONTRECORD record will point at.  But
> actually it points to the first WAL record on that page, resulting
> in a bogus failure.

So what is happening is that we set state->overwrittenRecPtr to the LSN
of page start, ignoring the page header.  Is that the LSN of the first
record in a page?  I'll see if I can reproduce the problem.
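
A minimal sketch of the arithmetic behind that question, not code from the
tree.  It assumes a stock build: 8 kB WAL pages, 16 MB segments, a 24-byte
short page header and a 40-byte long header on the first page of each
segment.  The names WalPtr, page_start and first_record_on_page are
hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed defaults of a stock 64-bit build; not read from any server. */
#define WAL_PAGE_SIZE     UINT64_C(8192)      /* WAL block size */
#define WAL_SEGMENT_SIZE  UINT64_C(16777216)  /* 16 MB segment */
#define SHORT_PAGE_HEADER UINT64_C(24)        /* header on ordinary pages */
#define LONG_PAGE_HEADER  UINT64_C(40)        /* header on a segment's first page */

static_assert(WAL_SEGMENT_SIZE % WAL_PAGE_SIZE == 0,
              "a segment must hold a whole number of pages");

/* Hypothetical stand-in for a 64-bit WAL position. */
typedef uint64_t WalPtr;

/* Start of the WAL page that contains lsn. */
static WalPtr
page_start(WalPtr lsn)
{
    return lsn - (lsn % WAL_PAGE_SIZE);
}

/*
 * Earliest position on lsn's page where a record can begin, i.e. the
 * page start plus the page header.  This ignores any continuation data
 * that may follow the header.
 */
static WalPtr
first_record_on_page(WalPtr lsn)
{
    WalPtr   start = page_start(lsn);
    uint64_t header = (start % WAL_SEGMENT_SIZE == 0)
        ? LONG_PAGE_HEADER
        : SHORT_PAGE_HEADER;

    return start + header;
}
```

Under those assumptions, first_record_on_page(0x1FFE000) comes out as
0x1FFE018, that is 0/1FFE018 rather than 0/1FFE000, so a pointer to the
page start and a pointer to the first record on that page would not
compare equal.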

-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"The person who did not want to sin / was obliged to sit
 on hard and steep chairs           / devoid, certainly,
 of soft extenuations"                           (Patricio Vogel)


