Re: Corner-case bug in pg_rewind - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Corner-case bug in pg_rewind
Date
Msg-id 1713707e-e318-761c-d287-5b6a4aa807e8@iki.fi
In response to Re: Corner-case bug in pg_rewind  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On 04/12/2020 00:16, Heikki Linnakangas wrote:
> On 03/12/2020 16:10, Heikki Linnakangas wrote:
>> On 02/12/2020 15:26, Ian Barwick wrote:
>>> On 02/12/2020 20:13, Heikki Linnakangas wrote:
>>>> Attached are two patches. The first patch is your original patch, unmodified
>>>> (except for a cosmetic rename of the test file). The second patch builds on
>>>> that, demonstrating and fixing the issue I mentioned. It took me a while to
>>>> create a repro for it, it's easily masked by incidental full-page writes or
>>>> because rows created by XIDs that are not marked as committed on the other
>>>> timeline are invisible, but succeeded at last.
>>>
>>> Aha, many thanks. I wasn't entirely sure what I was looking for there and
>>> recently haven't had the time or energy to dig any further.
>>
>> Ok, pushed and backpatched this now.
> 
> The buildfarm is reporting sporadic failures in the new regression test.
> I suspect it's because of timing issues, where a server is promoted or
> shut down before some data has been replicated. I'll fix that tomorrow
> morning.

Fixed, I hope. Backpatching took me a while, because almost every 
version needed small differences: some helpful TAP test helpers, such 
as the one that waits for a standby to catch up, are not available in 
the back branches.

There was one curious difference between versions 9.6 and 10. In v10, 
you can perform a "clean switchover" like this:

1. Shut down primary (A) with "pg_ctl -m fast".

2. Promote the standby (B) with "pg_ctl promote".

3. Reconfigure the old primary (A) as a standby, by creating 
recovery.conf that points to the promoted server, and start it up.
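For illustration, the three steps above can be sketched as a small Python helper that only builds the pg_ctl command lines and the recovery.conf contents, without running anything. This is a sketch under assumptions, not the setup used in the test: the function name, the data directory paths, the host name and the "replicator" user are all hypothetical, and the choice of recovery.conf settings (standby_mode, primary_conninfo, recovery_target_timeline) is a guess at a typical pre-v12 configuration rather than something stated in this thread.

```python
def clean_switchover_plan(primary_dir, standby_dir, standby_host,
                          standby_port=5432, user="replicator"):
    """Build the commands and recovery.conf for a clean switchover.

    Nothing is executed here; the caller decides how to run the steps.
    All argument values are placeholders chosen for illustration.
    """
    conninfo = f"host={standby_host} port={standby_port} user={user}"

    # recovery.conf for the old primary (A), pointing at the promoted
    # server (B). 'latest' lets A follow B onto the new timeline.
    recovery_conf = (
        "standby_mode = 'on'\n"
        f"primary_conninfo = '{conninfo}'\n"
        "recovery_target_timeline = 'latest'\n"
    )

    commands = [
        # 1. Shut down primary (A) in fast mode.
        ["pg_ctl", "stop", "-D", primary_dir, "-m", "fast"],
        # 2. Promote the standby (B).
        ["pg_ctl", "promote", "-D", standby_dir],
        # 3. After writing recovery.conf into A's data directory,
        #    start A again so it comes up as a standby.
        ["pg_ctl", "start", "-D", primary_dir],
    ]
    return commands, recovery_conf


if __name__ == "__main__":
    cmds, conf = clean_switchover_plan("/data/a", "/data/b", "host-b")
    for cmd in cmds:
        print(" ".join(cmd))
    print(conf)
```

The recovery.conf text has to be written to the old primary's data directory between the second and third command; that step is left to the caller so the sketch stays free of side effects.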

But on 9.6, that leads to an error on the repurposed primary server (A):

LOG:  primary server contains no more WAL on requested timeline 1
LOG:  new timeline 2 forked off current database system timeline 1 before current recovery point 0/30000A0

It's not clear to me why that is. It seems that the primary generates 
some WAL at shutdown that doesn't get replicated, before the shutdown 
happens. Or the standby doesn't replay that WAL before it's promoted. 
But we have supported "clean switchover" since 9.1, see commit 
985bd7d497. When you shut down the primary, it should wait until all the 
WAL has been replicated, including the shutdown checkpoint.
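To make the second log message concrete, here is a simplified Python model of the comparison behind it: a server can only follow a switch to a new timeline if that timeline forked off at or after the point the server has already reached on the old one. This is a sketch of the idea, not PostgreSQL's actual code; the function names are hypothetical, and the fork point 0/3000028 in the usage lines is a made-up value, since the log above only reports the recovery point.

```python
def parse_lsn(text):
    """Convert an LSN written as 'X/Y' (two hex numbers) to an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def can_follow_timeline_switch(fork_point, recovery_point):
    """Return True if the new timeline forked off at or after the
    position this server has already reached on the old timeline."""
    return parse_lsn(fork_point) >= parse_lsn(recovery_point)


if __name__ == "__main__":
    # Hypothetical fork point that lies before the reported recovery
    # point: the old primary holds WAL the promoted standby never had.
    print(can_follow_timeline_switch("0/3000028", "0/30000A0"))
    # Fork point equal to the recovery point: the switchover is clean.
    print(can_follow_timeline_switch("0/30000A0", "0/30000A0"))
```

In this model, the 9.6 failure corresponds to the first case: the old primary (A) has WAL up to 0/30000A0 on timeline 1, but timeline 2 branched off earlier than that.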

Perhaps I was just doing it wrong in the test. Or maybe there's a 
genuine bug there that was fixed in v10. I worked around it in the 
test by re-initializing the old primary as a standby from a backup, 
instead of just reconfiguring it as a standby. That's good enough for 
this particular test, so I'm not planning to dig deeper into it myself.

- Heikki


