Re: WAL segments removed from primary despite the fact that logical replication slot needs it. - Mailing list pgsql-bugs

Hi,

On Thu, Dec 8, 2022 at 8:13 PM hubert depesz lubaczewski
<depesz@depesz.com> wrote:
>
> Hi,
> just checking - has there been any progress on diagnosing/fixing the
> bug?

Sorry for the late response.

Based on the analysis we did[1][2], I've created a manual scenario
to reproduce this issue, using the attached patch and script.

scenario.md explains the basic steps to reproduce this issue. It
consists of 13 steps (very tricky!!). It's not sophisticated and could
be improved. test.sh is the shell script I used to execute
reproduction steps 1 through 10. In my environment, I could reproduce
this issue with the following steps:

1. apply the patch and build PostgreSQL.
2. run test.sh.
3. execute steps 11 and later as described in scenario.md.

test.sh is a very hacky and dirty script that is tuned for my
environment (in particular, it adds many sleeps). You might need to
adjust it while checking scenario.md.

I've also confirmed that this issue is fixed by the attached patch,
which clears candidate_restart_lsn and friends during
ReplicationSlotRelease().
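
To illustrate the idea of the fix only, here is a minimal standalone
sketch in C. It is not the attached patch and does not use PostgreSQL's
headers: ToySlot and toy_slot_release() are hypothetical names, and the
types are simplified stand-ins. It also assumes that "friends" means the
other candidate_* fields kept alongside candidate_restart_lsn in the
slot (candidate_catalog_xmin, candidate_xmin_lsn,
candidate_restart_valid).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's types (assumption: toy model). */
typedef uint64_t XLogRecPtr;
typedef uint32_t TransactionId;

#define InvalidXLogRecPtr    ((XLogRecPtr) 0)
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical, reduced model of a logical replication slot. */
typedef struct ToySlot
{
    XLogRecPtr    restart_lsn;             /* oldest WAL the slot still needs */
    TransactionId candidate_catalog_xmin;  /* pending catalog xmin */
    XLogRecPtr    candidate_xmin_lsn;      /* LSN at which it becomes valid */
    XLogRecPtr    candidate_restart_valid; /* LSN at which the restart
                                            * candidate becomes valid */
    XLogRecPtr    candidate_restart_lsn;   /* pending restart_lsn */
} ToySlot;

/*
 * Hypothetical release routine: drop every pending candidate so that a
 * later session acquiring the slot cannot act on values computed by the
 * previous one. The confirmed restart_lsn is left untouched.
 */
static void
toy_slot_release(ToySlot *slot)
{
    assert(slot != NULL);

    slot->candidate_catalog_xmin = InvalidTransactionId;
    slot->candidate_xmin_lsn = InvalidXLogRecPtr;
    slot->candidate_restart_valid = InvalidXLogRecPtr;
    slot->candidate_restart_lsn = InvalidXLogRecPtr;
}
```

In this toy model, calling toy_slot_release() on a slot that still
holds candidate values resets all four candidate fields and keeps
restart_lsn as it was.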

[1] https://www.postgresql.org/message-id/CAA4eK1JvyWHzMwhO9jzPquctE_ha6bz3EkB3KE6qQJx63StErQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAD21AoBHMCEDcV4eBtSVvDrCN4SrMXanX5N9%2BL-E%2B4OWXYY0ew%40mail.gmail.com

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

