Thread: Assertion failure in SnapBuildInitialSnapshot()

Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

10 November 2022, 10:34:40

Hi,

Thomas has reported this failure in an email [1] and shared the
following links offlist with me:
https://cirrus-ci.com/task/5311549010083840

https://api.cirrus-ci.com/v1/artifact/task/5311549010083840/testrun/build/testrun/subscription/100_bugs/log/100_bugs_twoways.log

https://api.cirrus-ci.com/v1/artifact/task/5311549010083840/crashlog/crashlog-postgres.exe_1c40_2022-11-08_00-20-28-110.txt

The call stack is as follows:
00000063`4edff670 00007ff7`1922fcdf     postgres!ExceptionalCondition(
char * conditionName = 0x00007ff7`198f8050
"TransactionIdPrecedesOrEquals(safeXid, snap->xmin)",
char * fileName = 0x00007ff7`198f8020
"../src/backend/replication/logical/snapbuild.c",
int lineNumber = 0n600)+0x78 [c:\cirrus\src\backend\utils\error\assert.c @ 67]
00000063`4edff6b0 00007ff7`192106df     postgres!SnapBuildInitialSnapshot(
struct SnapBuild * builder = 0x00000251`5b95bce8)+0x20f
[c:\cirrus\src\backend\replication\logical\snapbuild.c @ 600]
00000063`4edff730 00007ff7`1920d9f6     postgres!CreateReplicationSlot(
struct CreateReplicationSlotCmd * cmd = 0x00000251`5b94d828)+0x40f
[c:\cirrus\src\backend\replication\walsender.c @ 1152]
00000063`4edff870 00007ff7`192bc9c4     postgres!exec_replication_command(
char * cmd_string = 0x00000251`5b94ac68 "CREATE_REPLICATION_SLOT
"pg_16400_sync_16392_7163433409941550636" LOGICAL pgoutput (SNAPSHOT
'use')")+0x4a6 [c:\cirrus\src\backend\replication\walsender.c @ 1804]


As per my investigation based on the above logs, the failed test is
due to the following command in 100_bugs.pl:
$node_twoways->safe_psql('d2',
    "CREATE SUBSCRIPTION testsub CONNECTION \$\$"
  . $node_twoways->connstr('d1')
  . "\$\$ PUBLICATION testpub WITH (create_slot=false, "
  . "slot_name='testslot')");

It failed while creating the table sync slot.

The failure happens because the xmin computed by the snap builder is
lesser than what is computed by GetOldestSafeDecodingTransactionId()
during initial snapshot creation for the tablesync slot by
SnapBuildInitialSnapshot.

To investigate, I tried to study how the values of "safeXid" and
"snap->xmin" are computed in SnapBuildInitialSnapshot(). There appear
to be four places in the code where we assign value to xmin
(builder-xmin) during the snapshot building process and then we assign
the same to snap->xmin. Those places are: (a) Two places in
SnapBuildFindSnapshot(), (b) One place in SnapBuildRestore(), and (c)
One place in SnapBuildProcessRunningXacts()

Seeing the LOGS, it appears to me that we find a consistent point from
the below code in SnapBuildFindSnapshot() and the following line
assigns builder->xmin.

...
if (running->oldestRunningXid == running->nextXid)
{
...
builder->xmin = running->nextXid;

The reason is we only see "logical decoding found consistent point at
..." in LOGs. If SnapBuildFindSnapshot() has to go through the entire
machinery of snapshot building then, we should have seen "logical
decoding found initial starting point at ..." and similar other LOGs.
The builder->xmin can't be assigned from any other place in (b) or
(c). From (c), it can't be assigned because we are building a full
snapshot here which won't restore any serialized snapshot. It can't be
assigned from (b) because while creating a slot we stop as soon as we
find the consistent point, see
DecodingContextFindStartpoint()->DecodingContextReady()

In the above code snippet, the running->nextXid in the above code
snippet has been assigned from ShmemVariableCache->nextXid in
GetRunningTransactionData().

The safeXid computed from GetOldestSafeDecodingTransactionId() uses
ShmemVariableCache->nextXid as the starting point and keeps the slot's
xmin as the safe Xid limit.

It seems to me due to SnapBuilder's initial_xmin_horizon, we won't set
(SnapBuilder's) xmin lesser than slot's effective_xmin computed in
CreateInitDecodingContext(). See SnapBuildFindSnapshot(). So, ideally,
SnapBuildInitialSnapshot() should never compute safeXid which is based
on the minimum of all slot's effective_xmin to be greater than
SnapBuilder's xmin (or snapshot's xmin). In short, the code as written
seems correct to me.

I have also tried to analyze if any recent commit (7f13ac8123) could
cause this problem but can't think of any reason because the changes
are related to the restart of decoding and the failed test is related
to creating a slot the very first time.

I don't have any good ideas on how to proceed with this. Any thoughts
on this would be helpful?

Note: I have briefly discussed this issue with Sawada-San and
Kuroda-San, so kept them in Cc.

[1] -
https://www.postgresql.org/message-id/CA%2BhUKG%2BA_LyW%3DFAOi2ebA9Vr-1%3Desu%2BeBSm0dsVhU%3DEgfpipkg%40mail.gmail.com

--
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Andres Freund

Date:

15 November 2022, 01:25:31

Hi,

On 2022-11-10 16:04:40 +0530, Amit Kapila wrote:
> I don't have any good ideas on how to proceed with this. Any thoughts
> on this would be helpful?

One thing worth doing might be to convert the assertion path into an elog(),
mentioning both xids (or add a framework for things like AssertLT(), but that
seems hard). With the concrete values we could make a better guess at what's
going wrong.

It'd probably not hurt to just perform this check independent of
USE_ASSERT_CHECKING - compared to the cost of creating a slot it's neglegible.

That'll obviously only help us whenever we re-encounter the issue, which will
likely be a while...

Have you tried reproducing the issue by running the test in a loop?

One thing I noticed just now is that we don't assert
builder->building_full_snapshot==true. I think we should? That didn't use to
be an option, but now it is... It doesn't look to me like that's the issue,
but ...

Hm, also, shouldn't the patch adding CRS_USE_SNAPSHOT have copied more of
SnapBuildExportSnapshot()? Why aren't the error checks for
SnapBuildExportSnapshot() needed? Why don't we need to set XactReadOnly? Which
transactions are we even in when we import the snapshot (cf.
SnapBuildExportSnapshot() doing a StartTransactionCommand()).

I'm also somewhat suspicious of calling RestoreTransactionSnapshot() with
source=MyProc. Looks like it works, but it'd be pretty easy to screw up, and
there's no comments in SetTransactionSnapshot() or
ProcArrayInstallRestoredXmin() warning that that might be the case.

Greetings,

Andres Freund

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Andres Freund

Date:

15 November 2022, 02:38:37

Hi,

On 2022-11-14 17:25:31 -0800, Andres Freund wrote:
> Hm, also, shouldn't the patch adding CRS_USE_SNAPSHOT have copied more of
> SnapBuildExportSnapshot()? Why aren't the error checks for
> SnapBuildExportSnapshot() needed? Why don't we need to set XactReadOnly? Which
> transactions are we even in when we import the snapshot (cf.
> SnapBuildExportSnapshot() doing a StartTransactionCommand()).

Most of the checks for that are in CreateReplicationSlot() - but not al,
e.g. XactReadOnly isn't set, nor do we enforce in an obvious place that we
don't already hold a snapshot.

I first thought this might directly explain the problem, due to the
MyProc->xmin assignment in SnapBuildInitialSnapshot() overwriting a value that
could influence the return value for GetOldestSafeDecodingTransactionId(). But
that happens later, and we check that MyProc->xmin is invalid at the start.

But it still seems supicious. This will e.g. influence whether
StartupDecodingContext() sets PROC_IN_LOGICAL_DECODING. Which probably is
fine, but...

Another theory: I dimly remember Thomas mentioning that there's some different
behaviour of xlogreader during shutdown as part of the v15 changes. I don't
quite remember what the scenario leading up to that was. Thomas?

It's certainly interesting that we see stuff like:

2022-11-08 00:20:23.255 GMT [2012][walsender] [pg_16400_sync_16395_7163433409941550636][8/0:0] ERROR:  could not find
recordwhile sending logically-decoded data: missing contrecord at 0/1D3B710

2022-11-08 00:20:23.255 GMT [2012][walsender] [pg_16400_sync_16395_7163433409941550636][8/0:0] STATEMENT:
START_REPLICATIONSLOT "pg_16400_sync_16395_7163433409941550636" LOGICAL 0/1D2B650 (proto_version '3', origin 'any',
publication_names'"testpub"')

ERROR:  could not find record while sending logically-decoded data: missing contrecord at 0/1D3B710
2022-11-08 00:20:23.255 GMT [248][logical replication worker] ERROR:  error while shutting down streaming COPY: ERROR:
couldnot find record while sending logically-decoded data: missing contrecord at 0/1D3B710

It could entirely be caused by postmaster slowly killing processes after the
assertion failure and that that is corrupting shared memory state though. But
it might also be related.

Greetings,

Andres Freund

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Thomas Munro

Date:

15 November 2022, 03:26:50

On Tue, Nov 15, 2022 at 3:38 PM Andres Freund <andres@anarazel.de> wrote:
> On 2022-11-14 17:25:31 -0800, Andres Freund wrote:
> Another theory: I dimly remember Thomas mentioning that there's some different
> behaviour of xlogreader during shutdown as part of the v15 changes. I don't
> quite remember what the scenario leading up to that was. Thomas?

Yeah.  So as mentioned in:

https://www.postgresql.org/message-id/CA%2BhUKG%2BWKsZpdoryeqM8_rk5uQPCqS2HGY92WiMGFsK2wVkcig%40mail.gmail.com

I still have on my list to remove a new "missing contrecord" error
that can show up in a couple of different scenarios that aren't
exactly error conditions depending on how you think about it, but I
haven't done that yet.  I am not currently aware of anything bad
happening because of those messages, but I could be wrong.

> It's certainly interesting that we see stuff like:
>
> 2022-11-08 00:20:23.255 GMT [2012][walsender] [pg_16400_sync_16395_7163433409941550636][8/0:0] ERROR:  could not find
recordwhile sending logically-decoded data: missing contrecord at 0/1D3B710

> 2022-11-08 00:20:23.255 GMT [2012][walsender] [pg_16400_sync_16395_7163433409941550636][8/0:0] STATEMENT:
START_REPLICATIONSLOT "pg_16400_sync_16395_7163433409941550636" LOGICAL 0/1D2B650 (proto_version '3', origin 'any',
publication_names'"testpub"')

> ERROR:  could not find record while sending logically-decoded data: missing contrecord at 0/1D3B710
> 2022-11-08 00:20:23.255 GMT [248][logical replication worker] ERROR:  error while shutting down streaming COPY:
ERROR: could not find record while sending logically-decoded data: missing contrecord at 0/1D3B710

Right, so that might fit the case described in my email above:
logical_read_xlog_page() notices that it has been asked to shut down
when it is between reads of pages with a spanning contrecord.  Before,
it would fail silently, so XLogReadRecord() returns NULL without
setting *errmsg, but now it complains about a missing contrecord.  In
the case where it was showing up on that other thread, just a few
machines often seemed to log that error when shutting down --
peripatus for example -- I don't know why, but I assume something to
do with shutdown timing and page alignment.

> It could entirely be caused by postmaster slowly killing processes after the
> assertion failure and that that is corrupting shared memory state though. But
> it might also be related.

Hmm.

RE: Assertion failure in SnapBuildInitialSnapshot()

From

"Takamichi Osumi (Fujitsu)"

Date:

15 November 2022, 05:15:48

Hi,


On Tuesday, November 15, 2022 10:26 AM Andres Freund <andres@anarazel.de> wrote:
> On 2022-11-10 16:04:40 +0530, Amit Kapila wrote:
> > I don't have any good ideas on how to proceed with this. Any thoughts
> > on this would be helpful?
>
> One thing worth doing might be to convert the assertion path into an elog(),
> mentioning both xids (or add a framework for things like AssertLT(), but that
> seems hard). With the concrete values we could make a better guess at
> what's going wrong.
>
> It'd probably not hurt to just perform this check independent of
> USE_ASSERT_CHECKING - compared to the cost of creating a slot it's
> neglegible.
>
> That'll obviously only help us whenever we re-encounter the issue, which will
> likely be a while...
>
>
> Have you tried reproducing the issue by running the test in a loop?
Just FYI, I've tried to reproduce this failure in a loop,
but I couldn't hit the same assertion failure.


I extracted the #16643 of 100_bugs.pl only and
executed the tests more than 500 times.


My env and test was done in rhel7.9 and gcc 4.8 with configure option of
--enable-cassert --enable-debug --enable-tap-tests --with-icu CFLAGS=-O0 and prefix.



Best Regards,
    Takamichi Osumi

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

15 November 2022, 10:50:00

On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2022-11-14 17:25:31 -0800, Andres Freund wrote:
> > Hm, also, shouldn't the patch adding CRS_USE_SNAPSHOT have copied more of
> > SnapBuildExportSnapshot()? Why aren't the error checks for
> > SnapBuildExportSnapshot() needed? Why don't we need to set XactReadOnly? Which
> > transactions are we even in when we import the snapshot (cf.
> > SnapBuildExportSnapshot() doing a StartTransactionCommand()).
>
> Most of the checks for that are in CreateReplicationSlot() - but not al,
> e.g. XactReadOnly isn't set,
>

Yeah, I think we can add the check for XactReadOnly along with other
checks in CreateReplicationSlot().

> nor do we enforce in an obvious place that we
> don't already hold a snapshot.
>

We have a check for (FirstXactSnapshot == NULL) in
RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
sufficient?

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

15 November 2022, 11:51:44

On Tue, Nov 15, 2022 at 6:55 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2022-11-10 16:04:40 +0530, Amit Kapila wrote:
> > I don't have any good ideas on how to proceed with this. Any thoughts
> > on this would be helpful?
>
> One thing worth doing might be to convert the assertion path into an elog(),
> mentioning both xids (or add a framework for things like AssertLT(), but that
> seems hard). With the concrete values we could make a better guess at what's
> going wrong.
>
> It'd probably not hurt to just perform this check independent of
> USE_ASSERT_CHECKING - compared to the cost of creating a slot it's neglegible.
>
> That'll obviously only help us whenever we re-encounter the issue, which will
> likely be a while...
>

Agreed.

>
>
> One thing I noticed just now is that we don't assert
> builder->building_full_snapshot==true. I think we should? That didn't use to
> be an option, but now it is... It doesn't look to me like that's the issue,
> but ...
>

Agreed.

The attached patch contains both changes. It seems to me this issue
can happen, if somehow, either slot's effective_xmin increased after
we assign its initial value in CreateInitDecodingContext() or somehow
its value is InvalidTransactionId when we have invoked
SnapBuildInitialSnapshot(). The other possibility is that the
initial_xmin_horizon check in SnapBuildFindSnapshot() doesn't insulate
us from assigning builder->xmin value older than initial_xmin_horizon.
I am not able to see if any of this can be true.

-- 
With Regards,
Amit Kapila.

Attachment

improve_checks_snapbuild_1.patch

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Andres Freund

Date:

16 November 2022, 02:00:48

Hi,

On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:
> On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:
> > nor do we enforce in an obvious place that we
> > don't already hold a snapshot.
> >
> 
> We have a check for (FirstXactSnapshot == NULL) in
> RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
> sufficient?

I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
still be bad. And I think checking in SetTransactionSnapshot() is too late,
we've already overwritten MyProc->xmin by that point.


On 2022-11-15 17:21:44 +0530, Amit Kapila wrote:
> > One thing I noticed just now is that we don't assert
> > builder->building_full_snapshot==true. I think we should? That didn't use to
> > be an option, but now it is... It doesn't look to me like that's the issue,
> > but ...
> >
> 
> Agreed.
> 
> The attached patch contains both changes. It seems to me this issue
> can happen, if somehow, either slot's effective_xmin increased after
> we assign its initial value in CreateInitDecodingContext() or somehow
> its value is InvalidTransactionId when we have invoked
> SnapBuildInitialSnapshot(). The other possibility is that the
> initial_xmin_horizon check in SnapBuildFindSnapshot() doesn't insulate
> us from assigning builder->xmin value older than initial_xmin_horizon.
> I am not able to see if any of this can be true.

Yea, I don't immediately see anything either. Given the discussion in
https://www.postgresql.org/message-id/Yz2hivgyjS1RfMKs%40depesz.com I am
starting to wonder if we've introduced a race in the slot machinery.


> diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
> index 5006a5c464..e85c75e0e6 100644
> --- a/src/backend/replication/logical/snapbuild.c
> +++ b/src/backend/replication/logical/snapbuild.c
> @@ -566,11 +566,13 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
>  {
>      Snapshot    snap;
>      TransactionId xid;
> +    TransactionId safeXid;
>      TransactionId *newxip;
>      int            newxcnt = 0;
>  
>      Assert(!FirstSnapshotSet);
>      Assert(XactIsoLevel == XACT_REPEATABLE_READ);
> +    Assert(builder->building_full_snapshot);
>  
>      if (builder->state != SNAPBUILD_CONSISTENT)
>          elog(ERROR, "cannot build an initial slot snapshot before reaching a consistent state");
> @@ -589,17 +591,13 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
>       * mechanism. Due to that we can do this without locks, we're only
>       * changing our own value.
>       */

Perhaps add something like "Creating a snapshot is expensive and an unenforced
xmin horizon would have bad consequences, therefore always double-check that
the horizon is enforced"?


> -#ifdef USE_ASSERT_CHECKING
> -    {
> -        TransactionId safeXid;
> -
> -        LWLockAcquire(ProcArrayLock, LW_SHARED);
> -        safeXid = GetOldestSafeDecodingTransactionId(false);
> -        LWLockRelease(ProcArrayLock);
> +    LWLockAcquire(ProcArrayLock, LW_SHARED);
> +    safeXid = GetOldestSafeDecodingTransactionId(false);
> +    LWLockRelease(ProcArrayLock);
>  
> -        Assert(TransactionIdPrecedesOrEquals(safeXid, snap->xmin));
> -    }
> -#endif
> +    if (TransactionIdFollows(safeXid, snap->xmin))
> +        elog(ERROR, "cannot build an initial slot snapshot when oldest safe xid %u follows snapshot's xmin %u",
> +             safeXid, snap->xmin);
>  
>      MyProc->xmin = snap->xmin;
>  

s/when/as/

Greetings,

Andres Freund

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

16 November 2022, 08:52:01

On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:
> > On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:
> > > nor do we enforce in an obvious place that we
> > > don't already hold a snapshot.
> > >
> >
> > We have a check for (FirstXactSnapshot == NULL) in
> > RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
> > sufficient?
>
> I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
> still be bad. And I think checking in SetTransactionSnapshot() is too late,
> we've already overwritten MyProc->xmin by that point.
>

So, shall we add the below Asserts in SnapBuildInitialSnapshot() after
we have the Assert for Assert(!FirstSnapshotSet)?
Assert(FirstXactSnapshot == NULL);
Assert(!HistoricSnapshotActive());

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Andres Freund

Date:

16 November 2022, 18:26:49

Hi,

On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:
> On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:
> > > On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:
> > > > nor do we enforce in an obvious place that we
> > > > don't already hold a snapshot.
> > > >
> > >
> > > We have a check for (FirstXactSnapshot == NULL) in
> > > RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
> > > sufficient?
> >
> > I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
> > still be bad. And I think checking in SetTransactionSnapshot() is too late,
> > we've already overwritten MyProc->xmin by that point.
> >
> 
> So, shall we add the below Asserts in SnapBuildInitialSnapshot() after
> we have the Assert for Assert(!FirstSnapshotSet)?
> Assert(FirstXactSnapshot == NULL);
> Assert(!HistoricSnapshotActive());

I don't think that'd catch a catalog snapshot. But perhaps the better answer
for the catalog snapshot is to just invalidate it explicitly. The user doesn't
have control over the catalog snapshot being taken, and it's not too hard to
imagine the walsender code triggering one somewhere.

So maybe we should add something like:

InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */
if (HaveRegisteredOrActiveSnapshot())
  elog(ERROR, "cannot build an initial slot snapshot when snapshots exist")
Assert(!HistoricSnapshotActive());

I think we'd not need to assert FirstXactSnapshot == NULL or !FirstSnapshotSet
with that, because those would show up in HaveRegisteredOrActiveSnapshot().

Greetings,

Andres Freund

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

17 November 2022, 05:14:18

On Wed, Nov 16, 2022 at 11:56 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:
> > On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:
> > > > On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > nor do we enforce in an obvious place that we
> > > > > don't already hold a snapshot.
> > > > >
> > > >
> > > > We have a check for (FirstXactSnapshot == NULL) in
> > > > RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
> > > > sufficient?
> > >
> > > I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
> > > still be bad. And I think checking in SetTransactionSnapshot() is too late,
> > > we've already overwritten MyProc->xmin by that point.
> > >
> >
> > So, shall we add the below Asserts in SnapBuildInitialSnapshot() after
> > we have the Assert for Assert(!FirstSnapshotSet)?
> > Assert(FirstXactSnapshot == NULL);
> > Assert(!HistoricSnapshotActive());
>
> I don't think that'd catch a catalog snapshot. But perhaps the better answer
> for the catalog snapshot is to just invalidate it explicitly. The user doesn't
> have control over the catalog snapshot being taken, and it's not too hard to
> imagine the walsender code triggering one somewhere.
>
> So maybe we should add something like:
>
> InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */
>

The comment "/* about to overwrite MyProc->xmin */" is unclear to me.
We already have a check (/* so we don't overwrite the existing value
*/
if (TransactionIdIsValid(MyProc->xmin))) in SnapBuildInitialSnapshot()
which ensures that we don't overwrite MyProc->xmin, so the above
comment seems contradictory to me.

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

17 November 2022, 06:33:40

On Wed, Nov 16, 2022 at 11:56 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:
> > On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:
> > > > On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > nor do we enforce in an obvious place that we
> > > > > don't already hold a snapshot.
> > > > >
> > > >
> > > > We have a check for (FirstXactSnapshot == NULL) in
> > > > RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
> > > > sufficient?
> > >
> > > I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
> > > still be bad. And I think checking in SetTransactionSnapshot() is too late,
> > > we've already overwritten MyProc->xmin by that point.
> > >
> >
> > So, shall we add the below Asserts in SnapBuildInitialSnapshot() after
> > we have the Assert for Assert(!FirstSnapshotSet)?
> > Assert(FirstXactSnapshot == NULL);
> > Assert(!HistoricSnapshotActive());
>
> I don't think that'd catch a catalog snapshot. But perhaps the better answer
> for the catalog snapshot is to just invalidate it explicitly. The user doesn't
> have control over the catalog snapshot being taken, and it's not too hard to
> imagine the walsender code triggering one somewhere.
>
> So maybe we should add something like:
>
> InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */
> if (HaveRegisteredOrActiveSnapshot())
>   elog(ERROR, "cannot build an initial slot snapshot when snapshots exist")
> Assert(!HistoricSnapshotActive());
>
> I think we'd not need to assert FirstXactSnapshot == NULL or !FirstSnapshotSet
> with that, because those would show up in HaveRegisteredOrActiveSnapshot().
>

In the attached patch, I have incorporated this change and other
feedback. I think this should probably help us find the reason for
this problem when we see it in the future.

-- 
With Regards,
Amit Kapila.

Attachment

v2-0001-Add-additional-checks-while-creating-the-initial-.patch

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Andres Freund

Date:

17 November 2022, 17:45:07

Hi,

On 2022-11-17 10:44:18 +0530, Amit Kapila wrote:
> On Wed, Nov 16, 2022 at 11:56 PM Andres Freund <andres@anarazel.de> wrote:
> > On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:
> > > On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:
> > > > On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:
> > > > > On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:
> > I don't think that'd catch a catalog snapshot. But perhaps the better answer
> > for the catalog snapshot is to just invalidate it explicitly. The user doesn't
> > have control over the catalog snapshot being taken, and it's not too hard to
> > imagine the walsender code triggering one somewhere.
> >
> > So maybe we should add something like:
> >
> > InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */
> >
> 
> The comment "/* about to overwrite MyProc->xmin */" is unclear to me.
> We already have a check (/* so we don't overwrite the existing value
> */
> if (TransactionIdIsValid(MyProc->xmin))) in SnapBuildInitialSnapshot()
> which ensures that we don't overwrite MyProc->xmin, so the above
> comment seems contradictory to me.

The point is that catalog snapshots could easily end up setting MyProc->xmin,
even though the caller hasn't done anything wrong. So the
InvalidateCatalogSnapshot() would avoid erroring out in a number of scenarios.

Greetings,

Andres Freund

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

18 November 2022, 05:50:36

On Thu, Nov 17, 2022 at 11:15 PM Andres Freund <andres@anarazel.de> wrote:
>
> On 2022-11-17 10:44:18 +0530, Amit Kapila wrote:
> > On Wed, Nov 16, 2022 at 11:56 PM Andres Freund <andres@anarazel.de> wrote:
> > > On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:
> > > > On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:
> > > > > On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:
> > > > > > On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:
> > > I don't think that'd catch a catalog snapshot. But perhaps the better answer
> > > for the catalog snapshot is to just invalidate it explicitly. The user doesn't
> > > have control over the catalog snapshot being taken, and it's not too hard to
> > > imagine the walsender code triggering one somewhere.
> > >
> > > So maybe we should add something like:
> > >
> > > InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */
> > >
> >
> > The comment "/* about to overwrite MyProc->xmin */" is unclear to me.
> > We already have a check (/* so we don't overwrite the existing value
> > */
> > if (TransactionIdIsValid(MyProc->xmin))) in SnapBuildInitialSnapshot()
> > which ensures that we don't overwrite MyProc->xmin, so the above
> > comment seems contradictory to me.
>
> The point is that catalog snapshots could easily end up setting MyProc->xmin,
> even though the caller hasn't done anything wrong. So the
> InvalidateCatalogSnapshot() would avoid erroring out in a number of scenarios.
>

Okay, updated the patch accordingly.

-- 
With Regards,
Amit Kapila.

Attachment

v3-0001-Add-additional-checks-while-creating-the-initial-.patch

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Andres Freund

Date:

19 November 2022, 01:05:53

Hi,

On 2022-11-18 11:20:36 +0530, Amit Kapila wrote:
> Okay, updated the patch accordingly.

Assuming it passes tests etc, this'd work for me.

- Andres

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

21 November 2022, 07:31:27

On Sat, Nov 19, 2022 at 6:35 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2022-11-18 11:20:36 +0530, Amit Kapila wrote:
> > Okay, updated the patch accordingly.
>
> Assuming it passes tests etc, this'd work for me.
>

Thanks, Pushed.

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Masahiko Sawada

Date:

08 December 2022, 02:46:44

On Mon, Nov 21, 2022 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Nov 19, 2022 at 6:35 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2022-11-18 11:20:36 +0530, Amit Kapila wrote:
> > > Okay, updated the patch accordingly.
> >
> > Assuming it passes tests etc, this'd work for me.
> >
>
> Thanks, Pushed.

The same assertion failure has been reported on another thread[1].
Since I could reproduce this issue several times in my environment
I've investigated the root cause.

I think there is a race condition of updating
procArray->replication_slot_xmin by CreateInitDecodingContext() and
LogicalConfirmReceivedLocation().

What I observed in the test was that a walsender process called:
SnapBuildProcessRunningXacts()
  LogicalIncreaseXminForSlot()
    LogicalConfirmReceivedLocation()
      ReplicationSlotsComputeRequiredXmin(false).

In ReplicationSlotsComputeRequiredXmin() it acquired the
ReplicationSlotControlLock and got 0 as the minimum xmin since there
was no wal sender having effective_xmin. Before calling
ProcArraySetReplicationSlotXmin() (i.e. before acquiring
ProcArrayLock), another walsender process called
CreateInitDecodingContext(), acquired ProcArrayLock, computed
slot->effective_catalog_xmin, called
ReplicationSlotsComputeRequiredXmin(true). Since its
effective_catalog_xmin had been set, it got 39968 as the minimum xmin,
and updated replication_slot_xmin. However, as soon as the second
walsender released ProcArrayLock, the first walsender updated the
replication_slot_xmin to 0. After that, the second walsender called
SnapBuildInitialSnapshot(), and GetOldestSafeDecodingTransactionId()
returned an XID newer than snap->xmin.

One idea to fix this issue is that in
ReplicationSlotsComputeRequiredXmin(), we compute the minimum xmin
while holding both ProcArrayLock and ReplicationSlotControlLock, and
release only ReplicationSlotsControlLock before updating the
replication_slot_xmin. I'm concerned it will increase the contention
on ProcArrayLock but I've attached the patch for discussion.

Regards,

[1] https://www.postgresql.org/message-id/tencent_7EB71DA5D7BA00EB0B429DCE45D0452B6406%40qq.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

fix_slot_xmin_race_condition.patch

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

27 January 2023, 05:24:55

On Thu, Dec 8, 2022 at 8:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> The same assertion failure has been reported on another thread[1].
> Since I could reproduce this issue several times in my environment
> I've investigated the root cause.
>
> I think there is a race condition of updating
> procArray->replication_slot_xmin by CreateInitDecodingContext() and
> LogicalConfirmReceivedLocation().
>
> What I observed in the test was that a walsender process called:
> SnapBuildProcessRunningXacts()
>   LogicalIncreaseXminForSlot()
>     LogicalConfirmReceivedLocation()
>       ReplicationSlotsComputeRequiredXmin(false).
>
> In ReplicationSlotsComputeRequiredXmin() it acquired the
> ReplicationSlotControlLock and got 0 as the minimum xmin since there
> was no wal sender having effective_xmin.
>

What about the current walsender process which is processing
running_xacts via SnapBuildProcessRunningXacts()? Isn't that walsender
slot's effective_xmin have a non-zero value? If not, then why?

-- 
With Regards,
Amit Kapila.

RE: Assertion failure in SnapBuildInitialSnapshot()

From

"Hayato Kuroda (Fujitsu)"

Date:

27 January 2023, 11:01:19

Dear Sawada-san,

Thank you for making the patch! I'm still considering whether this approach is
correct, but I can put a comment to your patch anyway.

```
-    Assert(!already_locked || LWLockHeldByMe(ProcArrayLock));
-
-    if (!already_locked)
-        LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+    Assert(LWLockHeldByMe(ProcArrayLock));
```

In this function, we regard that the ProcArrayLock has been already acquired as
exclusive mode and modify data. I think LWLockHeldByMeInMode() should be used
instead of LWLockHeldByMe().
I confirmed that there is only one caller that uses ReplicationSlotsComputeRequiredXmin(true)
and it acquires exclusive lock correctly, but it can avoid future bug.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

RE: Assertion failure in SnapBuildInitialSnapshot()

From

"Hayato Kuroda (Fujitsu)"

Date:

28 January 2023, 14:54:22

Dear Amit, Sawada-san,

I have also reproduced the failure on PG15 with some debug log, and I agreed that
somebody changed procArray->replication_slot_xmin to InvalidTransactionId.

> > The same assertion failure has been reported on another thread[1].
> > Since I could reproduce this issue several times in my environment
> > I've investigated the root cause.
> >
> > I think there is a race condition of updating
> > procArray->replication_slot_xmin by CreateInitDecodingContext() and
> > LogicalConfirmReceivedLocation().
> >
> > What I observed in the test was that a walsender process called:
> > SnapBuildProcessRunningXacts()
> >   LogicalIncreaseXminForSlot()
> >     LogicalConfirmReceivedLocation()
> >       ReplicationSlotsComputeRequiredXmin(false).
> >
> > In ReplicationSlotsComputeRequiredXmin() it acquired the
> > ReplicationSlotControlLock and got 0 as the minimum xmin since there
> > was no wal sender having effective_xmin.
> >
> 
> What about the current walsender process which is processing
> running_xacts via SnapBuildProcessRunningXacts()? Isn't that walsender
> slot's effective_xmin have a non-zero value? If not, then why?

Normal walsenders which are not for tablesync create a replication slot with
NOEXPORT_SNAPSHOT option. I think in this case, CreateInitDecodingContext() is
called with need_full_snapshot = false, and slot->effective_xmin is not updated.
It is set as InvalidTransactionId at ReplicationSlotCreate() and no functions update
that. Hence the slot acquired by the walsender may have Invalid effective_min.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Masahiko Sawada

Date:

29 January 2023, 15:45:22

On Sat, Jan 28, 2023 at 11:54 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Amit, Sawada-san,
>
> I have also reproduced the failure on PG15 with some debug log, and I agreed that
> somebody changed procArray->replication_slot_xmin to InvalidTransactionId.
>
> > > The same assertion failure has been reported on another thread[1].
> > > Since I could reproduce this issue several times in my environment
> > > I've investigated the root cause.
> > >
> > > I think there is a race condition of updating
> > > procArray->replication_slot_xmin by CreateInitDecodingContext() and
> > > LogicalConfirmReceivedLocation().
> > >
> > > What I observed in the test was that a walsender process called:
> > > SnapBuildProcessRunningXacts()
> > >   LogicalIncreaseXminForSlot()
> > >     LogicalConfirmReceivedLocation()
> > >       ReplicationSlotsComputeRequiredXmin(false).
> > >
> > > In ReplicationSlotsComputeRequiredXmin() it acquired the
> > > ReplicationSlotControlLock and got 0 as the minimum xmin since there
> > > was no wal sender having effective_xmin.
> > >
> >
> > What about the current walsender process which is processing
> > running_xacts via SnapBuildProcessRunningXacts()? Isn't that walsender
> > slot's effective_xmin have a non-zero value? If not, then why?
>
> Normal walsenders which are not for tablesync create a replication slot with
> NOEXPORT_SNAPSHOT option. I think in this case, CreateInitDecodingContext() is
> called with need_full_snapshot = false, and slot->effective_xmin is not updated.

Right. This is how we create a slot used by an apply worker.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

30 January 2023, 04:57:14

On Sun, Jan 29, 2023 at 9:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Jan 28, 2023 at 11:54 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Amit, Sawada-san,
> >
> > I have also reproduced the failure on PG15 with some debug log, and I agreed that
> > somebody changed procArray->replication_slot_xmin to InvalidTransactionId.
> >
> > > > The same assertion failure has been reported on another thread[1].
> > > > Since I could reproduce this issue several times in my environment
> > > > I've investigated the root cause.
> > > >
> > > > I think there is a race condition of updating
> > > > procArray->replication_slot_xmin by CreateInitDecodingContext() and
> > > > LogicalConfirmReceivedLocation().
> > > >
> > > > What I observed in the test was that a walsender process called:
> > > > SnapBuildProcessRunningXacts()
> > > >   LogicalIncreaseXminForSlot()
> > > >     LogicalConfirmReceivedLocation()
> > > >       ReplicationSlotsComputeRequiredXmin(false).
> > > >
> > > > In ReplicationSlotsComputeRequiredXmin() it acquired the
> > > > ReplicationSlotControlLock and got 0 as the minimum xmin since there
> > > > was no wal sender having effective_xmin.
> > > >
> > >
> > > What about the current walsender process which is processing
> > > running_xacts via SnapBuildProcessRunningXacts()? Isn't that walsender
> > > slot's effective_xmin have a non-zero value? If not, then why?
> >
> > Normal walsenders which are not for tablesync create a replication slot with
> > NOEXPORT_SNAPSHOT option. I think in this case, CreateInitDecodingContext() is
> > called with need_full_snapshot = false, and slot->effective_xmin is not updated.
>
> Right. This is how we create a slot used by an apply worker.
>

I was thinking about how that led to this problem because
GetOldestSafeDecodingTransactionId() ignores InvalidTransactionId. It
seems that is possible when both builder->xmin and
replication_slot_catalog_xmin precede replication_slot_catalog_xmin.
Do you see any different reason for it?

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

30 January 2023, 06:04:11

On Mon, Jan 30, 2023 at 10:27 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Jan 29, 2023 at 9:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, Jan 28, 2023 at 11:54 PM Hayato Kuroda (Fujitsu)
> > <kuroda.hayato@fujitsu.com> wrote:
> > >
> > > Dear Amit, Sawada-san,
> > >
> > > I have also reproduced the failure on PG15 with some debug log, and I agreed that
> > > somebody changed procArray->replication_slot_xmin to InvalidTransactionId.
> > >
> > > > > The same assertion failure has been reported on another thread[1].
> > > > > Since I could reproduce this issue several times in my environment
> > > > > I've investigated the root cause.
> > > > >
> > > > > I think there is a race condition of updating
> > > > > procArray->replication_slot_xmin by CreateInitDecodingContext() and
> > > > > LogicalConfirmReceivedLocation().
> > > > >
> > > > > What I observed in the test was that a walsender process called:
> > > > > SnapBuildProcessRunningXacts()
> > > > >   LogicalIncreaseXminForSlot()
> > > > >     LogicalConfirmReceivedLocation()
> > > > >       ReplicationSlotsComputeRequiredXmin(false).
> > > > >
> > > > > In ReplicationSlotsComputeRequiredXmin() it acquired the
> > > > > ReplicationSlotControlLock and got 0 as the minimum xmin since there
> > > > > was no wal sender having effective_xmin.
> > > > >
> > > >
> > > > What about the current walsender process which is processing
> > > > running_xacts via SnapBuildProcessRunningXacts()? Isn't that walsender
> > > > slot's effective_xmin have a non-zero value? If not, then why?
> > >
> > > Normal walsenders which are not for tablesync create a replication slot with
> > > NOEXPORT_SNAPSHOT option. I think in this case, CreateInitDecodingContext() is
> > > called with need_full_snapshot = false, and slot->effective_xmin is not updated.
> >
> > Right. This is how we create a slot used by an apply worker.
> >
>
> I was thinking about how that led to this problem because
> GetOldestSafeDecodingTransactionId() ignores InvalidTransactionId.
>

I have reproduced it manually. For this, I had to manually make the
debugger call ReplicationSlotsComputeRequiredXmin(false) via path
SnapBuildProcessRunningXacts()->LogicalIncreaseXminForSlot()->LogicalConfirmReceivedLocation()
->ReplicationSlotsComputeRequiredXmin(false) for the apply worker. The
sequence of events is something like (a) the replication_slot_xmin for
tablesync worker is overridden by apply worker as zero as explained in
Sawada-San's email, (b) another transaction happened on the publisher
that will increase the value of ShmemVariableCache->nextXid (c)
tablesync worker invokes
SnapBuildInitialSnapshot()->GetOldestSafeDecodingTransactionId() which
will return an oldestSafeXid which is higher than snapshot's xmin.
This happens because replication_slot_xmin has an InvalidTransactionId
value and we won't consider replication_slot_catalog_xmin because
catalogOnly flag is false and there is no other open running
transaction. I think we should try to get a simplified test to
reproduce this problem if possible.

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

30 January 2023, 11:24:46

On Thu, Dec 8, 2022 at 8:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Nov 21, 2022 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> One idea to fix this issue is that in
> ReplicationSlotsComputeRequiredXmin(), we compute the minimum xmin
> while holding both ProcArrayLock and ReplicationSlotControlLock, and
> release only ReplicationSlotsControlLock before updating the
> replication_slot_xmin. I'm concerned it will increase the contention
> on ProcArrayLock but I've attached the patch for discussion.
>

But what kind of workload are you worried about? This will be called
while processing XLOG_RUNNING_XACTS to update
procArray->replication_slot_xmin/procArray->replication_slot_catalog_xmin
only when required. So, if we want we can test some concurrent
workloads along with walsenders doing the decoding to check if it
impacts performance.

What other way we can fix this? Do you think we can try to avoid
retreating xmin values in ProcArraySetReplicationSlotXmin() to avoid
this problem? Personally, I think taking the lock as proposed by your
patch is a better idea. BTW, this problem seems to be only logical
replication specific, so if we are too worried then we can change this
locking only for logical replication.

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

30 January 2023, 11:27:19

On Mon, Jan 30, 2023 at 11:34 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I have reproduced it manually. For this, I had to manually make the
> debugger call ReplicationSlotsComputeRequiredXmin(false) via path
> SnapBuildProcessRunningXacts()->LogicalIncreaseXminForSlot()->LogicalConfirmReceivedLocation()
> ->ReplicationSlotsComputeRequiredXmin(false) for the apply worker. The
> sequence of events is something like (a) the replication_slot_xmin for
> tablesync worker is overridden by apply worker as zero as explained in
> Sawada-San's email, (b) another transaction happened on the publisher
> that will increase the value of ShmemVariableCache->nextXid (c)
> tablesync worker invokes
> SnapBuildInitialSnapshot()->GetOldestSafeDecodingTransactionId() which
> will return an oldestSafeXid which is higher than snapshot's xmin.
> This happens because replication_slot_xmin has an InvalidTransactionId
> value and we won't consider replication_slot_catalog_xmin because
> catalogOnly flag is false and there is no other open running
> transaction. I think we should try to get a simplified test to
> reproduce this problem if possible.
>

Here are steps to reproduce it manually with the help of a debugger:

Session-1
==========
select pg_create_logical_replication_slot('s', 'test_decoding');
create table t2(c1 int);
select pg_replication_slot_advance('s', pg_current_wal_lsn()); --
Debug this statement. Stop before taking procarraylock in
ProcArraySetReplicationSlotXmin.

Session-2
============
psql -d postgres
Begin;

Session-3
===========
psql -d "dbname=postgres replication=database"

begin transaction isolation level repeatable read read only;
CREATE_REPLICATION_SLOT slot1 LOGICAL test_decoding USE_SNAPSHOT;
--Debug this statement. Stop in SnapBuildInitialSnapshot() before
taking procarraylock

Session-1
==========
Continue debugging and finish execution of
ProcArraySetReplicationSlotXmin. Verify
procArray->replication_slot_xmin is zero.

Session-2
=========
Select txid_current();
Commit;

Session-3
==========
Continue debugging.
Verify that safeXid follows snap->xmin. This leads to assertion (in
back branches) or error (in HEAD).

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

30 January 2023, 11:29:48

On Fri, Jan 27, 2023 at 4:31 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Thank you for making the patch! I'm still considering whether this approach is
> correct, but I can put a comment to your patch anyway.
>
> ```
> -       Assert(!already_locked || LWLockHeldByMe(ProcArrayLock));
> -
> -       if (!already_locked)
> -               LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
> +       Assert(LWLockHeldByMe(ProcArrayLock));
> ```
>
> In this function, we regard that the ProcArrayLock has been already acquired as
> exclusive mode and modify data. I think LWLockHeldByMeInMode() should be used
> instead of LWLockHeldByMe().
>

Right, this is even evident from the comments atop
ReplicationSlotsComputeRequiredXmin("If already_locked is true,
ProcArrayLock has already been acquired exclusively.". But, I am not
sure if it is a good idea to remove 'already_locked' parameter,
especially in back branches as this is an exposed API.

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Masahiko Sawada

Date:

30 January 2023, 12:41:20

On Mon, Jan 30, 2023 at 8:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jan 27, 2023 at 4:31 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Thank you for making the patch! I'm still considering whether this approach is
> > correct, but I can put a comment to your patch anyway.
> >
> > ```
> > -       Assert(!already_locked || LWLockHeldByMe(ProcArrayLock));
> > -
> > -       if (!already_locked)
> > -               LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
> > +       Assert(LWLockHeldByMe(ProcArrayLock));
> > ```
> >
> > In this function, we regard that the ProcArrayLock has been already acquired as
> > exclusive mode and modify data. I think LWLockHeldByMeInMode() should be used
> > instead of LWLockHeldByMe().
> >
>
> Right, this is even evident from the comments atop
> ReplicationSlotsComputeRequiredXmin("If already_locked is true,
> ProcArrayLock has already been acquired exclusively.".

Agreed, will fix in the next version patch.

> But, I am not
> sure if it is a good idea to remove 'already_locked' parameter,
> especially in back branches as this is an exposed API.

Yes, we should not remove the already_locked parameter in
backbranches. So I was thinking of keeping it on back branches.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Masahiko Sawada

Date:

31 January 2023, 01:19:13

On Mon, Jan 30, 2023 at 8:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 8, 2022 at 8:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Nov 21, 2022 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > One idea to fix this issue is that in
> > ReplicationSlotsComputeRequiredXmin(), we compute the minimum xmin
> > while holding both ProcArrayLock and ReplicationSlotControlLock, and
> > release only ReplicationSlotsControlLock before updating the
> > replication_slot_xmin. I'm concerned it will increase the contention
> > on ProcArrayLock but I've attached the patch for discussion.
> >
>
> But what kind of workload are you worried about? This will be called
> while processing XLOG_RUNNING_XACTS to update
> procArray->replication_slot_xmin/procArray->replication_slot_catalog_xmin
> only when required. So, if we want we can test some concurrent
> workloads along with walsenders doing the decoding to check if it
> impacts performance.
>

I was slightly concerned about holding ProcArrayLock while iterating
over replication slots especially when there are many replication
slots in the system. But you're right; we need it only when processing
XLOG_RUNINNG_XACTS and it's not frequent. So it doesn't introduce
visible overhead or negligible overhead.

> What other way we can fix this? Do you think we can try to avoid
> retreating xmin values in ProcArraySetReplicationSlotXmin() to avoid
> this problem? Personally, I think taking the lock as proposed by your
> patch is a better idea.

Agreed.

> BTW, this problem seems to be only logical
> replication specific, so if we are too worried then we can change this
> locking only for logical replication.

Yes, but I agree that there won't be a big overhead by this fix.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Masahiko Sawada

Date:

31 January 2023, 05:41:35

On Mon, Jan 30, 2023 at 9:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jan 30, 2023 at 8:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jan 27, 2023 at 4:31 PM Hayato Kuroda (Fujitsu)
> > <kuroda.hayato@fujitsu.com> wrote:
> > >
> > > Thank you for making the patch! I'm still considering whether this approach is
> > > correct, but I can put a comment to your patch anyway.
> > >
> > > ```
> > > -       Assert(!already_locked || LWLockHeldByMe(ProcArrayLock));
> > > -
> > > -       if (!already_locked)
> > > -               LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
> > > +       Assert(LWLockHeldByMe(ProcArrayLock));
> > > ```
> > >
> > > In this function, we regard that the ProcArrayLock has been already acquired as
> > > exclusive mode and modify data. I think LWLockHeldByMeInMode() should be used
> > > instead of LWLockHeldByMe().
> > >
> >
> > Right, this is even evident from the comments atop
> > ReplicationSlotsComputeRequiredXmin("If already_locked is true,
> > ProcArrayLock has already been acquired exclusively.".
>
> Agreed, will fix in the next version patch.
>
> > But, I am not
> > sure if it is a good idea to remove 'already_locked' parameter,
> > especially in back branches as this is an exposed API.
>
> Yes, we should not remove the already_locked parameter in
> backbranches. So I was thinking of keeping it on back branches.
>

I've attached patches for HEAD and backbranches. Please review them.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

31 January 2023, 06:55:53

On Tue, Jan 31, 2023 at 11:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jan 30, 2023 at 9:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached patches for HEAD and backbranches. Please review them.
>

Shall we add a comment like the one below in
ReplicationSlotsComputeRequiredXmin()?
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index f286918f69..e28d48bca7 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -840,6 +840,13 @@ ReplicationSlotsComputeRequiredXmin(bool already_locked)

        Assert(ReplicationSlotCtl != NULL);

+       /*
+        * It is possible that by the time we compute the agg_xmin
here and before
+        * updating replication_slot_xmin, the CreateInitDecodingContext() will
+        * compute and update replication_slot_xmin. So, we need to acquire
+        * ProcArrayLock here to avoid retreating the value of
replication_slot_xmin.
+        */
+

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Masahiko Sawada

Date:

31 January 2023, 06:59:38

On Tue, Jan 31, 2023 at 3:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 31, 2023 at 11:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Jan 30, 2023 at 9:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached patches for HEAD and backbranches. Please review them.
> >
>
> Shall we add a comment like the one below in
> ReplicationSlotsComputeRequiredXmin()?
> diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> index f286918f69..e28d48bca7 100644
> --- a/src/backend/replication/slot.c
> +++ b/src/backend/replication/slot.c
> @@ -840,6 +840,13 @@ ReplicationSlotsComputeRequiredXmin(bool already_locked)
>
>         Assert(ReplicationSlotCtl != NULL);
>
> +       /*
> +        * It is possible that by the time we compute the agg_xmin
> here and before
> +        * updating replication_slot_xmin, the CreateInitDecodingContext() will
> +        * compute and update replication_slot_xmin. So, we need to acquire
> +        * ProcArrayLock here to avoid retreating the value of
> replication_slot_xmin.
> +        */
> +

Agreed. It looks good to me.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Masahiko Sawada

Date:

31 January 2023, 12:37:58

On Tue, Jan 31, 2023 at 3:59 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jan 31, 2023 at 3:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jan 31, 2023 at 11:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Jan 30, 2023 at 9:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached patches for HEAD and backbranches. Please review them.
> > >
> >
> > Shall we add a comment like the one below in
> > ReplicationSlotsComputeRequiredXmin()?
> > diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> > index f286918f69..e28d48bca7 100644
> > --- a/src/backend/replication/slot.c
> > +++ b/src/backend/replication/slot.c
> > @@ -840,6 +840,13 @@ ReplicationSlotsComputeRequiredXmin(bool already_locked)
> >
> >         Assert(ReplicationSlotCtl != NULL);
> >
> > +       /*
> > +        * It is possible that by the time we compute the agg_xmin
> > here and before
> > +        * updating replication_slot_xmin, the CreateInitDecodingContext() will
> > +        * compute and update replication_slot_xmin. So, we need to acquire
> > +        * ProcArrayLock here to avoid retreating the value of
> > replication_slot_xmin.
> > +        */
> > +
>
> Agreed. It looks good to me.

Attached updated patches.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

01 February 2023, 05:53:57

On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Attached updated patches.
>

Thanks, Andres, others, do you see a better way to fix this problem? I
have reproduced it manually and the steps are shared at [1] and
Sawada-San also reproduced it, see [2].

[1] - https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

07 February 2023, 11:19:50

On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Attached updated patches.
>

In back-branch patches, the change is as below:
+ *
+ * NB: the caller must hold ProcArrayLock in an exclusive mode regardless of
+ * already_locked which is unused now but kept for ABI compatibility.
  */
 void
 ProcArraySetReplicationSlotXmin(TransactionId xmin, TransactionId catalog_xmin,
  bool already_locked)
 {
- Assert(!already_locked || LWLockHeldByMe(ProcArrayLock));
-
- if (!already_locked)
- LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ Assert(LWLockHeldByMeInMode(ProcArrayLock, LW_EXCLUSIVE));

This change looks odd to me. I think it would be better to pass
'already_locked' as true from the caller.

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Andres Freund

Date:

07 February 2023, 19:49:03

Hi,

On 2023-02-01 11:23:57 +0530, Amit Kapila wrote:
> On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Attached updated patches.
> >
>
> Thanks, Andres, others, do you see a better way to fix this problem? I
> have reproduced it manually and the steps are shared at [1] and
> Sawada-San also reproduced it, see [2].
>
> [1] -
https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
> [2] - https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com

Hm. It's worrysome to now hold ProcArrayLock exclusively while iterating over
the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
non-neglegible frequency.  Callers like CreateInitDecodingContext(), that pass
already_locked=true worry me a lot less, because obviously that's not a very
frequent operation.

This is particularly not great because we need to acquire
ReplicationSlotControlLock while already holding ProcArrayLock.

But clearly there's a pretty large hole in the lock protection right now. I'm
a bit confused about why we (Robert and I, or just I) thought it's ok to do it
this way.

I wonder if we could instead invert the locks, and hold
ReplicationSlotControlLock until after ProcArraySetReplicationSlotXmin(), and
acquire ProcArrayLock just for ProcArraySetReplicationSlotXmin().  That'd mean
that already_locked = true callers have to do a bit more work (we have to be
sure the locks are always acquired in the same order, or we end up in
unresolved deadlock land), but I think we can live with that.

This would still allow concurrent invocations of
ReplicationSlotsComputeRequiredXmin() come up with slightly different values,
but that's possible with the proposed patch as well, as effective_xmin is
updated without any of the other locks.  But I don't see a problem with that.

Greetings,

Andres Freund

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Andres Freund

Date:

07 February 2023, 20:05:20

Hi,

On 2023-02-07 11:49:03 -0800, Andres Freund wrote:
> On 2023-02-01 11:23:57 +0530, Amit Kapila wrote:
> > On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Attached updated patches.
> > >
> >
> > Thanks, Andres, others, do you see a better way to fix this problem? I
> > have reproduced it manually and the steps are shared at [1] and
> > Sawada-San also reproduced it, see [2].
> >
> > [1] -
https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
> > [2] -
https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com
> 
> Hm. It's worrysome to now hold ProcArrayLock exclusively while iterating over
> the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
> non-neglegible frequency.  Callers like CreateInitDecodingContext(), that pass
> already_locked=true worry me a lot less, because obviously that's not a very
> frequent operation.

Separately from this change:

I wonder if we ought to change the setup in CreateInitDecodingContext() to be a
bit less intricate. One idea:

Instead of having GetOldestSafeDecodingTransactionId() compute a value, that
we then enter into a slot, that then computes the global horizon via
ReplicationSlotsComputeRequiredXmin(), we could have a successor to
GetOldestSafeDecodingTransactionId() change procArray->replication_slot_xmin
(if needed).

As long as CreateInitDecodingContext() prevents a concurent
ReplicationSlotsComputeRequiredXmin(), by holding ReplicationSlotControlLock
exclusively, that should suffice to ensure that no "wrong" horizon was
determined / no needed rows have been removed. And we'd not need a lock nested
inside ProcArrayLock anymore.

Not sure if it's sufficiently better to be worth bothering with though :(

Greetings,

Andres Freund

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

08 February 2023, 04:13:16

On Wed, Feb 8, 2023 at 1:19 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-02-01 11:23:57 +0530, Amit Kapila wrote:
> > On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Attached updated patches.
> > >
> >
> > Thanks, Andres, others, do you see a better way to fix this problem? I
> > have reproduced it manually and the steps are shared at [1] and
> > Sawada-San also reproduced it, see [2].
> >
> > [1] -
https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
> > [2] -
https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com
>
> Hm. It's worrysome to now hold ProcArrayLock exclusively while iterating over
> the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
> non-neglegible frequency.  Callers like CreateInitDecodingContext(), that pass
> already_locked=true worry me a lot less, because obviously that's not a very
> frequent operation.
>
> This is particularly not great because we need to acquire
> ReplicationSlotControlLock while already holding ProcArrayLock.
>
>
> But clearly there's a pretty large hole in the lock protection right now. I'm
> a bit confused about why we (Robert and I, or just I) thought it's ok to do it
> this way.
>
>
> I wonder if we could instead invert the locks, and hold
> ReplicationSlotControlLock until after ProcArraySetReplicationSlotXmin(), and
> acquire ProcArrayLock just for ProcArraySetReplicationSlotXmin().
>

Along with inverting, doesn't this mean that we need to acquire
ReplicationSlotControlLock in Exclusive mode instead of acquiring it
in shared mode? My understanding of the above locking scheme is that
in CreateInitDecodingContext(), we acquire ReplicationSlotControlLock
in Exclusive mode before acquiring ProcArrayLock in Exclusive mode and
release it after releasing ProcArrayLock. Then,
ReplicationSlotsComputeRequiredXmin() acquires
ReplicationSlotControlLock in Exclusive mode only when already_locked
is false and releases it after a call to
ProcArraySetReplicationSlotXmin(). ProcArraySetReplicationSlotXmin()
won't change.

I don't think just inverting the order without changing the lock mode
will solve the problem because still apply worker will be able to
override the replication_slot_xmin value.

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Amit Kapila

Date:

08 February 2023, 04:47:58

On Wed, Feb 8, 2023 at 1:35 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-02-07 11:49:03 -0800, Andres Freund wrote:
> > On 2023-02-01 11:23:57 +0530, Amit Kapila wrote:
> > > On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Attached updated patches.
> > > >
> > >
> > > Thanks, Andres, others, do you see a better way to fix this problem? I
> > > have reproduced it manually and the steps are shared at [1] and
> > > Sawada-San also reproduced it, see [2].
> > >
> > > [1] -
https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
> > > [2] -
https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com
> >
> > Hm. It's worrysome to now hold ProcArrayLock exclusively while iterating over
> > the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
> > non-neglegible frequency.  Callers like CreateInitDecodingContext(), that pass
> > already_locked=true worry me a lot less, because obviously that's not a very
> > frequent operation.
>
> Separately from this change:
>
> I wonder if we ought to change the setup in CreateInitDecodingContext() to be a
> bit less intricate. One idea:
>
> Instead of having GetOldestSafeDecodingTransactionId() compute a value, that
> we then enter into a slot, that then computes the global horizon via
> ReplicationSlotsComputeRequiredXmin(), we could have a successor to
> GetOldestSafeDecodingTransactionId() change procArray->replication_slot_xmin
> (if needed).
>
> As long as CreateInitDecodingContext() prevents a concurent
> ReplicationSlotsComputeRequiredXmin(), by holding ReplicationSlotControlLock
> exclusively, that should suffice to ensure that no "wrong" horizon was
> determined / no needed rows have been removed. And we'd not need a lock nested
> inside ProcArrayLock anymore.
>
>
> Not sure if it's sufficiently better to be worth bothering with though :(
>

I am also not sure because it would improve concurrency for
CreateInitDecodingContext() which shouldn't be called at a higher
frequency. Also, to some extent, the current coding or the approach we
are discussing is easier to follow as we would always update
procArray->replication_slot_xmin after checking all the slots.

-- 
With Regards,
Amit Kapila.

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Masahiko Sawada

Date:

09 February 2023, 06:32:01

On Wed, Feb 8, 2023 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Feb 8, 2023 at 1:19 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2023-02-01 11:23:57 +0530, Amit Kapila wrote:
> > > On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Attached updated patches.
> > > >
> > >
> > > Thanks, Andres, others, do you see a better way to fix this problem? I
> > > have reproduced it manually and the steps are shared at [1] and
> > > Sawada-San also reproduced it, see [2].
> > >
> > > [1] -
https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
> > > [2] -
https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com
> >
> > Hm. It's worrysome to now hold ProcArrayLock exclusively while iterating over
> > the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
> > non-neglegible frequency.  Callers like CreateInitDecodingContext(), that pass
> > already_locked=true worry me a lot less, because obviously that's not a very
> > frequent operation.
> >
> > This is particularly not great because we need to acquire
> > ReplicationSlotControlLock while already holding ProcArrayLock.
> >
> >
> > But clearly there's a pretty large hole in the lock protection right now. I'm
> > a bit confused about why we (Robert and I, or just I) thought it's ok to do it
> > this way.
> >
> >
> > I wonder if we could instead invert the locks, and hold
> > ReplicationSlotControlLock until after ProcArraySetReplicationSlotXmin(), and
> > acquire ProcArrayLock just for ProcArraySetReplicationSlotXmin().
> >
>
> Along with inverting, doesn't this mean that we need to acquire
> ReplicationSlotControlLock in Exclusive mode instead of acquiring it
> in shared mode? My understanding of the above locking scheme is that
> in CreateInitDecodingContext(), we acquire ReplicationSlotControlLock
> in Exclusive mode before acquiring ProcArrayLock in Exclusive mode and
> release it after releasing ProcArrayLock. Then,
> ReplicationSlotsComputeRequiredXmin() acquires
> ReplicationSlotControlLock in Exclusive mode only when already_locked
> is false and releases it after a call to
> ProcArraySetReplicationSlotXmin(). ProcArraySetReplicationSlotXmin()
> won't change.

I've attached the patch of this idea for discussion. In
GetOldestSafeDecodingTransactionId() called by
CreateInitDecodingContext(), we hold ReplicationSlotControlLock,
ProcArrayLock, and XidGenLock at a time. So we would need to be
careful about the ordering.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

fix_concurrent_slot_xmin_update.patch

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Daniel Gustafsson

Date:

20 July 2023, 07:34:29

> On 9 Feb 2023, at 07:32, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> I've attached the patch of this idea for discussion.

Amit, Andres: have you had a chance to look at the updated version of this
patch?

--
Daniel Gustafsson

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Robert Haas

Date:

05 January 2024, 15:57:25

This thread has gone for about a year here without making any
progress, which isn't great.

On Tue, Feb 7, 2023 at 2:49 PM Andres Freund <andres@anarazel.de> wrote:
> Hm. It's worrysome to now hold ProcArrayLock exclusively while iterating over
> the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
> non-neglegible frequency.  Callers like CreateInitDecodingContext(), that pass
> already_locked=true worry me a lot less, because obviously that's not a very
> frequent operation.

Maybe, but it would be good to have some data indicating whether this
is really an issue.

> I wonder if we could instead invert the locks, and hold
> ReplicationSlotControlLock until after ProcArraySetReplicationSlotXmin(), and
> acquire ProcArrayLock just for ProcArraySetReplicationSlotXmin().  That'd mean
> that already_locked = true callers have to do a bit more work (we have to be
> sure the locks are always acquired in the same order, or we end up in
> unresolved deadlock land), but I think we can live with that.

This seems like it could be made to work, but there's apparently a
shortage of people willing to write the patch.

As another thought, Masahiko-san writes in his proposed commit message:

"As a result, the replication_slot_xmin could be overwritten with an
old value and retreated."

But what about just surgically preventing that?
ProcArraySetReplicationSlotXmin() could refuse to retreat the values,
perhaps? If it computes an older value than what's there, it just does
nothing?

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Assertion failure in SnapBuildInitialSnapshot()

From

vignesh C

Date:

11 January 2024, 14:25:52

On Thu, 9 Feb 2023 at 12:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Feb 8, 2023 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Feb 8, 2023 at 1:19 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > On 2023-02-01 11:23:57 +0530, Amit Kapila wrote:
> > > > On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > Attached updated patches.
> > > > >
> > > >
> > > > Thanks, Andres, others, do you see a better way to fix this problem? I
> > > > have reproduced it manually and the steps are shared at [1] and
> > > > Sawada-San also reproduced it, see [2].
> > > >
> > > > [1] -
https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
> > > > [2] -
https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com
> > >
> > > Hm. It's worrysome to now hold ProcArrayLock exclusively while iterating over
> > > the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
> > > non-neglegible frequency.  Callers like CreateInitDecodingContext(), that pass
> > > already_locked=true worry me a lot less, because obviously that's not a very
> > > frequent operation.
> > >
> > > This is particularly not great because we need to acquire
> > > ReplicationSlotControlLock while already holding ProcArrayLock.
> > >
> > >
> > > But clearly there's a pretty large hole in the lock protection right now. I'm
> > > a bit confused about why we (Robert and I, or just I) thought it's ok to do it
> > > this way.
> > >
> > >
> > > I wonder if we could instead invert the locks, and hold
> > > ReplicationSlotControlLock until after ProcArraySetReplicationSlotXmin(), and
> > > acquire ProcArrayLock just for ProcArraySetReplicationSlotXmin().
> > >
> >
> > Along with inverting, doesn't this mean that we need to acquire
> > ReplicationSlotControlLock in Exclusive mode instead of acquiring it
> > in shared mode? My understanding of the above locking scheme is that
> > in CreateInitDecodingContext(), we acquire ReplicationSlotControlLock
> > in Exclusive mode before acquiring ProcArrayLock in Exclusive mode and
> > release it after releasing ProcArrayLock. Then,
> > ReplicationSlotsComputeRequiredXmin() acquires
> > ReplicationSlotControlLock in Exclusive mode only when already_locked
> > is false and releases it after a call to
> > ProcArraySetReplicationSlotXmin(). ProcArraySetReplicationSlotXmin()
> > won't change.
>
> I've attached the patch of this idea for discussion. In
> GetOldestSafeDecodingTransactionId() called by
> CreateInitDecodingContext(), we hold ReplicationSlotControlLock,
> ProcArrayLock, and XidGenLock at a time. So we would need to be
> careful about the ordering.

I have changed the status of the patch to "Waiting on Author" as
Robert's issues were not addressed yet. Feel free to change the status
accordingly after addressing them.

Regards,
Vignesh

Re: Assertion failure in SnapBuildInitialSnapshot()

From

vignesh C

Date:

01 February 2024, 18:20:55

On Thu, 11 Jan 2024 at 19:55, vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 9 Feb 2023 at 12:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Feb 8, 2023 at 1:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Feb 8, 2023 at 1:19 AM Andres Freund <andres@anarazel.de> wrote:
> > > >
> > > > On 2023-02-01 11:23:57 +0530, Amit Kapila wrote:
> > > > > On Tue, Jan 31, 2023 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > Attached updated patches.
> > > > > >
> > > > >
> > > > > Thanks, Andres, others, do you see a better way to fix this problem? I
> > > > > have reproduced it manually and the steps are shared at [1] and
> > > > > Sawada-San also reproduced it, see [2].
> > > > >
> > > > > [1] -
https://www.postgresql.org/message-id/CAA4eK1KDFeh%3DZbvSWPx%3Dir2QOXBxJbH0K8YqifDtG3xJENLR%2Bw%40mail.gmail.com
> > > > > [2] -
https://www.postgresql.org/message-id/CAD21AoDKJBB6p4X-%2B057Vz44Xyc-zDFbWJ%2Bg9FL6qAF5PC2iFg%40mail.gmail.com
> > > >
> > > > Hm. It's worrysome to now hold ProcArrayLock exclusively while iterating over
> > > > the slots. ReplicationSlotsComputeRequiredXmin() can be called at a
> > > > non-neglegible frequency.  Callers like CreateInitDecodingContext(), that pass
> > > > already_locked=true worry me a lot less, because obviously that's not a very
> > > > frequent operation.
> > > >
> > > > This is particularly not great because we need to acquire
> > > > ReplicationSlotControlLock while already holding ProcArrayLock.
> > > >
> > > >
> > > > But clearly there's a pretty large hole in the lock protection right now. I'm
> > > > a bit confused about why we (Robert and I, or just I) thought it's ok to do it
> > > > this way.
> > > >
> > > >
> > > > I wonder if we could instead invert the locks, and hold
> > > > ReplicationSlotControlLock until after ProcArraySetReplicationSlotXmin(), and
> > > > acquire ProcArrayLock just for ProcArraySetReplicationSlotXmin().
> > > >
> > >
> > > Along with inverting, doesn't this mean that we need to acquire
> > > ReplicationSlotControlLock in Exclusive mode instead of acquiring it
> > > in shared mode? My understanding of the above locking scheme is that
> > > in CreateInitDecodingContext(), we acquire ReplicationSlotControlLock
> > > in Exclusive mode before acquiring ProcArrayLock in Exclusive mode and
> > > release it after releasing ProcArrayLock. Then,
> > > ReplicationSlotsComputeRequiredXmin() acquires
> > > ReplicationSlotControlLock in Exclusive mode only when already_locked
> > > is false and releases it after a call to
> > > ProcArraySetReplicationSlotXmin(). ProcArraySetReplicationSlotXmin()
> > > won't change.
> >
> > I've attached the patch of this idea for discussion. In
> > GetOldestSafeDecodingTransactionId() called by
> > CreateInitDecodingContext(), we hold ReplicationSlotControlLock,
> > ProcArrayLock, and XidGenLock at a time. So we would need to be
> > careful about the ordering.
>
> I have changed the status of the patch to "Waiting on Author" as
> Robert's issues were not addressed yet. Feel free to change the status
> accordingly after addressing them.

The patch which you submitted has been awaiting your attention for
quite some time now.  As such, we have moved it to "Returned with
Feedback" and removed it from the reviewing queue. Depending on
timing, this may be reversible.  Kindly address the feedback you have
received, and resubmit the patch to the next CommitFest.

Regards,
Vignesh

Re: Assertion failure in SnapBuildInitialSnapshot()

From

Alexander Lakhin

Date:

11 June 2024, 19:00:01

Hello,

01.02.2024 21:20, vignesh C wrote:
> The patch which you submitted has been awaiting your attention for
> quite some time now.  As such, we have moved it to "Returned with
> Feedback" and removed it from the reviewing queue. Depending on
> timing, this may be reversible.  Kindly address the feedback you have
> received, and resubmit the patch to the next CommitFest.

While analyzing buildfarm failures, I found [1], which demonstrates the
assertion failure discussed here:
---
031_column_list_publisher.log
TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(safeXid, snap->xmin)", File: 
"/home/bf/bf-build/skink/REL_15_STABLE/pgsql.build/../pgsql/src/backend/replication/logical/snapbuild.c", Line: 614, 
PID: 1882382)
---

I've managed to reproduce the assertion failure on REL_15_STABLE with the
following modification:
@@ -3928,6 +3928,7 @@ ProcArraySetReplicationSlotXmin(TransactionId xmin, TransactionId catalog_xmin,
  {
      Assert(!already_locked || LWLockHeldByMe(ProcArrayLock));

+pg_usleep(1000);
      if (!already_locked)
          LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

using the script:
numjobs=100
createdb db
export PGDATABASE=db

for ((i=1;i<=100;i++)); do
echo "iteration $i"

for ((j=1;j<=numjobs;j++)); do
echo "
SELECT pg_create_logical_replication_slot('s$j', 'test_decoding');
SELECT txid_current();
" | psql >>/dev/null 2>&1 &

echo "
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
CREATE_REPLICATION_SLOT slot$j LOGICAL test_decoding USE_SNAPSHOT;
" | psql -d "dbname=db replication=database" >>/dev/null 2>&1 &
done
wait

for ((j=1;j<=numjobs;j++)); do
echo "
DROP_REPLICATION_SLOT slot$j;
" | psql -d "dbname=db replication=database" >/dev/null

echo "SELECT pg_drop_replication_slot('s$j');" | psql >/dev/null
done

grep 'TRAP' server.log && break;
done

(with
wal_level = logical
max_replication_slots = 200
max_wal_senders = 200
in postgresql.conf)

iteration 18
ERROR:  replication slot "slot13" is active for PID 538431
TRAP: FailedAssertion("TransactionIdPrecedesOrEquals(safeXid, snap->xmin)", File: "snapbuild.c", Line: 614, PID:
538431)


I've also confirmed that fix_concurrent_slot_xmin_update.patch fixes the
issue.

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-05-15%2020%3A55%3A17

Best regards,
Alexander