Re: Allow logical failover slots to wait on synchronous replication - Mailing list pgsql-hackers

From shveta malik
Subject Re: Allow logical failover slots to wait on synchronous replication
Date
Msg-id CAJpy0uCX3KnNZCM73Q79oTO5Y8cicxN=YWsQTdq6CX194REnfw@mail.gmail.com
Whole thread Raw
In response to Re: Allow logical failover slots to wait on synchronous replication  (John H <johnhyvr@gmail.com>)
List pgsql-hackers
On Fri, Aug 30, 2024 at 12:56 AM John H <johnhyvr@gmail.com> wrote:
>
> Hi Shveta,
>
> Thanks for reviewing it so quickly.
>
> On Thu, Aug 29, 2024 at 2:35 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Thanks for the patch. Few comments and queries:
> >
> > 1)
> > + static XLogRecPtr       lsn[NUM_SYNC_REP_WAIT_MODE];
> >
> > We shall name it as 'lsns' as there are multiple.
> >
>
> This follows the same naming convention in walsender_private.h
>
> > 2)
> >
> > + for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
> > + {
> > + lsn[i] = InvalidXLogRecPtr;
> > + }
> >
> > Can we do it like below similar to what you have done at another place:
> > memset(lsn, InvalidXLogRecPtr, sizeof(lsn));
> >
>
> I don't think memset works in this case? Well, I think *technically* works but
> not sure if that's something worth optimizing.
> If I understand correctly, memset takes in a char for the value and
> not XLogRecPtr (uint64).
>
> So something like: memset(lsn, 0, sizeof(lsn))
>
> InvalidXLogRecPtr is defined as 0 so I think it works but there's an
> implicit dependency here
> for correctness.
>
> > 3)
> > + if (!initialized)
> > + {
> > + for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
> > + {
> > + lsn[i] = InvalidXLogRecPtr;
> > + }
> > + }
> >
> > I do not see 'initialized' set to TRUE anywhere. Can you please
> > elaborate the intent here?
>
> You're right I thought I fixed this. WIll update.
>
> >
> > 4)
> > + int mode = SyncRepWaitMode;
> > + int i;
> > +
> > + if (!initialized)
> > + {
> > + for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
> > + {
> > + lsn[i] = InvalidXLogRecPtr;
> > + }
> > + }
> > + if (mode == SYNC_REP_NO_WAIT)
> > + return true;
> >
> > I do not understand this code well. As Amit also pointed out,  'mode'
> > may change. When we initialize 'mode' lets say SyncRepWaitMode is
> > SYNC_REP_NO_WAIT but by the time we check 'if (mode ==
> > SYNC_REP_NO_WAIT)', SyncRepWaitMode has changed to say
> > SYNC_REP_WAIT_FLUSH, if so, then we will wrongly return true from
> > here. Is that a possibility? ProcessConfigFile() is in the caller, and
> > thus we may end up using the wrong mode.
> >
>
> Yes it's possible for mode to change. In my comment to Amit in the other thread,
> I think we have to store mode and base our execution of this logic and ignore
> SyncRepWaitMode changing due to ProcesConfigFile/SIGHUP for one iteration.
>
> We can store the value of mode later, so something like:
>
> > if (SyncRepWaitMode == SYNC_REP_NO_WAIT)
> >     return true;
> > mode = SyncRepWaitMode
> > if (lsn[mode] >= wait_for_lsn)
> >   return true;
>
> But it's the same issue which is when you check lsn[mode],
> SyncRepWaitMode could have changed to
> something else, so you always have to initialize the value and will
> always have this discrepancy.
>
> I'm skeptical end users are changing SyncRepWaitMode in their database

> clusters as
> it has implications for their durability and I would assume they run
> with the same durability guarantees.
>

I was trying to have a look at the patch again, it doesn't apply on
the head, needs rebase.

Regarding 'mode = SyncRepWaitMode', FWIW, SyncRepWaitForLSN() also
does in a similar way. It gets mode in local var initially and uses it
later. See [1]. So isn't there a chance too that 'SyncRepWaitMode' can
change in between.

[1]:
mode = SyncRepWaitMode;
.....
....
        if (!WalSndCtl->sync_standbys_defined ||
                lsn <= WalSndCtl->lsn[mode])
        {
                LWLockRelease(SyncRepLock);
                return;
        }



thanks
Shveta



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Invalid Assert while validating REPLICA IDENTITY?
Next
From: jian he
Date:
Subject: Re: not null constraints, again