Re: sync_standbys_defined read/write race on startup - Mailing list pgsql-hackers

From Maksim.Melnikov
Subject Re: sync_standbys_defined read/write race on startup
Date
Msg-id 884d9875-d35d-44fb-993d-f9326e5282a8@postgrespro.ru
Whole thread Raw
In response to Re: sync_standbys_defined read/write race on startup  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers


On 27.03.2025 01:13, Michael Paquier wrote:
Tracking if the configuration has been properly loaded in
WalSndCtlData makes sense here, but I am confused by the patch: you
have defined SYNCSTANDBYSDEFINED but sync_standbys_defined never sets
it.  It may be simpler to use a separate boolean flag for this
purpose.  WalSndCtlData is a private structure holding the state of
the WAL senders internally, so ABI does not stress me much here
(flexible array at the end of the structure, as well..).

Thanks for your responses.
Yes, I agree that it looks more complicated then addition new bool field,
but there is place in SyncRepWaitForLSN method, where we check
WalSndCtlData->sync_standbys_defined out of acquire/release block.

void
SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
{
    int            mode;
...............................................
    if (!SyncRepRequested() ||
        !((volatile WalSndCtlData *) WalSndCtl)->sync_standbys_defined)
        return;

..........................

If we just add new bool field, we can't atomically check sync_standbys_defined flag
and new bool field, only in lock, so we lose fast exit optimization. But I agree
that in patch I lost SYNCSTANDBYSDEFINED setting, thanks for attentiveness,
so I've attached new patch with fix,also I've done bits setting more evident. 

Anyway, if you still thinks that my arguments not enough, please notify, I will do patch with separate flag.

On 27.03.2025 01:13, Michael Paquier wrote:
Also mentioned by Kirill downthread: let's add a regression test with
an injection point that does a wait.  This race condition is worth
having a test for.  You could define the point of where you have added
your hardcoded sleep(), for example.
Sorry, but for me it seems that we can't use injection points toolset here, because we are talking
about startup process, and we can't call injection_points_attach on client backend before checkpointer.

Anyway, if you know how to do it, please share details, I will do.


Best regards, Maksim Melnikov

Attachment

pgsql-hackers by date:

Previous
From: Kirill Reshke
Date:
Subject: Re: duplicated comments on get_relation_constraints
Next
From: Peter Eisentraut
Date:
Subject: Re: Thread-safe nl_langinfo() and localeconv()