Re: crash with synchronized_standby_slots - Mailing list pgsql-hackers

From: Amit Kapila
Subject: Re: crash with synchronized_standby_slots
Msg-id: CAA4eK1+yxew1rh5p4kExPa+utwcBc=KC9Y7NW8w_qJY9bzdfug@mail.gmail.com
In response to: Re: crash with synchronized_standby_slots (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List: pgsql-hackers

On Tue, Dec 3, 2024 at 10:34 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2024-Nov-29, Amit Kapila wrote:
>
> BTW it occurs to me that there might well be some sort of thundering
> herd problem if every process needs to run the check_hook when a SIGHUP
> is broadcast, and they'll all be waiting on that particular lwlock and
> run the same validation locally again and again.  I bet if you have a
> few thousand backends (hi Jakub! [1]) it's problematic.
>

The lock is taken in shared mode, so ideally it shouldn't cause a
problem. If it ever does, we can simply skip that check during
validation: the same validation happens later anyway, during
replication, in StandbySlotsHaveCaughtup(). The check in the check_hook
is mostly there to detect an error in the GUC early.

>  Maybe we need a
> different way to validate the GUC, but I don't know what that might be;
> but doing the validation once and storing the result in shmem might be
> better.
>

What if that particular GUC changes again? We may need some additional
invalidation mechanism.
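
To make the invalidation question concrete, here is a rough sketch of one
possible shape: cache the validation result together with the exact value
that was validated, so that a changed value is a cache miss by
construction. Everything in it is hypothetical (ValidationCache,
cached_validate, expensive_validate, and the placeholder validity rule are
made up for illustration, not PostgreSQL code). It also leaves out the
locking that a real shared-memory struct would need, and it does not cover
the case where the value stays the same but the slots it names change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CACHE_VALUE_MAX 256

/* Hypothetical stand-in for a small struct kept in shared memory. */
typedef struct ValidationCache
{
    bool valid;                     /* does the cache hold a result? */
    bool result;                    /* outcome of the last validation */
    char value[CACHE_VALUE_MAX];    /* setting value that was validated */
} ValidationCache;

/* Counts how often the expensive path runs, so reuse is observable. */
static int expensive_checks_run = 0;

/* Hypothetical expensive check; placeholder rule: non-empty is valid. */
static bool
expensive_validate(const char *value)
{
    expensive_checks_run++;
    return value[0] != '\0';
}

/*
 * Return the validation result for 'value', reusing the cached result
 * when the cached value is identical.  A different value never matches
 * the cache, so a change to the setting invalidates it implicitly.
 * NOTE: no locking here; a real shared struct would need it.
 */
static bool
cached_validate(ValidationCache *cache, const char *value)
{
    assert(cache != NULL && value != NULL);

    if (cache->valid && strcmp(cache->value, value) == 0)
        return cache->result;

    /* Too long to store: validate without caching. */
    if (strlen(value) >= CACHE_VALUE_MAX)
        return expensive_validate(value);

    cache->result = expensive_validate(value);
    strcpy(cache->value, value);
    cache->valid = true;
    return cache->result;
}
```

With this shape, many processes that re-read the same value after a SIGHUP
would run the expensive check once between them, and a later change to the
value triggers exactly one fresh validation.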

--
With Regards,
Amit Kapila.


