Re: Fix assert failure when decoding XLOG_PARAMETER_CHANGE on primary - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Fix assert failure when decoding XLOG_PARAMETER_CHANGE on primary |
Date | |
Msg-id | CAD21AoCKuUHWPCLEE_y92Gunr1Z1hjFdutgJdQTSHVej4iPiBA@mail.gmail.com Whole thread Raw |
In response to | Re: Fix assert failure when decoding XLOG_PARAMETER_CHANGE on primary (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Thu, Feb 6, 2025 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Feb 7, 2025 at 12:11 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Wed, Feb 5, 2025 at 10:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, Feb 6, 2025 at 12:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > I've updated the patch accordingly. > > > > > > > > > > Today, again thinking about the proposed fix, I was wondering about > > > the following case. Say, on hot_standby, the user created a logical > > > slot, then shut down hot_standby, turn off the hot_standby flag, and > > > restart standby. This is allowed today but not after the patch. It is > > > possible that the user can promote a non-hot_standby server after its > > > restart in the previous step and can reuse the logical slot. So, is it > > > okay to change this behavior? If not, then we may need to go back to > > > the first fix proposed by you in this thread. > > > > While non hot standby, no one can advance logical slots, meaning the > > standby server ends up accumulating WAL records and also causing the > > bloat on the primary if hot_standby_feedback is enabled. Given this > > outcome, I though that the current patch approach would be reasonable. > > > > Agreed as I also can't think of any case where the user would require > a logical slot on a non-hotstandby server. > > > On the other hand, as you pointed out, it would not be good in terms > > of compatibility as we end up limiting potentially existing use cases. > > And it would be prefectly fine if users promote the standby server > > shortly after starting up with hot_standby = off. > > True but it sounds like there is more harm than benefit. It seems > reasonable to do this on HEAD. Shall we think of doing it differently > in HEAD and back-branches or let's restrict as your v2 patch is doing > and if by any chance users complain about it we can try to fix it in > another way? While the latter would be good for maintainability it would not be advisable to change the behavior back and forth in back branches. I'd like to make it clear what point of v2 patch (restring server startup for pre-existing logical slots) is preferable over the first patch (removing InHotStandby condition)? If the approach of the first patch is reasonably good, we could take it both in HEAD and backbranches. > > > > > But, similarly, is > > there any concerns of my proposed fix? > > > > - if (InRecovery && InHotStandby && > + if (InRecovery && > xlrec.wal_level < WAL_LEVEL_LOGICAL && > > Won't the InRecovery be true even for crash-recovery as well? I think > we need to check standby mode here. I think if we check standby mode here (i.e., adding StandbyMode), we would not be able to invalidate logical slots during archive recovery. That is, in the following scenario, we would hit the same assertion failure: 1. setup the primary and the standby servers (with hot_standby=on). 2. create a logical slot on the standby. 3. on standby, set archive recovery settings (setting restore_command, removing standby.signal, and creating recovery.signal etc.). 4. on primary, lower wal_level to replica and restart. 5. start standby (in archive recovery mode). 6. execute the logical decoding after the archive recovery completes. And, you're right that InRecovery is true even for crash-recovery. But is there any case where we invalidate logical slots due to XLOG_PARAMETER_CHANGE during a crash recovery? Also, I guess InRecovery is always true when we call xlog_redo(). Do we need to check it in the first place? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: