Re: Fix assert failure when decoding XLOG_PARAMETER_CHANGE on primary - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Fix assert failure when decoding XLOG_PARAMETER_CHANGE on primary |
Date | |
Msg-id | CAD21AoA1iVkgjv8OsdNDQ01wsWKgRqvfKmb5rBQqDUjeo7kQvw@mail.gmail.com Whole thread Raw |
In response to | Re: Fix assert failure when decoding XLOG_PARAMETER_CHANGE on primary (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Fix assert failure when decoding XLOG_PARAMETER_CHANGE on primary
|
List | pgsql-hackers |
On Mon, Feb 10, 2025 at 4:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Feb 7, 2025 at 11:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Feb 6, 2025 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > True but it sounds like there is more harm than benefit. It seems > > > reasonable to do this on HEAD. Shall we think of doing it differently > > > in HEAD and back-branches or let's restrict as your v2 patch is doing > > > and if by any chance users complain about it we can try to fix it in > > > another way? > > > > While the latter would be good for maintainability it would not be > > advisable to change the behavior back and forth in back branches. I'd > > like to make it clear what point of v2 patch (restring server startup > > for pre-existing logical slots) is preferable over the first patch > > (removing InHotStandby condition)? > > > > Sorry, I didn't understand your question. I wanted to mean that there is another direction to fix this issue by applying the v1 patch to both HEAD and back-branches. It might be worth considering this option too if the v1 patch's approach is reasonable. > > > If the approach of the first patch > > is reasonably good, we could take it both in HEAD and backbranches. > > > > > > > > > > > > > But, similarly, is > > > > there any concerns of my proposed fix? > > > > > > > > > > - if (InRecovery && InHotStandby && > > > + if (InRecovery && > > > xlrec.wal_level < WAL_LEVEL_LOGICAL && > > > > > > Won't the InRecovery be true even for crash-recovery as well? I think > > > we need to check standby mode here. > > > > I think if we check standby mode here (i.e., adding StandbyMode), we > > would not be able to invalidate logical slots during archive recovery. > > That is, in the following scenario, we would hit the same assertion > > failure: > > > > 1. setup the primary and the standby servers (with hot_standby=on). > > 2. create a logical slot on the standby. > > 3. on standby, set archive recovery settings (setting restore_command, > > removing standby.signal, and creating recovery.signal etc.). > > 4. on primary, lower wal_level to replica and restart. > > 5. start standby (in archive recovery mode). > > 6. execute the logical decoding after the archive recovery completes. > > > > And, you're right that InRecovery is true even for crash-recovery. But > > is there any case where we invalidate logical slots due to > > XLOG_PARAMETER_CHANGE during a crash recovery? > > > > I am not sure but what about the cases where the server crashes > immediately after logging this WAL record and the user reduced the > wal_lvel before the next restart? I think in such a case we may ERROR > out while restoring slots from the disk which will be before we will > try to replay the WAL. I think so too. > However, I am not sure if it is a good idea to > rely on that assumption. Agreed, I'm fine with leaving InRecovery in this condition. I think the point is whether we should add StandbyMode to the condition or not. I think if we do that, we would end up with the same error in the above scenario I described. So does the following condition make sense? if (InRecovery && xlrec.wal_level < WAL_LEVEL_LOGICAL && wal_level >= WAL_LEVEL_LOGICAL) InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL, 0, InvalidOid, InvalidTransactionId); Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: