Thread: Track in pg_replication_slots the reason why slots conflict?
Hi all, (Bertrand and Andres in CC.) While listening at Bertrand's talk about logical decoding on standbys last week at Prague, I got surprised by the fact that we do not reflect in the catalogs the reason why a conflict happened for a slot. There are three of them depending on ReplicationSlotInvalidationCause: - WAL removed. - Invalid horizon. - Insufficient WAL level. This idea has been hinted around here on the original thread that led to be87200efd93: https://www.postgresql.org/message-id/d7547f2c-a0c3-6aad-b631-b7ed5efaf298@gmail.com However v44 has picked up the idea of a boolean: https://www.postgresql.org/message-id/bdc49e0b-cd39-bcd3-e391-b0ad6e48b5cf@gmail.com ReplicationSlotCtl holds this information, so couldn't it be useful for monitoring purposes to know why a slot got invalidated and add a column to pg_get_replication_slots()? This could just be an extra text conflicting_reason, defaulting to NULL when there's nothing to see. Thoughts? -- Michael
Attachment
On Thu, Dec 21, 2023 at 5:51 AM Michael Paquier <michael@paquier.xyz> wrote: > > While listening at Bertrand's talk about logical decoding on standbys > last week at Prague, I got surprised by the fact that we do not > reflect in the catalogs the reason why a conflict happened for a slot. > There are three of them depending on ReplicationSlotInvalidationCause: > - WAL removed. > - Invalid horizon. > - Insufficient WAL level. > The invalidation cause is also required by one of the features being discussed "Synchronize slots from primary to standby" [1] and there is already a thread to discuss the same [2]. As that thread started yesterday only, you may not have noticed it. Currently, the proposal is to expose it via a function but we can extend it to also display via view, feel free to share your opinion on that thread. [1] - https://www.postgresql.org/message-id/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com [2] - https://www.postgresql.org/message-id/CAJpy0uBpr0ym12%2B0mXpjcRFA6N%3DanX%2BYk9aGU4EJhHNu%3DfWykQ%40mail.gmail.com -- With Regards, Amit Kapila.
On Thu, Dec 21, 2023 at 08:20:16AM +0530, Amit Kapila wrote: > The invalidation cause is also required by one of the features being > discussed "Synchronize slots from primary to standby" [1] and there is > already a thread to discuss the same [2]. As that thread started > yesterday only, you may not have noticed it. Currently, the proposal > is to expose it via a function but we can extend it to also display > via view, feel free to share your opinion on that thread. > > [1] - https://www.postgresql.org/message-id/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com > [2] - https://www.postgresql.org/message-id/CAJpy0uBpr0ym12%2B0mXpjcRFA6N%3DanX%2BYk9aGU4EJhHNu%3DfWykQ%40mail.gmail.com Ah thanks, missed this one. This cannot use a separate function, actually, and there is a good reason for that that has not been mentioned. I'll jump there. -- Michael
Attachment
Hi, On 2023-12-21 09:21:04 +0900, Michael Paquier wrote: > While listening at Bertrand's talk about logical decoding on standbys > last week at Prague, I got surprised by the fact that we do not > reflect in the catalogs the reason why a conflict happened for a slot. > There are three of them depending on ReplicationSlotInvalidationCause: > - WAL removed. > - Invalid horizon. > - Insufficient WAL level. It should be extremely rare to hit any of these other than "WAL removed", so I'm not sure it's worth adding interface complexity to show them. > ReplicationSlotCtl holds this information, so couldn't it be useful > for monitoring purposes to know why a slot got invalidated and add a > column to pg_get_replication_slots()? This could just be an extra > text conflicting_reason, defaulting to NULL when there's nothing to > see. Extra columns aren't free from a usability perspective. IFF we do something, I think it should be a single column with a cause. Greetings, Andres Freund
On Thu, Dec 21, 2023 at 3:10 PM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2023-12-21 09:21:04 +0900, Michael Paquier wrote: > > While listening at Bertrand's talk about logical decoding on standbys > > last week at Prague, I got surprised by the fact that we do not > > reflect in the catalogs the reason why a conflict happened for a slot. > > There are three of them depending on ReplicationSlotInvalidationCause: > > - WAL removed. > > - Invalid horizon. > > - Insufficient WAL level. > > It should be extremely rare to hit any of these other than "WAL removed", so > I'm not sure it's worth adding interface complexity to show them. > > > > ReplicationSlotCtl holds this information, so couldn't it be useful > > for monitoring purposes to know why a slot got invalidated and add a > > column to pg_get_replication_slots()? This could just be an extra > > text conflicting_reason, defaulting to NULL when there's nothing to > > see. > > Extra columns aren't free from a usability perspective. IFF we do something, I > think it should be a single column with a cause. Thanks for the feedback. But do you mean that we replace existing 'conflicting' column with 'cause' in both the function and view (pg_get_replication_slots() and pg_replication_slots)? Or do you mean that we expose 'cause' from pg_get_replication_slots() and use that to display 'conflicting' in pg_replication_slots view? And if we plan to return/display cause from either function or view, then shall it be enum 'ReplicationSlotInvalidationCause' or description/text corresponding to enum? In the other feature being discussed "Synchronize slots from primary to standby" [1] , there is a requirement to replicate invalidation cause of slot from the primary to standby and thus it is needed in enum form there. And thus there was a suggestion earlier to have the function return enum-value and let the view display it as text/description to the user. So kindly let us know your thoughts. [1] - https://www.postgresql.org/message-id/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com thanks Shveta
Hi, On 2023-12-21 16:08:48 +0530, shveta malik wrote: > On Thu, Dec 21, 2023 at 3:10 PM Andres Freund <andres@anarazel.de> wrote: > > > > Hi, > > > > On 2023-12-21 09:21:04 +0900, Michael Paquier wrote: > > > While listening at Bertrand's talk about logical decoding on standbys > > > last week at Prague, I got surprised by the fact that we do not > > > reflect in the catalogs the reason why a conflict happened for a slot. > > > There are three of them depending on ReplicationSlotInvalidationCause: > > > - WAL removed. > > > - Invalid horizon. > > > - Insufficient WAL level. > > > > It should be extremely rare to hit any of these other than "WAL removed", so > > I'm not sure it's worth adding interface complexity to show them. > > > > > > > ReplicationSlotCtl holds this information, so couldn't it be useful > > > for monitoring purposes to know why a slot got invalidated and add a > > > column to pg_get_replication_slots()? This could just be an extra > > > text conflicting_reason, defaulting to NULL when there's nothing to > > > see. > > > > Extra columns aren't free from a usability perspective. IFF we do something, I > > think it should be a single column with a cause. > > Thanks for the feedback. But do you mean that we replace existing > 'conflicting' column with 'cause' in both the function and view > (pg_get_replication_slots() and pg_replication_slots)? Or do you mean > that we expose 'cause' from pg_get_replication_slots() and use that to > display 'conflicting' in pg_replication_slots view? I'm not entirely sure I understand the difference - just whether we add one new column or replace the existing 'conflicting' column? I can see arguments for either. A conflicting column where NULL indicates no conflict, and other values indicate the reason for the conflict, doesn't seem too bad. > And if we plan to return/display cause from either function or view, > then shall it be enum 'ReplicationSlotInvalidationCause' or > description/text corresponding to enum? We clearly can't just expose the numerical value for a C enum. So it has to be converted to something SQL representable. > In the other feature being discussed "Synchronize slots from primary > to standby" [1] , there is a requirement to replicate invalidation > cause of slot from the primary to standby and thus it is needed in > enum form there. And thus there was a suggestion earlier to have the > function return enum-value and let the view display it as > text/description to the user. So kindly let us know your thoughts. > > [1] - https://www.postgresql.org/message-id/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com Can you point me to a more specific message for that requirement? It seems pretty odd to me. Your link goes to the top of a 400 message thread, I don't have time to find one specific design point in that... Greetings, Andres
On Thu, Dec 21, 2023 at 5:04 PM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2023-12-21 16:08:48 +0530, shveta malik wrote: > > On Thu, Dec 21, 2023 at 3:10 PM Andres Freund <andres@anarazel.de> wrote: > > > > > > Hi, > > > > > > On 2023-12-21 09:21:04 +0900, Michael Paquier wrote: > > > > While listening at Bertrand's talk about logical decoding on standbys > > > > last week at Prague, I got surprised by the fact that we do not > > > > reflect in the catalogs the reason why a conflict happened for a slot. > > > > There are three of them depending on ReplicationSlotInvalidationCause: > > > > - WAL removed. > > > > - Invalid horizon. > > > > - Insufficient WAL level. > > > > > > It should be extremely rare to hit any of these other than "WAL removed", so > > > I'm not sure it's worth adding interface complexity to show them. > > > > > > > > > > ReplicationSlotCtl holds this information, so couldn't it be useful > > > > for monitoring purposes to know why a slot got invalidated and add a > > > > column to pg_get_replication_slots()? This could just be an extra > > > > text conflicting_reason, defaulting to NULL when there's nothing to > > > > see. > > > > > > Extra columns aren't free from a usability perspective. IFF we do something, I > > > think it should be a single column with a cause. > > > > Thanks for the feedback. But do you mean that we replace existing > > 'conflicting' column with 'cause' in both the function and view > > (pg_get_replication_slots() and pg_replication_slots)? Or do you mean > > that we expose 'cause' from pg_get_replication_slots() and use that to > > display 'conflicting' in pg_replication_slots view? > > I'm not entirely sure I understand the difference - just whether we add one > new column or replace the existing 'conflicting' column? I can see arguments > for either. A conflicting column where NULL indicates no conflict, and other > values indicate the reason for the conflict, doesn't seem too bad. > > > > And if we plan to return/display cause from either function or view, > > then shall it be enum 'ReplicationSlotInvalidationCause' or > > description/text corresponding to enum? > > We clearly can't just expose the numerical value for a C enum. So it has to be > converted to something SQL representable. > > > > In the other feature being discussed "Synchronize slots from primary > > to standby" [1] , there is a requirement to replicate invalidation > > cause of slot from the primary to standby and thus it is needed in > > enum form there. And thus there was a suggestion earlier to have the > > function return enum-value and let the view display it as > > text/description to the user. So kindly let us know your thoughts. > > > > [1] - https://www.postgresql.org/message-id/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com > > Can you point me to a more specific message for that requirement? It seems > pretty odd to me. Your link goes to the top of a 400 message thread, I don't > have time to find one specific design point in that... It is currently implemented there as a new function 'pg_get_slot_invalidation_cause()' without changing existing view pg_replication_slots. (See 2.1 in [1] where it was introduced). Then it was suggested in [2] to fork a new thread as it makes sense to have it independent of this slot-synchronization feature. The new thread forked is [3]. In that thread, the issues in having a new function pg_get_slot_invalidation_cause() are discussed and also we came to know about this very thread that started the next day. [1]: https://www.postgresql.org/message-id/CAJpy0uAuzbzvcjpnzFTiWuDBctnH-SDZC6AZabPX65x9GWBrjQ%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CAA4eK1K0KCDNtpDyUKucMRdyK-5KdrCRWakCpHEdHT9muAiEOw%40mail.gmail.com [3]: https://www.postgresql.org/message-id/CAJpy0uBpr0ym12%2B0mXpjcRFA6N%3DanX%2BYk9aGU4EJhHNu%3DfWykQ%40mail.gmail.com thanks Shveta
On Thu, Dec 21, 2023 at 5:05 PM Andres Freund <andres@anarazel.de> wrote: > > On 2023-12-21 16:08:48 +0530, shveta malik wrote: > > On Thu, Dec 21, 2023 at 3:10 PM Andres Freund <andres@anarazel.de> wrote: > > > > > > Extra columns aren't free from a usability perspective. IFF we do something, I > > > think it should be a single column with a cause. > > > > Thanks for the feedback. But do you mean that we replace existing > > 'conflicting' column with 'cause' in both the function and view > > (pg_get_replication_slots() and pg_replication_slots)? Or do you mean > > that we expose 'cause' from pg_get_replication_slots() and use that to > > display 'conflicting' in pg_replication_slots view? > > I'm not entirely sure I understand the difference - just whether we add one > new column or replace the existing 'conflicting' column? I can see arguments > for either. > Agreed. I think the argument against replacing the existing 'conflicting' column is that there is a chance that it is being used by some monitoring script which I guess shouldn't be a big deal to change. So, if we don't see that as a problem, I would prefer to have a single column with conflict reason where one of its values indicates there is no conflict. A conflicting column where NULL indicates no conflict, and other > values indicate the reason for the conflict, doesn't seem too bad. > This is fine too. > > > And if we plan to return/display cause from either function or view, > > then shall it be enum 'ReplicationSlotInvalidationCause' or > > description/text corresponding to enum? > > We clearly can't just expose the numerical value for a C enum. So it has to be > converted to something SQL representable. > We can return int2 value from the function pg_get_replication_slots() and then use that to display a string in the view pg_replication_slots. -- With Regards, Amit Kapila.
Hi, On Thu, Dec 21, 2023 at 07:55:51PM +0530, Amit Kapila wrote: > On Thu, Dec 21, 2023 at 5:05 PM Andres Freund <andres@anarazel.de> wrote: > > I'm not entirely sure I understand the difference - just whether we add one > > new column or replace the existing 'conflicting' column? I can see arguments > > for either. > > > > Agreed. I think the argument against replacing the existing > 'conflicting' column is that there is a chance that it is being used > by some monitoring script which I guess shouldn't be a big deal to > change. So, if we don't see that as a problem, I would prefer to have > a single column with conflict reason where one of its values indicates > there is no conflict. +1 > A conflicting column where NULL indicates no conflict, and other > > values indicate the reason for the conflict, doesn't seem too bad. > > > > This is fine too. +1 > > > > > And if we plan to return/display cause from either function or view, > > > then shall it be enum 'ReplicationSlotInvalidationCause' or > > > description/text corresponding to enum? > > > > We clearly can't just expose the numerical value for a C enum. So it has to be > > converted to something SQL representable. > > > > We can return int2 value from the function pg_get_replication_slots() > and then use that to display a string in the view > pg_replication_slots. Yeah, and in the sync slot related work we could use pg_get_replication_slots() then to get the enum. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 2023-12-21 19:55:51 +0530, Amit Kapila wrote: > On Thu, Dec 21, 2023 at 5:05 PM Andres Freund <andres@anarazel.de> wrote: > > We clearly can't just expose the numerical value for a C enum. So it has to be > > converted to something SQL representable. > > > > We can return int2 value from the function pg_get_replication_slots() > and then use that to display a string in the view > pg_replication_slots. I strongly dislike that pattern. It just leads to complicated views - and doesn't provide a single benefit that I am aware of. It's much bettter to simply populate the text version in pg_get_replication_slots().
On Thu, Dec 21, 2023 at 07:26:56AM -0800, Andres Freund wrote: > On 2023-12-21 19:55:51 +0530, Amit Kapila wrote: >> We can return int2 value from the function pg_get_replication_slots() >> and then use that to display a string in the view >> pg_replication_slots. > > I strongly dislike that pattern. It just leads to complicated views - and > doesn't provide a single benefit that I am aware of. It's much bettter to > simply populate the text version in pg_get_replication_slots(). I agree that this is a better integration in the view, and that's what I would do FWIW. Amit, how much of a problem would it be to do a text->enum mapping when synchronizing the slots from a primary to a standby? Sure you could have a system function that does some of the mapping work, but I am not sure what's the best integration when it comes to the other patch. -- Michael
Attachment
On Fri, Dec 22, 2023 at 5:00 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Dec 21, 2023 at 07:26:56AM -0800, Andres Freund wrote: > > On 2023-12-21 19:55:51 +0530, Amit Kapila wrote: > >> We can return int2 value from the function pg_get_replication_slots() > >> and then use that to display a string in the view > >> pg_replication_slots. > > > > I strongly dislike that pattern. It just leads to complicated views - and > > doesn't provide a single benefit that I am aware of. It's much bettter to > > simply populate the text version in pg_get_replication_slots(). > > I agree that this is a better integration in the view, and that's what > I would do FWIW. > > Amit, how much of a problem would it be to do a text->enum mapping > when synchronizing the slots from a primary to a standby? > There is no problem as such in that. We were trying to see if there is a more convenient way but let's move by having cause as text from both the function and view as that seems to be a preferred way. -- With Regards, Amit Kapila.
On Thu, Dec 21, 2023 at 8:21 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Thu, Dec 21, 2023 at 07:55:51PM +0530, Amit Kapila wrote: > > On Thu, Dec 21, 2023 at 5:05 PM Andres Freund <andres@anarazel.de> wrote: > > > I'm not entirely sure I understand the difference - just whether we add one > > > new column or replace the existing 'conflicting' column? I can see arguments > > > for either. > > > > > > > Agreed. I think the argument against replacing the existing > > 'conflicting' column is that there is a chance that it is being used > > by some monitoring script which I guess shouldn't be a big deal to > > change. So, if we don't see that as a problem, I would prefer to have > > a single column with conflict reason where one of its values indicates > > there is no conflict. > > +1 > Does anyone else have a preference on whether to change the existing column or add a new one? -- With Regards, Amit Kapila.
On Tue, Dec 26, 2023 at 08:44:44AM +0530, Amit Kapila wrote: > Does anyone else have a preference on whether to change the existing > column or add a new one? Just to be clear here, I'd vote for replacing the existing boolean with a text. -- Michael
Attachment
Hi, On Tue, Dec 26, 2023 at 05:23:56PM +0900, Michael Paquier wrote: > On Tue, Dec 26, 2023 at 08:44:44AM +0530, Amit Kapila wrote: > > Does anyone else have a preference on whether to change the existing > > column or add a new one? > > Just to be clear here, I'd vote for replacing the existing boolean > with a text. Same here, I'd vote to avoid 2 columns having the same "meaning". Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Thu, 21 Dec 2023 at 09:26, Amit Kapila <amit.kapila16@gmail.com> wrote:
A conflicting column where NULL indicates no conflict, and other
> values indicate the reason for the conflict, doesn't seem too bad.
>
This is fine too.
I prefer this option. There is precedent for doing it this way, for example in pg_stat_activity.wait_event_type.
The most common test of this field is likely to be "is there a conflict" and it's better to write this as "[fieldname] IS NOT NULL" than to introduce a magic constant. Also, it makes clear to future maintainers that this field has one purpose: saying what type of conflict there is, if any. If we find ourselves wanting to record a new non-conflict status (no idea what that could be: "almost conflict"? "probably conflict soon"?) there would be less temptation to break existing tests for "is there a conflict".
On Tue, Dec 26, 2023 at 7:35 PM Isaac Morland <isaac.morland@gmail.com> wrote: > > On Thu, 21 Dec 2023 at 09:26, Amit Kapila <amit.kapila16@gmail.com> wrote: > >> >> A conflicting column where NULL indicates no conflict, and other >> > values indicate the reason for the conflict, doesn't seem too bad. >> > >> >> This is fine too. > > > I prefer this option. There is precedent for doing it this way, for example in pg_stat_activity.wait_event_type. > > The most common test of this field is likely to be "is there a conflict" and it's better to write this as "[fieldname]IS NOT NULL" than to introduce a magic constant. Also, it makes clear to future maintainers that this field hasone purpose: saying what type of conflict there is, if any. If we find ourselves wanting to record a new non-conflictstatus (no idea what that could be: "almost conflict"? "probably conflict soon"?) there would be less temptationto break existing tests for "is there a conflict". +1 on using "[fieldname] IS NOT NULL" to find "is there a conflict" PFA the patch which attempts to implement this. This patch changes the existing 'conflicting' field to 'conflicting_cause' in pg_replication_slots. This new field is always NULL for physical slots (like the previous field conflicting), as well as for those logical slots which are not invalidated. thanks Shveta
Attachment
On Wed, Dec 27, 2023 at 3:08 PM shveta malik <shveta.malik@gmail.com> wrote: > > PFA the patch which attempts to implement this. > > This patch changes the existing 'conflicting' field to > 'conflicting_cause' in pg_replication_slots. > This name sounds a bit odd to me, would it be better to name it as conflict_cause? A few other minor comments: ========================= * + <structfield>conflicting_cause</structfield> <type>text</type> + </para> + <para> + Cause if this logical slot conflicted with recovery (and so is now + invalidated). It is always NULL for physical slots, as well as for + those logical slots which are not invalidated. Possible values are: Would it better to use description as follows:" Cause of logical slot's conflict with recovery. It is always NULL for physical slots, as well as for logical slots which are not invalidated. The non-NULL values indicate that the slot is marked as invalidated. Possible values are: .." * $res = $node_standby->safe_psql( 'postgres', qq( - select bool_and(conflicting) from pg_replication_slots;)); + select bool_and(conflicting) from + (select conflicting_cause is not NULL as conflicting from pg_replication_slots);)); Won't the query "select conflicting_cause is not NULL as conflicting from pg_replication_slots" can return false even for physical slots and then impact the result of the main query whereas the original query would seem to be simply ignoring physical slots? If this observation is correct then you might want to add a 'slot_type' condition in the new query. * After introducing check_slots_conflicting_cause(), do we need to have check_slots_conflicting_status()? Aren't both checking the same thing? -- With Regards, Amit Kapila.
On Wed, Dec 27, 2023 at 4:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Dec 27, 2023 at 3:08 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > PFA the patch which attempts to implement this. > > > > This patch changes the existing 'conflicting' field to > > 'conflicting_cause' in pg_replication_slots. > > > > This name sounds a bit odd to me, would it be better to name it as > conflict_cause? > > A few other minor comments: > ========================= > * > + <structfield>conflicting_cause</structfield> <type>text</type> > + </para> > + <para> > + Cause if this logical slot conflicted with recovery (and so is now > + invalidated). It is always NULL for physical slots, as well as for > + those logical slots which are not invalidated. Possible values are: > > Would it better to use description as follows:" Cause of logical > slot's conflict with recovery. It is always NULL for physical slots, > as well as for logical slots which are not invalidated. The non-NULL > values indicate that the slot is marked as invalidated. Possible > values are: > .." > > * > $res = $node_standby->safe_psql( > 'postgres', qq( > - select bool_and(conflicting) from pg_replication_slots;)); > + select bool_and(conflicting) from > + (select conflicting_cause is not NULL as conflicting from > pg_replication_slots);)); > > Won't the query "select conflicting_cause is not NULL as conflicting > from pg_replication_slots" can return false even for physical slots > and then impact the result of the main query whereas the original > query would seem to be simply ignoring physical slots? If this > observation is correct then you might want to add a 'slot_type' > condition in the new query. > > * After introducing check_slots_conflicting_cause(), do we need to > have check_slots_conflicting_status()? Aren't both checking the same > thing? I think it is needed for the case where we want to check that there is no conflict. # Verify slots are reported as non conflicting in pg_replication_slots check_slots_conflicting_status(0); For the cases where there is conflict, I think check_slots_conflicting_cause() can replace check_slots_conflicting_status(). thanks Shveta
On Thu, Dec 28, 2023 at 10:16 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Dec 27, 2023 at 4:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Dec 27, 2023 at 3:08 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > PFA the patch which attempts to implement this. > > > > > > This patch changes the existing 'conflicting' field to > > > 'conflicting_cause' in pg_replication_slots. > > > > > > > This name sounds a bit odd to me, would it be better to name it as > > conflict_cause? > > > > A few other minor comments: > > ========================= Thanks for the feedback Amit. > > * > > + <structfield>conflicting_cause</structfield> <type>text</type> > > + </para> > > + <para> > > + Cause if this logical slot conflicted with recovery (and so is now > > + invalidated). It is always NULL for physical slots, as well as for > > + those logical slots which are not invalidated. Possible values are: > > > > Would it better to use description as follows:" Cause of logical > > slot's conflict with recovery. It is always NULL for physical slots, > > as well as for logical slots which are not invalidated. The non-NULL > > values indicate that the slot is marked as invalidated. Possible > > values are: > > .." > > > > * > > $res = $node_standby->safe_psql( > > 'postgres', qq( > > - select bool_and(conflicting) from pg_replication_slots;)); > > + select bool_and(conflicting) from > > + (select conflicting_cause is not NULL as conflicting from > > pg_replication_slots);)); > > > > Won't the query "select conflicting_cause is not NULL as conflicting > > from pg_replication_slots" can return false even for physical slots > > and then impact the result of the main query whereas the original > > query would seem to be simply ignoring physical slots? If this > > observation is correct then you might want to add a 'slot_type' > > condition in the new query. > > > > * After introducing check_slots_conflicting_cause(), do we need to > > have check_slots_conflicting_status()? Aren't both checking the same > > thing? > > I think it is needed for the case where we want to check that there is > no conflict. > > # Verify slots are reported as non conflicting in pg_replication_slots > check_slots_conflicting_status(0); > > For the cases where there is conflict, I think > check_slots_conflicting_cause() can replace > check_slots_conflicting_status(). I have removed check_slots_conflicting_status() and where it was needed to check non-conflicting, I have added a simple query. PFA the v2-patch with all your comments addressed. thanks Shveta
Attachment
On Thu, Dec 28, 2023 at 2:58 PM shveta malik <shveta.malik@gmail.com> wrote: > > PFA the v2-patch with all your comments addressed. > Does anyone have a preference for a column name? The options on the table are conflict_cause, conflicting_cause, conflict_reason. Any others? I was checking docs for similar usage and found "pg_event_trigger_table_rewrite_reason" function, so based on that we can even go with conflict_reason. -- With Regards, Amit Kapila.
On Fri, Dec 29, 2023 at 09:20:52AM +0530, Amit Kapila wrote: > Does anyone have a preference for a column name? The options on the > table are conflict_cause, conflicting_cause, conflict_reason. Any > others? I was checking docs for similar usage and found > "pg_event_trigger_table_rewrite_reason" function, so based on that we > can even go with conflict_reason. "conflict_reason" sounds like the natural choice here. -- Michael
Attachment
On Fri, Dec 29, 2023 at 3:35 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Dec 29, 2023 at 09:20:52AM +0530, Amit Kapila wrote: > > Does anyone have a preference for a column name? The options on the > > table are conflict_cause, conflicting_cause, conflict_reason. Any > > others? I was checking docs for similar usage and found > > "pg_event_trigger_table_rewrite_reason" function, so based on that we > > can even go with conflict_reason. > > "conflict_reason" sounds like the natural choice here. Do we have more comments on the patch apart from column_name? thanks Shveta
On Mon, Jan 1, 2024 at 9:14 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Fri, Dec 29, 2023 at 3:35 PM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Fri, Dec 29, 2023 at 09:20:52AM +0530, Amit Kapila wrote: > > > Does anyone have a preference for a column name? The options on the > > > table are conflict_cause, conflicting_cause, conflict_reason. Any > > > others? I was checking docs for similar usage and found > > > "pg_event_trigger_table_rewrite_reason" function, so based on that we > > > can even go with conflict_reason. > > > > "conflict_reason" sounds like the natural choice here. > > Do we have more comments on the patch apart from column_name? > > thanks > Shveta PFA v3 after changing column name to 'conflict_reason' thanks Shveta
Attachment
On Mon, Jan 1, 2024 at 12:32 PM shveta malik <shveta.malik@gmail.com> wrote: > > PFA v3 after changing column name to 'conflict_reason' > Few minor comments: =================== 1. + <para> + <literal>wal_removed</literal> = required WAL has been removed. + </para> + </listitem> + <listitem> + <para> + <literal>rows_removed</literal> = required rows have been removed. + </para> + </listitem> + <listitem> + <para> + <literal>wal_level_insufficient</literal> = wal_level insufficient on the primary server. + </para> Should we use the same style to write the description as we are using for the wal_status column? For example, <literal>wal_removed</literal> means that the required WAL has been removed. 2. + <para> + The reason of logical slot's conflict with recovery. My grammar tool says it should be: "The reason for the logical slot's conflict with recovery." Other than these minor comments, the patch looks good to me. -- With Regards, Amit Kapila.
On Mon, Jan 1, 2024 at 4:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Jan 1, 2024 at 12:32 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > PFA v3 after changing column name to 'conflict_reason' > > > > Few minor comments: > =================== > 1. > + <para> > + <literal>wal_removed</literal> = required WAL has been removed. > + </para> > + </listitem> > + <listitem> > + <para> > + <literal>rows_removed</literal> = required rows have been removed. > + </para> > + </listitem> > + <listitem> > + <para> > + <literal>wal_level_insufficient</literal> = wal_level > insufficient on the primary server. > + </para> > > Should we use the same style to write the description as we are using > for the wal_status column? For example, <literal>wal_removed</literal> > means that the required WAL has been removed. > > 2. > + <para> > + The reason of logical slot's conflict with recovery. > > My grammar tool says it should be: "The reason for the logical slot's > conflict with recovery." > > Other than these minor comments, the patch looks good to me. PFA v4 which addresses the doc comments. thanks Shveta
Attachment
On Mon, Jan 1, 2024 at 5:17 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Jan 1, 2024 at 4:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Jan 1, 2024 at 12:32 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > PFA v3 after changing column name to 'conflict_reason' > > > > > > > Few minor comments: > > =================== > > 1. > > + <para> > > + <literal>wal_removed</literal> = required WAL has been removed. > > + </para> > > + </listitem> > > + <listitem> > > + <para> > > + <literal>rows_removed</literal> = required rows have been removed. > > + </para> > > + </listitem> > > + <listitem> > > + <para> > > + <literal>wal_level_insufficient</literal> = wal_level > > insufficient on the primary server. > > + </para> > > > > Should we use the same style to write the description as we are using > > for the wal_status column? For example, <literal>wal_removed</literal> > > means that the required WAL has been removed. > > > > 2. > > + <para> > > + The reason of logical slot's conflict with recovery. > > > > My grammar tool says it should be: "The reason for the logical slot's > > conflict with recovery." > > > > Other than these minor comments, the patch looks good to me. > > PFA v4 which addresses the doc comments. Please ignore the previous patch and PFA new v4 (v4_2). The only change from the earlier v4 is the subject correction in commit msg. thanks Shveta
Attachment
On Mon, Jan 1, 2024 at 5:24 PM shveta malik <shveta.malik@gmail.com> wrote: > > Please ignore the previous patch and PFA new v4 (v4_2). The only > change from the earlier v4 is the subject correction in commit msg. > The patch looks good to me. I have slightly changed one of the descriptions in the docs and also modified the commit message a bit. I will push this after two days unless there are any more comments/suggestions. -- With Regards, Amit Kapila.
Attachment
Hi, On Tue, Jan 02, 2024 at 10:35:59AM +0530, Amit Kapila wrote: > On Mon, Jan 1, 2024 at 5:24 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > Please ignore the previous patch and PFA new v4 (v4_2). The only > > change from the earlier v4 is the subject correction in commit msg. > > Thanks for the patch! > The patch looks good to me. I have slightly changed one of the > descriptions in the docs and also modified the commit message a bit. I > will push this after two days unless there are any more > comments/suggestions. > The patch LGTM, I just have a Nit comment: + <literal>wal_level_insufficient</literal> means that the + <xref linkend="guc-wal-level"/> is insufficient on the primary + server. I'd prefer "primary_wal_level" instead of "wal_level_insufficient". I think it's better to directly mention it is linked to the primary (without the need to refer to the documentation) and that the fact that it is "insufficient" is more or less implicit. Basically I think that with "primary_wal_level" one would need to refer to the doc less frequently than with "wal_level_insufficient". But again, that's a Nit so feel free to ignore. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Tue, Jan 02, 2024 at 02:07:58PM +0000, Bertrand Drouvot wrote: > + <literal>wal_level_insufficient</literal> means that the > + <xref linkend="guc-wal-level"/> is insufficient on the primary > + server. > > I'd prefer "primary_wal_level" instead of "wal_level_insufficient". I think it's > better to directly mention it is linked to the primary (without the need to refer > to the documentation) and that the fact that it is "insufficient" is more or less > implicit. > > Basically I think that with "primary_wal_level" one would need to refer to the doc > less frequently than with "wal_level_insufficient". I can see your point, but wal_level_insufficient speaks a bit more to me because of its relationship with the GUC setting. Something like wal_level_insufficient_on_primary may speak better, but that's also quite long. I'm OK with what the patch does. + as invalidated. Possible values are: + <itemizedlist spacing="compact"> Higher-level nit: indentation seems to be one space off here. -- Michael
Attachment
On Wed, Jan 3, 2024 at 7:10 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Jan 02, 2024 at 02:07:58PM +0000, Bertrand Drouvot wrote: > > + <literal>wal_level_insufficient</literal> means that the > > + <xref linkend="guc-wal-level"/> is insufficient on the primary > > + server. > > > > I'd prefer "primary_wal_level" instead of "wal_level_insufficient". I think it's > > better to directly mention it is linked to the primary (without the need to refer > > to the documentation) and that the fact that it is "insufficient" is more or less > > implicit. > > > > Basically I think that with "primary_wal_level" one would need to refer to the doc > > less frequently than with "wal_level_insufficient". > > I can see your point, but wal_level_insufficient speaks a bit more to > me because of its relationship with the GUC setting. Something like > wal_level_insufficient_on_primary may speak better, but that's also > quite long. I'm OK with what the patch does. > Thanks, I also prefer "wal_level_insufficient". To me "primary_wal_level" sounds more along the lines of a GUC name than the conflict_reason. The other names that come to mind are "wal_level_lower_than_required", "wal_level_lower", "wal_level_lesser_than_required", "wal_level_lesser" but I feel "wal_level_insufficient" sounds better than these. Having said that, I am open to any of these or better options for this conflict_reason. > + as invalidated. Possible values are: > + <itemizedlist spacing="compact"> > Higher-level nit: indentation seems to be one space off here. > Thanks, fixed in the attached patch. -- With Regards, Amit Kapila.
Attachment
Hi, On Wed, Jan 03, 2024 at 08:53:44AM +0530, Amit Kapila wrote: > On Wed, Jan 3, 2024 at 7:10 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Tue, Jan 02, 2024 at 02:07:58PM +0000, Bertrand Drouvot wrote: > > > + <literal>wal_level_insufficient</literal> means that the > > > + <xref linkend="guc-wal-level"/> is insufficient on the primary > > > + server. > > > > > > I'd prefer "primary_wal_level" instead of "wal_level_insufficient". I think it's > > > better to directly mention it is linked to the primary (without the need to refer > > > to the documentation) and that the fact that it is "insufficient" is more or less > > > implicit. > > > > > > Basically I think that with "primary_wal_level" one would need to refer to the doc > > > less frequently than with "wal_level_insufficient". > > > > I can see your point, but wal_level_insufficient speaks a bit more to > > me because of its relationship with the GUC setting. Something like > > wal_level_insufficient_on_primary may speak better, but that's also > > quite long. I'm OK with what the patch does. > > > > Thanks, I also prefer "wal_level_insufficient". To me > "primary_wal_level" sounds more along the lines of a GUC name than the > conflict_reason. The other names that come to mind are > "wal_level_lower_than_required", "wal_level_lower", > "wal_level_lesser_than_required", "wal_level_lesser" but I feel > "wal_level_insufficient" sounds better than these. Having said that, I > am open to any of these or better options for this conflict_reason. > Thank you both for giving your thoughts on it, I got your points and I'm OK with "wal_level_insufficient". Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Wed, 3 Jan 2024 at 08:54, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Jan 3, 2024 at 7:10 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Tue, Jan 02, 2024 at 02:07:58PM +0000, Bertrand Drouvot wrote: > > > + <literal>wal_level_insufficient</literal> means that the > > > + <xref linkend="guc-wal-level"/> is insufficient on the primary > > > + server. > > > > > > I'd prefer "primary_wal_level" instead of "wal_level_insufficient". I think it's > > > better to directly mention it is linked to the primary (without the need to refer > > > to the documentation) and that the fact that it is "insufficient" is more or less > > > implicit. > > > > > > Basically I think that with "primary_wal_level" one would need to refer to the doc > > > less frequently than with "wal_level_insufficient". > > > > I can see your point, but wal_level_insufficient speaks a bit more to > > me because of its relationship with the GUC setting. Something like > > wal_level_insufficient_on_primary may speak better, but that's also > > quite long. I'm OK with what the patch does. > > > > Thanks, I also prefer "wal_level_insufficient". To me > "primary_wal_level" sounds more along the lines of a GUC name than the > conflict_reason. The other names that come to mind are > "wal_level_lower_than_required", "wal_level_lower", > "wal_level_lesser_than_required", "wal_level_lesser" but I feel > "wal_level_insufficient" sounds better than these. Having said that, I > am open to any of these or better options for this conflict_reason. > > > + as invalidated. Possible values are: > > + <itemizedlist spacing="compact"> > > Higher-level nit: indentation seems to be one space off here. > > > > Thanks, fixed in the attached patch. I have marked the commitfest entry to the committed state as the patch is committed. Regards, Vignesh