Thread: Re: Improve error handling for invalid slots and ensure a same 'inactive_since' time for inactive slots
Re: Improve error handling for invalid slots and ensure a same 'inactive_since' time for inactive slots
From
Peter Smith
Date:
Hi Nisha, Some review comments for the patch v1-0002. ====== src/backend/replication/slot.c 1. + +/* + * Raise an error based on the slot's invalidation cause. + */ +static void +RaiseSlotInvalidationError(ReplicationSlot *slot) +{ + StringInfo err_detail = makeStringInfo(); + + Assert(slot->data.invalidated != RS_INVAL_NONE); + + switch (slot->data.invalidated) + { + case RS_INVAL_WAL_REMOVED: + appendStringInfoString(err_detail, _("This slot has been invalidated because the required WAL has been removed.")); + break; + + case RS_INVAL_HORIZON: + appendStringInfoString(err_detail, _("This slot has been invalidated because the required rows have been removed.")); + break; + + case RS_INVAL_WAL_LEVEL: + /* translator: %s is a GUC variable name */ + appendStringInfo(err_detail, _("This slot has been invalidated because \"%s\" is insufficient for slot."), + "wal_level"); + break; + + case RS_INVAL_NONE: + pg_unreachable(); + } + + ereport(ERROR, + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(slot->data.name)), + errdetail_internal("%s", err_detail->data)); +} AFAIK the _() is a macro for gettext(). So those strings are intended for translation, right? There's also a "/* translator: ..." comment implying the same. OTOH, those strings are only being used by errdetail_internal, whose function comment says "This is exactly like errdetail() except that strings passed to errdetail_internal are not translated...". Isn't there some contradiction here? Perhaps the _() macro is not needed, or perhaps the errdetail_internal() should be errdetail(). ====== Kind Regards, Peter Smith. Fujitsu Australia
Re: Improve error handling for invalid slots and ensure a same 'inactive_since' time for inactive slots
From
Amit Kapila
Date:
On Wed, Jan 29, 2025 at 8:55 AM Peter Smith <smithpb2250@gmail.com> wrote: > > ====== > src/backend/replication/slot.c > > 1. > + > +/* > + * Raise an error based on the slot's invalidation cause. > + */ > +static void > +RaiseSlotInvalidationError(ReplicationSlot *slot) > +{ > + StringInfo err_detail = makeStringInfo(); > + > + Assert(slot->data.invalidated != RS_INVAL_NONE); > + > + switch (slot->data.invalidated) > + { > + case RS_INVAL_WAL_REMOVED: > + appendStringInfoString(err_detail, _("This slot has been invalidated > because the required WAL has been removed.")); > + break; > + > + case RS_INVAL_HORIZON: > + appendStringInfoString(err_detail, _("This slot has been invalidated > because the required rows have been removed.")); > + break; > + > + case RS_INVAL_WAL_LEVEL: > + /* translator: %s is a GUC variable name */ > + appendStringInfo(err_detail, _("This slot has been invalidated > because \"%s\" is insufficient for slot."), > + "wal_level"); > + break; > + > + case RS_INVAL_NONE: > + pg_unreachable(); > + } > + > + ereport(ERROR, > + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("can no longer get changes from replication slot \"%s\"", > + NameStr(slot->data.name)), > + errdetail_internal("%s", err_detail->data)); > +} > > AFAIK the _() is a macro for gettext(). So those strings are intended > for translation, right? There's also a "/* translator: ..." comment > implying the same. > > OTOH, those strings are only being used by errdetail_internal, whose > function comment says "This is exactly like errdetail() except that > strings passed to errdetail_internal are not translated...". > > Isn't there some contradiction here? > Yeah, I also think so but I see some other similar usages. For example, parse_one_reloption() uses errdetail_internal("%s", _(optenum->detailmsg)) : 0)); There are various other places in the code where _( is used for messages in errmsg_internal. > Perhaps the _() macro is not needed, or perhaps the > errdetail_internal() should be errdetail(). > These messages should be with errdetail. We already have these messages being displayed as errdetail in CreateDecodingContext(). Also, after this patch, shouldn't we remove the same error cases in CreateDecodingContext(). There is already an Assert(MyReplicationSlot->data.invalidated == RS_INVAL_NONE) which should be sufficient after this patch to ensure we didn't miss any error cases. -- With Regards, Amit Kapila.
Re: Improve error handling for invalid slots and ensure a same 'inactive_since' time for inactive slots
From
Amit Kapila
Date:
On Wed, Jan 29, 2025 at 3:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > + ereport(ERROR, + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("can no longer get changes from replication slot \"%s\"", + NameStr(slot->data.name)), + errdetail_internal("%s", err_detail->data)); ERROR message appears to be specific to logical slots, it will occur for physical slots as well, so I suggest changing from "can no longer get changes from replication slot" to "can no longer access from replication slot" The errdetail should be used as we have in ReplicationSlotAlter. errdetail("This replication slot has been invalidated due to \"%s\".", SlotInvalidationCauses[MyReplicationSlot->data.invalidated])); You have given the following reason in another thread for not giving a message during acquire in ReplicationSlotAlter: "Because ReplicationSlotAlter() already handles errors immediately after acquiring the slot. It raises errors for invalidated slots and also raises a different error message if the slot is a physical one. So, In case of ALTER, I feel it is okay to acquire the slot first without raising errors and then handle errors in the pre-defined way. Similar immediate error handling is not available at other places." It is better to unify the error in one place rather than at different places. One benefit is error message contents will be the same which is currently not the case. BTW, in the commit message, you only mention the place where we get the message without a patch during logical replication, it is better to mention the timing for physical replication as well. -- With Regards, Amit Kapila.