Thread: logicalrep_message_type throws an error
Hi All, logicalrep_message_type() is used to convert logical message type code into name while prepared error context or details. Thus when this function is called probably an ERROR is already raised. If logicalrep_message_type() gets an unknown message type, it will throw an error, which will suppress the error for which we are building context or details. That's not useful. I think instead logicalrep_message_type() should return "unknown" when it encounters an unknown message type and let the original error message be thrown as is. -- Best Wishes, Ashutosh Bapat
On Mon, Jul 3, 2023, at 7:30 AM, Ashutosh Bapat wrote:
logicalrep_message_type() is used to convert logical message type codeinto name while prepared error context or details. Thus when thisfunction is called probably an ERROR is already raised. Iflogicalrep_message_type() gets an unknown message type, it will throwan error, which will suppress the error for which we are buildingcontext or details. That's not useful. I think insteadlogicalrep_message_type() should return "unknown" when it encountersan unknown message type and let the original error message be thrownas is.
Hmm. Good catch. The current behavior is:
ERROR: invalid logical replication message type "X"
LOG: background worker "logical replication worker" (PID 71800) exited with exit code 1
... that hides the details. After providing a default message type:
ERROR: invalid logical replication message type "X"
CONTEXT: processing remote data for replication origin "pg_16638" during message type "???" in transaction 796, finished at 0/16266F8
Masahiko, since abc0910e2e0 is your patch maybe you want to take a look at it.
Attachment
Thanks Euler for the patch. On Mon, Jul 3, 2023 at 6:32 PM Euler Taveira <euler@eulerto.com> wrote: > > Masahiko, since abc0910e2e0 is your patch maybe you want to take a look at it. > A couple of comments. -char * +const char * Nice improvement. logicalrep_message_type(LogicalRepMsgType action) { switch (action) @@ -1256,9 +1256,7 @@ logicalrep_message_type(LogicalRepMsgType action) return "STREAM ABORT"; case LOGICAL_REP_MSG_STREAM_PREPARE: return "STREAM PREPARE"; + default: + return "???"; } - - elog(ERROR, "invalid logical replication message type \"%c\"", action); - - return NULL; /* keep compiler quiet */ The switch is on action which is an enum. So without default it will provide a compilation warning for missing enums. Adding "default" case defeats that purpose. I think we should just return "???" from outside switch block. -- Best Wishes, Ashutosh Bapat
On Mon, Jul 3, 2023 at 6:52 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > Thanks Euler for the patch. > > On Mon, Jul 3, 2023 at 6:32 PM Euler Taveira <euler@eulerto.com> wrote: > > > > Masahiko, since abc0910e2e0 is your patch maybe you want to take a look at it. > > > > A couple of comments. > > -char * > +const char * > > Nice improvement. > > logicalrep_message_type(LogicalRepMsgType action) > { > switch (action) > @@ -1256,9 +1256,7 @@ logicalrep_message_type(LogicalRepMsgType action) > return "STREAM ABORT"; > case LOGICAL_REP_MSG_STREAM_PREPARE: > return "STREAM PREPARE"; > + default: > + return "???"; > } > - > - elog(ERROR, "invalid logical replication message type \"%c\"", action); > - > - return NULL; /* keep compiler quiet */ > > The switch is on action which is an enum. So without default it will > provide a compilation warning for missing enums. Adding "default" case > defeats that purpose. I think we should just return "???" from outside > switch block. > PFA patch. -- Best Wishes, Ashutosh Bapat
Attachment
On Mon, Jul 3, 2023, at 10:57 AM, Ashutosh Bapat wrote:
On Mon, Jul 3, 2023 at 6:52 PM Ashutosh Bapat<ashutosh.bapat.oss@gmail.com> wrote:>> The switch is on action which is an enum. So without default it will> provide a compilation warning for missing enums. Adding "default" case> defeats that purpose. I think we should just return "???" from outside> switch block.>
Yeah. I don't think any gcc -Wswitch options have effect if a default is used
so your suggestion is good for wrong/missing message types in the future.
PFA patch.
WFM.
On Mon, Jul 3, 2023 at 6:32 PM Euler Taveira <euler@eulerto.com> wrote: > > On Mon, Jul 3, 2023, at 7:30 AM, Ashutosh Bapat wrote: > > logicalrep_message_type() is used to convert logical message type code > into name while prepared error context or details. Thus when this > function is called probably an ERROR is already raised. If > logicalrep_message_type() gets an unknown message type, it will throw > an error, which will suppress the error for which we are building > context or details. That's not useful. I think instead > logicalrep_message_type() should return "unknown" when it encounters > an unknown message type and let the original error message be thrown > as is. > > > Hmm. Good catch. The current behavior is: > > ERROR: invalid logical replication message type "X" > LOG: background worker "logical replication worker" (PID 71800) exited with exit code 1 > > ... that hides the details. After providing a default message type: > > ERROR: invalid logical replication message type "X" > CONTEXT: processing remote data for replication origin "pg_16638" during message type "???" in transaction 796, finishedat 0/16266F8 > I think after returning "???" from logicalrep_message_type(), the above is possible when we get the error: "invalid logical replication message type "X"" from apply_dispatch(), right? If so, then what about the case when we forgot to handle some message in logicalrep_message_type() but handled it in apply_dispatch()? Isn't it better to return the 'action' from the function logicalrep_message_type() for unknown type? That way the information could be a bit better and we can easily catch the code bug as well. -- With Regards, Amit Kapila.
On 2023-Jul-05, Amit Kapila wrote: > I think after returning "???" from logicalrep_message_type(), the > above is possible when we get the error: "invalid logical replication > message type "X"" from apply_dispatch(), right? If so, then what about > the case when we forgot to handle some message in > logicalrep_message_type() but handled it in apply_dispatch()? Isn't it > better to return the 'action' from the function > logicalrep_message_type() for unknown type? That way the information > could be a bit better and we can easily catch the code bug as well. Are you suggesting that logicalrep_message_type should include the numerical value of 'action' in the ??? message? Something like this: ERROR: invalid logical replication message type "X" CONTEXT: processing remote data for replication origin "pg_16638" during message type "??? (123)" in transaction 796, finishedat 0/16266F8 I don't see why not -- seems easy enough, and might help somebody. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ Tom: There seems to be something broken here. Teodor: I'm in sackcloth and ashes... Fixed. http://postgr.es/m/482D1632.8010507@sigaev.ru
On Wed, Jul 5, 2023 at 4:26 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2023-Jul-05, Amit Kapila wrote: > > > I think after returning "???" from logicalrep_message_type(), the > > above is possible when we get the error: "invalid logical replication > > message type "X"" from apply_dispatch(), right? If so, then what about > > the case when we forgot to handle some message in > > logicalrep_message_type() but handled it in apply_dispatch()? Isn't it > > better to return the 'action' from the function > > logicalrep_message_type() for unknown type? That way the information > > could be a bit better and we can easily catch the code bug as well. > > Are you suggesting that logicalrep_message_type should include the > numerical value of 'action' in the ??? message? Something like this: > > ERROR: invalid logical replication message type "X" > CONTEXT: processing remote data for replication origin "pg_16638" during message type "??? (123)" in transaction 796,finished at 0/16266F8 > Yes, something like that would be better. -- With Regards, Amit Kapila.
On Wed, Jul 5, 2023, at 7:56 AM, Alvaro Herrera wrote:
On 2023-Jul-05, Amit Kapila wrote:> I think after returning "???" from logicalrep_message_type(), the> above is possible when we get the error: "invalid logical replication> message type "X"" from apply_dispatch(), right? If so, then what about> the case when we forgot to handle some message in> logicalrep_message_type() but handled it in apply_dispatch()? Isn't it> better to return the 'action' from the function> logicalrep_message_type() for unknown type? That way the information> could be a bit better and we can easily catch the code bug as well.Are you suggesting that logicalrep_message_type should include thenumerical value of 'action' in the ??? message? Something like this:ERROR: invalid logical replication message type "X"CONTEXT: processing remote data for replication origin "pg_16638" during message type "??? (123)" in transaction 796, finished at 0/16266F8
Isn't this numerical value already exposed in the error message (X = 88)?
In this example, it is:
ERROR: invalid logical replication message type "X"
CONTEXT: processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction 796, finished at 0/1626698
IMO it could be confusing if we provide two representations of the same data (X
and 88). Since we already provide "X" I don't think we need "88". Another
option, is to remove "X" from apply_dispatch() and add "??? (88)" to
logicalrep_message_type().
Opinions?
On Wed, Jul 5, 2023 at 7:07 PM Euler Taveira <euler@eulerto.com> wrote: > > On Wed, Jul 5, 2023, at 7:56 AM, Alvaro Herrera wrote: > > On 2023-Jul-05, Amit Kapila wrote: > > > I think after returning "???" from logicalrep_message_type(), the > > above is possible when we get the error: "invalid logical replication > > message type "X"" from apply_dispatch(), right? If so, then what about > > the case when we forgot to handle some message in > > logicalrep_message_type() but handled it in apply_dispatch()? apply_dispatch() has a default case in switch() so it can theoretically forget to handle some message type. I think we should avoid default case in that function to catch missing message type in that function. But if logicalrep_message_type() doesn't use default case, it won't forget to handle a known message type. So what you are suggesting is not possible. It might happen that the upstream may send an unknown message type that both apply_dispatch() and logicalrep_message_type() can not handle. > ERROR: invalid logical replication message type "X" > CONTEXT: processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction 796, finishedat 0/1626698 > > IMO it could be confusing if we provide two representations of the same data (X > and 88). Since we already provide "X" I don't think we need "88". Another > option, is to remove "X" from apply_dispatch() and add "??? (88)" to > logicalrep_message_type(). I think we don't need message type to be mentioned in the context for an error about invalid message type. I think what needs to be done is to set apply_error_callback_arg.command = 0 before calling ereport in the default case of apply_dispatch(). apply_error_callback() will just return without providing a context. If we need a context and have all the other necessary fields, we can improve apply_error_callback() to provide context when apply_error_callback_arg.command = 0 . -- Best Wishes, Ashutosh Bapat
On 2023-Jul-05, Euler Taveira wrote: > Isn't this numerical value already exposed in the error message (X = 88)? > In this example, it is: > > ERROR: invalid logical replication message type "X" > CONTEXT: processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction 796, finishedat 0/1626698 > > IMO it could be confusing if we provide two representations of the same data (X > and 88). Since we already provide "X" I don't think we need "88". Another > option, is to remove "X" from apply_dispatch() and add "??? (88)" to > logicalrep_message_type(). > > Opinions? The CONTEXT message could theoretically appear in other error throws, not just in "invalid logical replication message". So the duplicity is not really a problem. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Now I have my system running, not a byte was off the shelf; It rarely breaks and when it does I fix the code myself. It's stable, clean and elegant, and lightning fast as well, And it doesn't cost a nickel, so Bill Gates can go to hell."
On 2023-Jul-05, Alvaro Herrera wrote: > On 2023-Jul-05, Euler Taveira wrote: > > > Isn't this numerical value already exposed in the error message (X = 88)? > > In this example, it is: > > > > ERROR: invalid logical replication message type "X" > > CONTEXT: processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction 796,finished at 0/1626698 > > > > IMO it could be confusing if we provide two representations of the same data (X > > and 88). Since we already provide "X" I don't think we need "88". > > The CONTEXT message could theoretically appear in other error throws, > not just in "invalid logical replication message". So the duplicity is > not really a problem. Ah, but you're thinking that if the message type is invalid, then it will have been rejected in the "invalid logical replication message" stage, so no invalid message type will be reported. I guess there's a point to that argument as well. However, I don't see that having the numerical ASCII value there causes any harm, even if the char value is already exposed in the other message. (I'm sure you'll agree that this is quite a minor issue.) I doubt that each of these two prints of the enum value showing different formats is confusing. Yes, the enum is defined in terms of char literals, but if an actually invalid message shows up with an uint32 value outside the printable ASCII range, the printout might be ugly or useless. > > Another option, is to remove "X" from apply_dispatch() and add "??? > > (88)" to logicalrep_message_type(). Now *this* would be an actively bad idea IMO. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On Wed, Jul 5, 2023 at 10:45 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2023-Jul-05, Alvaro Herrera wrote: > > > On 2023-Jul-05, Euler Taveira wrote: > > > > > Isn't this numerical value already exposed in the error message (X = 88)? > > > In this example, it is: > > > > > > ERROR: invalid logical replication message type "X" > > > CONTEXT: processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction 796,finished at 0/1626698 > > > > > > IMO it could be confusing if we provide two representations of the same data (X > > > and 88). Since we already provide "X" I don't think we need "88". > > > > The CONTEXT message could theoretically appear in other error throws, > > not just in "invalid logical replication message". So the duplicity is > > not really a problem. > > Ah, but you're thinking that if the message type is invalid, then it > will have been rejected in the "invalid logical replication message" > stage, so no invalid message type will be reported. > Yeah, but it would still be displayed both in context and the actual message. > I guess there's a > point to that argument as well. > One point to note is that the user may also get confused if the actual ERROR says message type as 'X' and the context says '???'. I feel in this case duplicate information is better than different information. -- With Regards, Amit Kapila.
On Wed, Jul 5, 2023 at 7:54 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Wed, Jul 5, 2023 at 7:07 PM Euler Taveira <euler@eulerto.com> wrote: > > > > On Wed, Jul 5, 2023, at 7:56 AM, Alvaro Herrera wrote: > > > > On 2023-Jul-05, Amit Kapila wrote: > > > > > I think after returning "???" from logicalrep_message_type(), the > > > above is possible when we get the error: "invalid logical replication > > > message type "X"" from apply_dispatch(), right? If so, then what about > > > the case when we forgot to handle some message in > > > logicalrep_message_type() but handled it in apply_dispatch()? > > apply_dispatch() has a default case in switch() so it can > theoretically forget to handle some message type. I think we should > avoid default case in that function to catch missing message type in > that function. But if logicalrep_message_type() doesn't use default > case, it won't forget to handle a known message type. So what you are > suggesting is not possible. > Right, but still I feel it would be better to return actual action. > It might happen that the upstream may send an unknown message type > that both apply_dispatch() and logicalrep_message_type() can not > handle. > > > ERROR: invalid logical replication message type "X" > > CONTEXT: processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction 796,finished at 0/1626698 > > > > IMO it could be confusing if we provide two representations of the same data (X > > and 88). Since we already provide "X" I don't think we need "88". Another > > option, is to remove "X" from apply_dispatch() and add "??? (88)" to > > logicalrep_message_type(). > > I think we don't need message type to be mentioned in the context for > an error about invalid message type. I think what needs to be done is > to set > apply_error_callback_arg.command = 0 before calling ereport in the > default case of apply_dispatch(). > apply_error_callback() will just return without providing a context. > Hmm, this looks a bit hacky to me. -- With Regards, Amit Kapila.
On Thu, Jul 6, 2023 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > One point to note is that the user may also get confused if the actual > ERROR says message type as 'X' and the context says '???'. I feel in > this case duplicate information is better than different information. I agree. I think it would be better to show the same string like: ERROR: invalid logical replication message type "??? (88)" CONTEXT: processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction 796, finished at 0/1626698 Since the numerical value is important only in invalid message type cases, how about using a format like "??? (88)" in unknown message type cases, in both error and context messages? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On 2023-Jul-11, Masahiko Sawada wrote: > Since the numerical value is important only in invalid message type > cases, how about using a format like "??? (88)" in unknown message > type cases, in both error and context messages? +1 -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Las navajas y los monos deben estar siempre distantes" (Germán Poo)
On Tue, Jul 11, 2023 at 1:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Jul 6, 2023 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > One point to note is that the user may also get confused if the actual > > ERROR says message type as 'X' and the context says '???'. I feel in > > this case duplicate information is better than different information. > > I agree. I think it would be better to show the same string like: > > ERROR: invalid logical replication message type "??? (88)" > CONTEXT: processing remote data for replication origin "pg_16638" > during message type "??? (88)" in transaction 796, finished at > 0/1626698 > > Since the numerical value is important only in invalid message type > cases, how about using a format like "??? (88)" in unknown message > type cases, in both error and context messages? > Do you have something like attached in mind? -- With Regards, Amit Kapila.
Attachment
On Sat, Jul 15, 2023, at 4:27 AM, Amit Kapila wrote:
Do you have something like attached in mind?
WFM. I would change the comment that says
This function is called to provide context in the error ...
to
This message provides context in the error ...
because this comment is not at the beginning of the function but *before* the
message.
On Sat, Jul 15, 2023 at 7:16 PM Euler Taveira <euler@eulerto.com> wrote: > > On Sat, Jul 15, 2023, at 4:27 AM, Amit Kapila wrote: > > Do you have something like attached in mind? > > > WFM. I would change the comment that says > > This function is called to provide context in the error ... > > to > > This message provides context in the error ... > > because this comment is not at the beginning of the function but *before* the > message. > Sounds reasonable to me. I'll make this modification before pushing unless there are more comments/suggestions. -- With Regards, Amit Kapila.
On Sat, Jul 15, 2023 at 12:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Since the numerical value is important only in invalid message type > > cases, how about using a format like "??? (88)" in unknown message > > type cases, in both error and context messages? > > > > Do you have something like attached in mind? Prologue of psprintf() says * Errors are not returned to the caller, but are reported via elog(ERROR) * in the backend, or printf-to-stderr-and-exit() in frontend builds. * One should therefore think twice about using this in libpq. If an error occurs in psprintf(), it will throw an error which will override the original error. I think we should avoid any stuff that throws further errors. -- Best Wishes, Ashutosh Bapat
On 2023-Jul-17, Ashutosh Bapat wrote: > Prologue of psprintf() says > > * Errors are not returned to the caller, but are reported via elog(ERROR) > * in the backend, or printf-to-stderr-and-exit() in frontend builds. > * One should therefore think twice about using this in libpq. > > If an error occurs in psprintf(), it will throw an error which will > override the original error. I think we should avoid any stuff that > throws further errors. Ooh, yeah, this is an excellent point. I agree it would be better to avoid psprintf() here and anything that adds more failure modes. Let's just do the thing in the original patch you submitted, to ensure all these strings are compile-time constants; that's likely the most robust. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Pensar que el espectro que vemos es ilusorio no lo despoja de espanto, sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)
On Mon, Jul 17, 2023 at 7:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2023-Jul-17, Ashutosh Bapat wrote: > > > Prologue of psprintf() says > > > > * Errors are not returned to the caller, but are reported via elog(ERROR) > > * in the backend, or printf-to-stderr-and-exit() in frontend builds. > > * One should therefore think twice about using this in libpq. > > > > If an error occurs in psprintf(), it will throw an error which will > > override the original error. I think we should avoid any stuff that > > throws further errors. > > Ooh, yeah, this is an excellent point. I agree it would be better to > avoid psprintf() here and anything that adds more failure modes. > I have tried to check whether we have such usage in any other error callbacks. Though I haven't scrutinized each and every error callback, I found a few of them where an error can be raised. For example, rm_redo_error_callback()->initStringInfo() CopyFromErrorCallback()->limit_printout_length() shared_buffer_write_error_callback()->relpathperm()->relpathbackend()->GetRelationPath()->psprintf() > Let's > just do the thing in the original patch you submitted, to ensure all > these strings are compile-time constants; that's likely the most robust. > So will we be okay with something like the below: ERROR: invalid logical replication message type "??? (88)" CONTEXT: processing remote data for replication origin "pg_16638" during message type "???" in transaction 796, finished at 0/1626698 -- With Regards, Amit Kapila.
On Tue, Jul 18, 2023 at 12:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Jul 17, 2023 at 7:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > On 2023-Jul-17, Ashutosh Bapat wrote: > > > > > Prologue of psprintf() says > > > > > > * Errors are not returned to the caller, but are reported via elog(ERROR) > > > * in the backend, or printf-to-stderr-and-exit() in frontend builds. > > > * One should therefore think twice about using this in libpq. > > > > > > If an error occurs in psprintf(), it will throw an error which will > > > override the original error. I think we should avoid any stuff that > > > throws further errors. > > > > Ooh, yeah, this is an excellent point. I agree it would be better to > > avoid psprintf() here and anything that adds more failure modes. > > > > I have tried to check whether we have such usage in any other error > callbacks. Though I haven't scrutinized each and every error callback, > I found a few of them where an error can be raised. For example, > > rm_redo_error_callback()->initStringInfo() > CopyFromErrorCallback()->limit_printout_length() > shared_buffer_write_error_callback()->relpathperm()->relpathbackend()->GetRelationPath()->psprintf() > > > Let's > > just do the thing in the original patch you submitted, to ensure all > > these strings are compile-time constants; that's likely the most robust. > > Or can we use snprintf() writing "??? (%d)" to a fixed length char[8 + 11] allocated on the stack instead? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Tue, Jul 18, 2023 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Tue, Jul 18, 2023 at 12:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Jul 17, 2023 at 7:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > > > > I have tried to check whether we have such usage in any other error > > callbacks. Though I haven't scrutinized each and every error callback, > > I found a few of them where an error can be raised. For example, > > > > rm_redo_error_callback()->initStringInfo() > > CopyFromErrorCallback()->limit_printout_length() > > shared_buffer_write_error_callback()->relpathperm()->relpathbackend()->GetRelationPath()->psprintf() > > > > > Let's > > > just do the thing in the original patch you submitted, to ensure all > > > these strings are compile-time constants; that's likely the most robust. > > > > > Or can we use snprintf() writing "??? (%d)" to a fixed length char[8 + > 11] allocated on the stack instead? > In the above size calculation, shouldn't it be 7 + 11 where 7 is for (3 (???) + 1 for space + 2 for () + 1 for terminating null char) and 11 is for %d? BTW, this avoids dynamic allocation of the err string in logicalrep_message_type() but we can't return a locally allocated string, so do you think we should change the prototype of the function to get this as an argument and then use it both for valid and invalid cases? I think if there is some simpler way to achieve this then fine, otherwise, let's return a constant string like "???" from logicalrep_message_type() for the invalid action. -- With Regards, Amit Kapila.
On Wed, Jul 19, 2023 at 9:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Jul 18, 2023 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Tue, Jul 18, 2023 at 12:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Jul 17, 2023 at 7:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > > > > > > > I have tried to check whether we have such usage in any other error > > > callbacks. Though I haven't scrutinized each and every error callback, > > > I found a few of them where an error can be raised. For example, > > > > > > rm_redo_error_callback()->initStringInfo() > > > CopyFromErrorCallback()->limit_printout_length() > > > shared_buffer_write_error_callback()->relpathperm()->relpathbackend()->GetRelationPath()->psprintf() > > > > > > > Let's > > > > just do the thing in the original patch you submitted, to ensure all > > > > these strings are compile-time constants; that's likely the most robust. > > > > > > > > Or can we use snprintf() writing "??? (%d)" to a fixed length char[8 + > > 11] allocated on the stack instead? > > > > In the above size calculation, shouldn't it be 7 + 11 where 7 is for > (3 (???) + 1 for space + 2 for () + 1 for terminating null char) and > 11 is for %d? BTW, this avoids dynamic allocation of the err string in > logicalrep_message_type() but we can't return a locally allocated > string, so do you think we should change the prototype of the function > to get this as an argument and then use it both for valid and invalid > cases? There are other places in the code which do something similar by using statically allocated buffers like static char xya[SIZE]. We could do that here. The caller may decide whether to pstrdup this buffer further or just use it one time e.g. as an elog or printf argument. As I said before, we should not even print message type in the error context because it's unknown. Repeating that twice is useless. That will need some changes to apply_error_callback() though. But I am fine with "???" as well. -- Best Wishes, Ashutosh Bapat
On Wed, Jul 19, 2023 at 10:08 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Wed, Jul 19, 2023 at 9:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Jul 18, 2023 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > Or can we use snprintf() writing "??? (%d)" to a fixed length char[8 + > > > 11] allocated on the stack instead? > > > > > > > In the above size calculation, shouldn't it be 7 + 11 where 7 is for > > (3 (???) + 1 for space + 2 for () + 1 for terminating null char) and > > 11 is for %d? BTW, this avoids dynamic allocation of the err string in > > logicalrep_message_type() but we can't return a locally allocated > > string, so do you think we should change the prototype of the function > > to get this as an argument and then use it both for valid and invalid > > cases? > > There are other places in the code which do something similar by using > statically allocated buffers like static char xya[SIZE]. We could do > that here. The caller may decide whether to pstrdup this buffer > further or just use it one time e.g. as an elog or printf argument. > Okay, changed it accordingly. Currently, we call it only from errcontext, so it looks reasonable to me to use static here. > As I said before, we should not even print message type in the error > context because it's unknown. Repeating that twice is useless. That > will need some changes to apply_error_callback() though. > But I am fine with "???" as well. > I think in the end it won't make a big difference. So, I would like to go with Sawada-San's suggestion to keep the message type consistent in actual error and error context unless that requires bigger changes. -- With Regards, Amit Kapila.
On Thu, Jul 20, 2023 at 9:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Jul 19, 2023 at 10:08 AM Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: > > > > On Wed, Jul 19, 2023 at 9:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Tue, Jul 18, 2023 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > Or can we use snprintf() writing "??? (%d)" to a fixed length char[8 + > > > > 11] allocated on the stack instead? > > > > > > > > > > In the above size calculation, shouldn't it be 7 + 11 where 7 is for > > > (3 (???) + 1 for space + 2 for () + 1 for terminating null char) and > > > 11 is for %d? BTW, this avoids dynamic allocation of the err string in > > > logicalrep_message_type() but we can't return a locally allocated > > > string, so do you think we should change the prototype of the function > > > to get this as an argument and then use it both for valid and invalid > > > cases? > > > > There are other places in the code which do something similar by using > > statically allocated buffers like static char xya[SIZE]. We could do > > that here. The caller may decide whether to pstrdup this buffer > > further or just use it one time e.g. as an elog or printf argument. > > > > Okay, changed it accordingly. > oops, forgot to attach the patch. -- With Regards, Amit Kapila.
Attachment
On Thu, Jul 20, 2023 at 9:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Okay, changed it accordingly. > > > > oops, forgot to attach the patch. > WFM @@ -3367,7 +3367,7 @@ apply_dispatch(StringInfo s) default: ereport(ERROR, (errcode(ERRCODE_PROTOCOL_VIOLATION), - errmsg("invalid logical replication message type \"%c\"", action))); + errmsg("invalid logical replication message type \"??? (%d)\"", action))); I think we should report character here since that's what is visible in the code and also the message types are communicated as characters not integers. Makes debugging easier. Context may report integer as an additional data point. -- Best Wishes, Ashutosh Bapat
On Fri, Jul 21, 2023 at 1:39 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Thu, Jul 20, 2023 at 9:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Okay, changed it accordingly. > > > > > > > oops, forgot to attach the patch. > > > > WFM > > @@ -3367,7 +3367,7 @@ apply_dispatch(StringInfo s) > default: > ereport(ERROR, > (errcode(ERRCODE_PROTOCOL_VIOLATION), > - errmsg("invalid logical replication message type > \"%c\"", action))); > + errmsg("invalid logical replication message type > \"??? (%d)\"", action))); > > I think we should report character here since that's what is visible > in the code and also the message types are communicated as characters > not integers. Makes debugging easier. Context may report integer as an > additional data point. I think it could confuse us if an invalid message is not a printable character. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Fri, Jul 21, 2023 at 6:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Jul 21, 2023 at 1:39 AM Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: > > > > On Thu, Jul 20, 2023 at 9:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > Okay, changed it accordingly. > > > > > > > > > > oops, forgot to attach the patch. > > > > > > > WFM > > > > @@ -3367,7 +3367,7 @@ apply_dispatch(StringInfo s) > > default: > > ereport(ERROR, > > (errcode(ERRCODE_PROTOCOL_VIOLATION), > > - errmsg("invalid logical replication message type > > \"%c\"", action))); > > + errmsg("invalid logical replication message type > > \"??? (%d)\"", action))); > > > > I think we should report character here since that's what is visible > > in the code and also the message types are communicated as characters > > not integers. Makes debugging easier. Context may report integer as an > > additional data point. > > I think it could confuse us if an invalid message is not a printable character. > Right. I'll push and backpatch this till 15 by Tuesday unless you guys think otherwise. -- With Regards, Amit Kapila.
On Sat, Jul 22, 2023 at 10:18 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > I think it could confuse us if an invalid message is not a printable character. > > > That's a good point. > Right. I'll push and backpatch this till 15 by Tuesday unless you guys > think otherwise. WFM. -- Best Wishes, Ashutosh Bapat
On Mon, Jul 24, 2023 at 6:14 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Sat, Jul 22, 2023 at 10:18 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Right. I'll push and backpatch this till 15 by Tuesday unless you guys > > think otherwise. > > WFM. > Pushed. -- With Regards, Amit Kapila.