Re: Suggestion to add --continue-client-on-abort option to pgbench - Mailing list pgsql-hackers

From Rintaro Ikeda
Subject Re: Suggestion to add --continue-client-on-abort option to pgbench
Date
Msg-id d9d70aef-837b-4964-a160-a2cdb0c27845@oss.nttdata.com
Whole thread Raw
In response to Re: Suggestion to add --continue-client-on-abort option to pgbench  (Yugo Nagata <nagata@sraoss.co.jp>)
List pgsql-hackers
Hi,

On 2025/07/10 18:17, Yugo Nagata wrote:
> On Wed, 9 Jul 2025 23:58:32 +0900
> Rintaro Ikeda <ikedarintarof@oss.nttdata.com> wrote:
>
>> Hi,
>>
>> Thank you for the kind comments.
>>
>> I've updated the previous patch.
>
> Thank you for updating the patch!
>
>>> However, if a large number of errors occur, this could result in a significant increase
>>> in stderr output during the benchmark.
>>>
>>> Users can still notice that something went wrong by checking the “number of other failures”
>>> reported after the run, and I assume that in most cases, when --continue-on-error is enabled,
>>> users aren’t particularly interested in seeing individual error messages as they happen.
>>>
>>> It’s true that seeing error messages during the benchmark might be useful in some cases, but
>>> the same could be said for serialization or deadlock errors, and that’s exactly what the
>>> --verbose-errors option is for.
>>
>>
>> I understand your concern. The condition for calling pg_log_error() was modified
>> to reduce stderr output.
>> Additionally, an error message was added for cases where some clients aborted
>> while executing SQL commands, similar to other code paths that transition to
>> st->state = CSTATE_ABORTED, as shown in the example below:
>>
>> ```
>>                         pg_log_error("client %d aborted while establishing connection", st->id);
>>                         st->state = CSTATE_ABORTED;
>> ```
>
>             default:
>                 /* anything else is unexpected */
> -               pg_log_error("client %d script %d aborted in command %d query %d: %s",
> -                            st->id, st->use_file, st->command, qrynum,
> -                            PQerrorMessage(st->con));
> +               if (verbose_errors)
> +                   pg_log_error("client %d script %d aborted in command %d query %d: %s",
> +                                st->id, st->use_file, st->command, qrynum,
> +                                PQerrorMessage(st->con));
>                 goto error;
>         }
>
> Thanks to this fix, error messages caused by SQL errors are now output only when
> --verbose-errors is enable. However, the comment describes the condition as "unexpected",
> and the message states that the client was "aborted". This does not seems accurate, since
> clients are not aborted due to SQL errors when --continue-on-errors is enabled.
>
> I think the error message should be emitted using commandError() when both
> --coneinut-on-errors and --verbose-errors are specified, like this;
>
>             case PGRES_NONFATAL_ERROR:
>             case PGRES_FATAL_ERROR:
>                 st->estatus = getSQLErrorStatus(PQresultErrorField(res,
>                                                                    PG_DIAG_SQLSTATE));
>                 if (continue_on_error || canRetryError(st->estatus))
>                 {
>                     if (verbose_errors)
>                         commandError(st, PQerrorMessage(st->con));
>                     goto error;
>                 }
>                 /* fall through */
>
> In addition, the error message in the "default" case should be shown regardless
> of the --verbose-errors since it represents an unexpected situation and should
> always reported.
>
> Finally, I believe this fix should be included in patch 0001 rather than 0003,
> as it would be a part of the implementation of --continiue-on-error.
>
>
> As of 0003:
>
> +               {
> +                   pg_log_error("client %d aborted while executing SQL commands", st->id);
>                     st->state = CSTATE_ABORTED;
> +               }
>                 break;
>
> I understand that the patch is not directly related to --continue-on-error, similar to 0002,
> and that it aims to improve the error message to indicate that the client was aborted due to
> some error during readCommandResponse().
>
> However, this message doesn't seem entirely accurate, since the error is not always caused
> by an SQL command failure itself. For example, it could also be due to a failure of the \gset
> meta-command.
>
> In addition, this fix causes error messages to be emitted twice. For example, if \gset fails,
> the following similar messages are printed:
>
>  pgbench: error: client 0 script 0 command 0 query 0: expected one row, got 0
>  pgbench: error: client 0 aborted while executing SQL commands
>
> Even worse, if an unexpected error occurs in readCommandResponse() (i.e., the default case),
> the following messages are emitted, both indicating that the client was aborted;
>
>  pgbench: error: client 0 script 0 aborted in command ... query ...
>  pgbench: error: client 0 aborted while executing SQL commands
>
> I feel this is a bit redundant.
>
> Therefore, if we are to improve these messages to indicate explicitly that the client
> was aborted, I would suggest modifying the error messages in readCommandResponse() rather
> than adding a new one in advanceConnectionState().
>
> I've attached patch 0003 incorporating my suggestion. What do you think?

Thank you very much for the updated patch!

I reviewed 0003 and it looks great - the error message become easier to understand.

I noticed one small thing I’d like to discuss. I'm not sure that users clearly
understand which aborted in the following error message, the client or the script.
> pgbench: error: client 0 script 0 aborted in command ... query ...

Since the code path always results in a client abort, I wonder if the following
message might be clearer:
> pgbench: error: client 0 aborted in script 0 command ... query ...


Regards,
Rintaro Ikeda




pgsql-hackers by date:

Previous
From: Sergey Fukanchik
Date:
Subject: Re: [PATCH] replace float8 with int in date2isoweek() and date2isoyear()
Next
From: Marcos Pegoraro
Date:
Subject: Re: Bug on drop extension dependencies ?