Re: Suggestion to add --continue-client-on-abort option to pgbench - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Suggestion to add --continue-client-on-abort option to pgbench
Date
Msg-id CAHGQGwHSUuMr0ieKv++EreK_k=VGK9_aMLVqWMNbVosdd6vsxQ@mail.gmail.com
Whole thread Raw
In response to Re: Suggestion to add --continue-client-on-abort option to pgbench  (Rintaro Ikeda <ikedarintarof@oss.nttdata.com>)
List pgsql-hackers
On Sun, Oct 19, 2025 at 10:12 PM Rintaro Ikeda
<ikedarintarof@oss.nttdata.com> wrote:
>
> Hi,
>
> On 2025/10/02 1:22, Fujii Masao wrote:
> > Regarding 0002:
> >
> > - if (canRetryError(st->estatus))
> > + if (continue_on_error || canRetryError(st->estatus))
> >   {
> >   if (verbose_errors)
> >   commandError(st, PQresultErrorMessage(res));
> >   goto error;
> >
> > With this change, even non-SQL errors (e.g., connection failures) would
> > satisfy the condition when --continue-on-error is set. Isn't that a problem?
> > Shouldn't we also check that the error status is one that
> > --continue-on-error is meant to handle?
>
> I agree that connection failures should not be ignored even when
> --continue-on-error is specified.
> For now, I’m not sure if other cases would cause issues, so the updated patch
> explicitly checks the connection status and emits an error message when the
> connection is lost.

I agree that connection failures should prevent further processing even with
--continue-on-error, and pgbench should focus on handling that first.
However, the patch doesn't seem to handle cases where the connection is
terminated by an admin (e.g., via pg_terminate_backend()) correctly.
Please see the following test case, which is the same one I shared earlier:

-----------------------------------------
$ cat pipeline.sql
\startpipeline
DO $$
  BEGIN
    PERFORM pg_sleep(3);
    PERFORM pg_terminate_backend(pg_backend_pid());
  END $$;
\endpipeline

$ pgbench -n -f pipeline.sql -c 2 -t 4 -M extended --continue-on-error
-----------------------------------------

In this case, PQstatus() (added in readCommandResponse() by the patch)
still returns CONNECTION_OK (BTW, the SQLSTATE is 57P01 in this case).
As a result, the expected error message like “client ... script ... aborted
in command ...” isn't reported. So the PQstatus() check alone that
the patch added doesn't fully fix the issue.

Regards,

--
Fujii Masao



pgsql-hackers by date:

Previous
From: Mihail Nikalayeu
Date:
Subject: Re: Fix race condition in SSI when reading PredXact->SxactGlobalXmin
Next
From: Sami Imseih
Date:
Subject: Re: Skip unregistered custom kinds on stats load