Re: Trap errors from streaming child in pg_basebackup to exit early - Mailing list pgsql-hackers

From	Daniel Gustafsson
Subject	Re: Trap errors from streaming child in pg_basebackup to exit early
Date	September 3, 2021 12:53:01
Msg-id	AC3D81D5-766E-4894-B429-912F8257BE9E@yesql.se
In response to	Re: Trap errors from streaming child in pg_basebackup to exit early (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses	Re: Trap errors from streaming child in pg_basebackup to exit early Re: Trap errors from streaming child in pg_basebackup to exit early
List	pgsql-hackers

> On 1 Sep 2021, at 12:28, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Wed, Sep 1, 2021 at 1:56 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>> A v2 with the above fixes is attached.
>
> Thanks for the updated patch. Here are some comments:
>
> 1) Do we need to set bgchild = -1 before the exit(1); in the code
> below so that we don't kill(bgchild, SIGTERM); unnecessarily in
> kill_bgchild_atexit?

Good point. We can also inspect bgchild_exited in kill_bgchild_atexit.
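
To illustrate that check, here is a minimal sketch and not the code from the attached patch. The names bgchild, bgchild_exited and kill_bgchild_atexit come from this thread; the helper bgchild_needs_kill, the function install_bgchild_handlers, and the assumption that the flag is set from a SIGCHLD handler are hypothetical and only there to make the example self-contained (Unix only).

```c
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>

/* Names mirror the thread; the bodies are illustrative, not the patch. */
static pid_t bgchild = -1;
static volatile sig_atomic_t bgchild_exited = 0;

/* Hypothetical helper: is there a child that may still be running? */
static bool
bgchild_needs_kill(pid_t child, int exited)
{
	return child > 0 && !exited;
}

/* Assumed SIGCHLD handler: only records that the child is gone. */
static void
sigchld_handler(int signo)
{
	(void) signo;
	bgchild_exited = 1;
}

/* atexit callback: skip kill() when the child has already exited. */
static void
kill_bgchild_atexit(void)
{
	if (bgchild_needs_kill(bgchild, bgchild_exited))
		kill(bgchild, SIGTERM);
}

/* Hypothetical setup, to be called once before forking the child. */
void
install_bgchild_handlers(void)
{
	signal(SIGCHLD, sigchld_handler);
	atexit(kill_bgchild_atexit);
}
```

With the flag checked in the callback, resetting bgchild to -1 before exit(1) becomes optional: either condition alone is enough to avoid signalling a process that is already gone.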

> 2) Missing "," after "On Windows, we use a ....."
> + * that time. On Windows we use a background thread which can communicate
>
> 3) How about "/* Flag to indicate whether or not child process exited
> */" instead of +/* State of child process */?

Fixed.

> 4) Instead of just exiting from the main pg_basebackup process when
> the child WAL receiver dies, can't we think of restarting the child
> process, probably with the WAL streaming position where it left off or
> stream from the beginning? This way, the work that the main
> pg_basebackup has done so far doesn't get wasted. I'm not sure if this
> affects the pg_basebackup functionality. We can restart the child
> process for 1 or 2 times, if it still dies, we can kill the main
> pg_basebackup process too. Thoughts?

I was toying with the idea, but I ended up not pursuing it.  This error is well
into the “really shouldn’t happen, but can” territory and it’s quite likely
that some level of manual intervention is required to make it successfully
restart.  I’m not convinced that adding complicated logic to restart (and even
more complicated tests to simulate and test it) will be worthwhile.

--
Daniel Gustafsson        https://vmware.com/

Attachment

v3-0001-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patch
