Trap errors from streaming child in pg_basebackup to exit early - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Trap errors from streaming child in pg_basebackup to exit early
Date
Msg-id 0F69E282-97F9-4DB7-8D6D-F927AA6340C8@yesql.se
List pgsql-hackers
When using pg_basebackup with WAL streaming (-X stream), we have observed a
number of times in production that the streaming child exited prematurely
(through no fault of the code it seems, most likely due to network
middleboxes), which causes the backup to fail, but only after it has run to
completion.  On long-running backups this can waste a lot of time before the
failure is noticed.

By trapping the failure of the streaming process we can instead exit early to
allow the user to fix and/or restart the process.

The attached patch adds a SIGCHLD handler on Unix, and catches the return value
from the Windows thread, in order to break out early from the main loop.  It
still needs a test, and proper testing on Windows, but early feedback on the
approach would be appreciated.
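
To make the Unix half of the idea concrete, here is a minimal standalone sketch.
It is not the attached patch and uses no pg_basebackup code.  The names
child_exited, sigchld_handler and backup_loop are hypothetical, and the forked
child simply exits at once to simulate a streaming process that dies early.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Set by the signal handler, polled by the main loop (hypothetical name). */
static volatile sig_atomic_t child_exited = 0;

/* Only sets a flag; nothing else is done inside the handler. */
static void
sigchld_handler(int signo)
{
	(void) signo;
	child_exited = 1;
}

/*
 * Hypothetical stand-in for the main backup loop.  Forks a child that exits
 * immediately with child_exit_code, then performs up to max_steps units of
 * simulated work, stopping as soon as the child is known to be gone.
 * Returns the number of completed steps, or -1 on setup failure.
 */
static int
backup_loop(int max_steps, int child_exit_code, int *child_status)
{
	struct sigaction sa;
	struct timespec unit_of_work = {0, 10 * 1000 * 1000};	/* 10 ms */
	pid_t		child;
	int			steps = 0;

	child_exited = 0;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sigchld_handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_NOCLDSTOP;		/* ignore stop/continue notifications */
	if (sigaction(SIGCHLD, &sa, NULL) != 0)
		return -1;

	child = fork();
	if (child < 0)
		return -1;
	if (child == 0)
		_exit(child_exit_code);		/* simulated premature child exit */

	while (steps < max_steps)
	{
		if (child_exited)
			break;					/* exit early instead of finishing */

		/* Simulated work; may return early if interrupted by SIGCHLD. */
		nanosleep(&unit_of_work, NULL);
		steps++;
	}

	/* Reap the child and hand its exit status back to the caller. */
	if (waitpid(child, child_status, 0) < 0)
		return -1;

	return steps;
}
```

Calling backup_loop(500, 1, &status) should return well before 500 steps have
run, with status reporting the child's exit code.  The handler only sets a
flag and the loop does the actual bail-out, which keeps the handler
async-signal-safe.  The Windows thread case is not covered by this sketch.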

--
Daniel Gustafsson        https://vmware.com/


