Re: [HACKERS] pg_basebackup behavior on non-existent slot - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: [HACKERS] pg_basebackup behavior on non-existent slot
Date
Msg-id CAMkU=1xuF8mT30P+3CCx9iMpGpQgxSQpCDx2vgb-YFcHkhxNEw@mail.gmail.com
In response to Re: [HACKERS] pg_basebackup behavior on non-existent slot  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: [HACKERS] pg_basebackup behavior on non-existent slot
List pgsql-hackers
On Wed, Sep 6, 2017 at 2:50 AM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> Magnus Hagander wrote:
> > On Mon, Sep 4, 2017 at 3:21 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> >
> > > Should the parent process of pg_basebackup be made to respond to SIGCHLD?
> > > Or call waitpid(bgchild, &status, WNOHANG) in some strategic loop?
> >
> > I think it's ok to just call waitpid() -- we don't need to react super
> > quickly, but we should react.
>
> Hmm, not sure about that ... in the normal case (slotname is correct)
> you'd be doing thousands of useless waitpid() system calls during the
> whole operation, no?  I think it'd be better to have a SIGCHLD handler
> that sets a flag (just once), which can be quickly checked without
> accessing kernel space.

If we don't want to poll with waitpid(), then my next thought would be to move the data copy into another process and have the main process do nothing but wait for the first child to exit.  If the first to exit is the WAL receiver, then there must have been an error, and the data receiver can be killed.  I don't know how to translate that to Windows, however.
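
A minimal sketch of that idea, assuming a POSIX platform and using made-up names (spawn, wait_for_first_exit, and the two demo_* stand-ins for the real WAL receiver and data copy), could look like this. It is only meant to show the control flow, not how pg_basebackup would actually be restructured.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a WAL receiver that fails right away. */
static int
demo_exit_failure(void)
{
    return 1;
}

/* Hypothetical stand-in for a long-running data copy. */
static int
demo_long_running(void)
{
    sleep(30);
    return 0;
}

/* Fork a child that runs fn() and exits with its return value. */
static pid_t
spawn(int (*fn) (void))
{
    pid_t       pid = fork();

    if (pid == 0)
        _exit(fn());
    return pid;                 /* -1 if fork() failed */
}

/*
 * Block until one of the children exits.  If the WAL receiver goes
 * first, treat that as an error: terminate and reap the data child and
 * return false.  Otherwise return true only if the data child exited
 * with status 0; the WAL receiver is left for the caller to stop.
 */
static bool
wait_for_first_exit(pid_t wal_child, pid_t data_child)
{
    int         status = 0;
    pid_t       first;

    do
        first = wait(&status);
    while (first == -1 && errno == EINTR);

    if (first == wal_child)
    {
        kill(data_child, SIGTERM);
        waitpid(data_child, NULL, 0);
        return false;
    }
    return first == data_child &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The parent blocks in wait(), so there is no polling at all; the cost is an extra process and having to report the data child's progress and errors back to the parent.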

Cheers,

Jeff
