Parallel pg_dump's error reporting doesn't work worth squat - Mailing list pgsql-hackers

From Tom Lane
Subject Parallel pg_dump's error reporting doesn't work worth squat
Msg-id 2458.1450894615@sss.pgh.pa.us
List pgsql-hackers
I was in the process of testing the proposed patch for bug #13727,
and I found that at least on my Linux box, this is the behavior
in the failure case without the patch:

$ pg_dump "postgres://postgres:phonypassword@localhost/regression" --jobs=9 -Fd -f testdump
$ echo $?
141
$ ls testdump
toc.dat

That is, the pg_dump process has crashed with a SIGPIPE without printing
any message whatsoever, and without coming anywhere near finishing the
dump.
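
For reference, the 141 above matches the usual shell convention of
reporting 128 plus the signal number when a process is killed by a signal;
SIGPIPE is signal 13 on Linux. A minimal sketch in Python (not part of
pg_dump; the helper name describe_exit_status is made up for illustration,
and a program could in principle also exit with a code above 128 on its
own) that decodes such a status:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-reported exit status (the value of $?).

    Assumes the common shell convention that a status above 128
    means "killed by signal number (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"


if __name__ == "__main__":
    print(describe_exit_status(141))
```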

A bit of investigation says that this is because somebody had the bright
idea that worker processes could report fatal errors back to the master
process instead of just printing them to stderr.  So when the workers
fail to establish connections (because of the password problem cited in
#13727), they don't tell me about that.  Oh no, they send those errors
back up the pipe to the parent, and then die silently.  Meanwhile,
the parent is trying to send them commands, and since it doesn't protect
itself against SIGPIPE on the command pipes, it crashes without ever
reporting anything.  If you aren't paying close attention, you won't
even realize you didn't get a completed dump.
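
The failure mode described above can be reproduced in isolation. The
following is a rough sketch in Python, not pg_dump's actual code: a forked
process stands in for the pg_dump parent and writes to a pipe whose read
end is already closed, which stands in for a dead worker. The function
name write_to_dead_reader and the command text are invented for the
example. With the default SIGPIPE disposition the writer is killed without
printing anything; with SIGPIPE ignored, the write fails with EPIPE
instead, so the writer gets a chance to report the problem.

```python
import os
import signal


def write_to_dead_reader(ignore_sigpipe: bool) -> str:
    """Fork a writer whose pipe has no reader and report how it ended."""
    read_fd, write_fd = os.pipe()
    # Close the read end before forking so that no process holds it:
    # this models a worker that has already died.
    os.close(read_fd)

    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the process sending commands.
        disposition = signal.SIG_IGN if ignore_sigpipe else signal.SIG_DFL
        signal.signal(signal.SIGPIPE, disposition)
        try:
            os.write(write_fd, b"hypothetical command\n")
        except BrokenPipeError:
            # Only reachable when SIGPIPE is ignored: write() fails with
            # EPIPE, so an error can be printed before exiting.
            os.write(2, b"error: worker went away (EPIPE)\n")
            os._exit(1)
        os._exit(0)

    # Original process: wait for the writer and decode its fate.
    os.close(write_fd)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return f"killed by {signal.Signals(os.WTERMSIG(status)).name}"
    return f"exited with code {os.WEXITSTATUS(status)}"


if __name__ == "__main__":
    print(write_to_dead_reader(ignore_sigpipe=False))
    print(write_to_dead_reader(ignore_sigpipe=True))
```

Ignoring SIGPIPE only turns the silent death into a reportable error; it
does not address the ordering and deadlock concerns raised below.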

Depending on timing, this scheme might accidentally fail to fail, but it
seems fragile as can be.  I would bet that it's prone to deadlocks, quite
aside from the SIGPIPE problem.  Considering how amazingly ugly the
underlying code is (exit_horribly is in parallel.c now? Really?), I want
to rip it out entirely, not try to band-aid it by suppressing SIGPIPE ---
though likely we need to do that too.

Thoughts?
        regards, tom lane


