Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT - Mailing list pgsql-bugs
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT |
Date | |
Msg-id | 20181106.125304.193575333.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
In response to | Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>) |
Responses |
Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT
Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT |
List | pgsql-bugs |
At Fri, 02 Nov 2018 22:05:36 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in <5BDC4BA0.7050106@lab.ntt.co.jp> > (2018/10/29 15:58), Kyotaro HORIGUCHI wrote: > > At Tue, 23 Oct 2018 13:21:31 +0100, Tom Lane<tgl@sss.pgh.pa.us> wrote > > in<18397.1540297291@sss.pgh.pa.us> > >> After a bit of thought, the problem here is blindingly obvious: > >> we generally run the backend with SIGPIPE handing set to SIG_IGN, > >> and evidently popen() allows the called program to inherit that, > >> at least on these platforms. > >> > >> So what we need to do is make sure the called program inherits SIG_DFL > >> handling for SIGPIPE, and then special-case that result as not being > >> a failure. The attached POC patch does that and successfully makes > >> the file_fdw problem go away for me. > > Thanks for working on this! > > >> It's just a POC because there are some things that need more thought > >> than I've given them: > >> > >> 1. Is it OK to revert SIGPIPE to default processing for *all* programs > >> launched through OpenPipeStream? If not, what infrastructure do we > >> need to add to control that? In particular, is it sane to revert > >> SIGPIPE for a pipe program that we will write to not read from? > >> (I thought probably yes, because that is the normal Unix behavior, > >> but it could be debated.) > > > > Forcing to ignore SIGPIPE lets the program to handle EPIPE, which > > might not be handled poperly since it would be unexpected by the > > program. > > > > If we don't read from the pipe (that is, we are writing to it), > > anyway we shouldn't change the behavior since SIGPIPE can come > > from another pipe. > > I'm a bit confused here. Horiguchi-san, are you saying that the > called program that we will read from should inherit SIG_DFL for > SIGPIPE, as proposed in the POC patch, but the called program that we > will write to should inherit SIG_IGN as it currently does? Maybe yes. The understanding of the first paragram looks right. But in my second paragraph, I said that we shouldn't set SIGPIPE to other than SIG_DFL at exec() time even if we are to read the pipe because it can be writing to another pipe and the change can affect the program's behavior. > ISTM that it would be OK to inherit SIG_DFL in both cases, because I > think it would be the responsibility of the called program to handle > SIGPIPE properly (if necessary) in both cases. Maybe I'm missing > something, though. So, I think we *should* (not just OK to) restore SIGPIPE to SIG_DFL in any case here to prevent undetermined situation for the program. > >> 2. Likewise, is it always OK for ClosePipeToProgram to ignore a > >> SIGPIPE failure? (For ordinary COPY, maybe it shouldn't, since > >> we don't intend to terminate that early.) > > > > Since the SIGPIPE may come from another pipe, I think we > > shouldn't generally. > > Agreed; if ClosePipeToProgram ignores that failure, we would fail to > get a better error message in CopySendEndOfRow if the called program > (inheriting SIG_DFL for SIGPIPE) was terminated on SIGPIPE: > > if (cstate->is_program) > { > if (errno == EPIPE) > { > /* > * The pipe will be closed automatically on error at > * the end of transaction, but we might get a better > * error message from the subprocess' exit code than > * just "Broken Pipe" > */ > ClosePipeToProgram(cstate); > > /* > * If ClosePipeToProgram() didn't throw an error, the > * program terminated normally, but closed the pipe > * first. Restore errno, and throw an error. > */ > errno = EPIPE; > } > ereport(ERROR, > (errcode_for_file_access(), > errmsg("could not write to COPY program: %m"))); > } Mmm, that's EPIPE of fwrite, not SIGNALLED&SIGPIPE of called program's exit status. So it is irrelevant to called program's SIGPIPE setup. It requires SIGPIPE to be kept to SIG_IGN on the backend side. > > But iff we are explicitly stop reading from > > the pipe before detecting an error, it can be ignored since we > > aren't interested in further failure. > > You mean that we should ignore other failures of the called program > when we stop reading from the pipe early? Yes, we have received sufficient data from the pipe then closed it successfully. The program may want write more but we don't want it. We ragher should ignore SIGPIPE exit code of the program since closing our writing end of a pipe is likely to cause it and even if it comes from another pipe, we can assume that the SIGPIPE immediately stopped the program before returning any garbage to us. > > Thus ClosePipeToProgram > > might need an extra parameter or CopyState may need an additional > > flag named, say, "explicit_close" (or > > "ignore_sigpipe_on_pclose"..) in CopyState which will be set in > > EndCopy. > > > >> 3. Maybe this should be implemented at some higher level? > > > > I'm not sure, but it seems to me not needed. > > ISTM that it's a good idea to control the ClosePipeToProgram behavior > by a new member for CopyState. > > >> 4. Are there any other signals we ought to be reverting to default > >> behavior before launching a COPY TO/FROM PROGRAM? > > > > All handlers other than SIG_DEF and IGN are reset on exec(). > > Only SIGUSR2 seems to be set to SIG_IGN, but it's not likely to > > harm anything. Perhaps it would be safer against future changes > > if we explicitly reset all changed actions to SIG_DEF, but it > > might be too-much.. > > Not sure, but reverting SIGPIPE to default would be enough as a fix > for the original issue, if we go with the POC patch. Agreed. I wonder why there's no API that resets all handlers to SIG_DFL at once or some flag telling to exec() that it should start with default signal handler set. I tried that but found that fread() doesn't detect termination of called program by a signal as an error. So just set a flag in EndCopyFrom() ends with concealing of the error. Finally in the attached PoC, I set cstate->eof_reached on failure of NextCopyFromRawFields then if eof_reached we don't ingore SIGPIPE. For a program like "puts("test1");fflush(stdout);kill(getpid(), SIGPIPE);", I had the following behavior. (I got an error without the fflush() but it is inevitable). =# select * from ft2 limit 1; a ------- test1 =# select * from ft2 limit 2; ERROR: program "/home/horiguti/work/exprog" failed DETAIL: child process was terminated by signal 13: Broken pipe For the original case: =# select * from ft1 limit 1; a ------ test =# select * from ft1 limit 2; a ------ test (1 row) I didn't confirmed whether it is complete. regards. -- Kyotaro Horiguchi NTT Open Source Software Center diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index b58a74f4e3..1cff0a4221 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -219,6 +219,12 @@ typedef struct CopyStateData char *raw_buf; int raw_buf_index; /* next byte to process */ int raw_buf_len; /* total # of bytes stored */ + + /* + * We ignore SIGPIPE of the called proram in ClosePipeToProgram() when we + * don't see an EOF yet from the reading end of a pipe to a program. + */ + bool eof_reached; } CopyStateData; /* DestReceiver for COPY (query) TO */ @@ -1727,11 +1733,21 @@ ClosePipeToProgram(CopyState cstate) (errcode_for_file_access(), errmsg("could not close pipe to external command: %m"))); else if (pclose_rc != 0) + { + /* + * If the called program terminated on SIGPIPE, assume it's OK; we + * must have chosen to stop reading its output early. + */ + if (!cstate->eof_reached && + WIFSIGNALED(pclose_rc) && WTERMSIG(pclose_rc) == SIGPIPE) + return; + ereport(ERROR, (errcode(ERRCODE_EXTERNAL_ROUTINE_EXCEPTION), errmsg("program \"%s\" failed", cstate->filename), errdetail_internal("%s", wait_result_to_str(pclose_rc)))); + } } /* @@ -3525,7 +3541,10 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext, /* read raw fields in the next line */ if (!NextCopyFromRawFields(cstate, &field_strings, &fldct)) + { + cstate->eof_reached = true; return false; + } /* check for overflowing fields */ if (nfields > 0 && fldct > nfields) diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c index 8dd51f1767..6cec85e2c1 100644 --- a/src/backend/storage/file/fd.c +++ b/src/backend/storage/file/fd.c @@ -2430,11 +2430,16 @@ OpenTransientFilePerm(const char *fileName, int fileFlags, mode_t fileMode) * Routines that want to initiate a pipe stream should use OpenPipeStream * rather than plain popen(). This lets fd.c deal with freeing FDs if * necessary. When done, call ClosePipeStream rather than pclose. + * + * This function also ensures that the popen'd program is run with default + * SIGPIPE processing, rather than the SIG_IGN setting the backend normally + * uses. This ensures desirable response to, eg, closing a read pipe early. */ FILE * OpenPipeStream(const char *command, const char *mode) { FILE *file; + int save_errno; DO_DB(elog(LOG, "OpenPipeStream: Allocated %d (%s)", numAllocatedDescs, command)); @@ -2452,8 +2457,13 @@ OpenPipeStream(const char *command, const char *mode) TryAgain: fflush(stdout); fflush(stderr); + pqsignal(SIGPIPE, SIG_DFL); errno = 0; - if ((file = popen(command, mode)) != NULL) + file = popen(command, mode); + save_errno = errno; + pqsignal(SIGPIPE, SIG_IGN); + errno = save_errno; + if (file != NULL) { AllocateDesc *desc = &allocatedDescs[numAllocatedDescs]; @@ -2466,8 +2476,6 @@ TryAgain: if (errno == EMFILE || errno == ENFILE) { - int save_errno = errno; - ereport(LOG, (errcode(ERRCODE_INSUFFICIENT_RESOURCES), errmsg("out of file descriptors: %m; release and retry")));
pgsql-bugs by date: