Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT - Mailing list pgsql-bugs

From Kyotaro HORIGUCHI
Subject Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT
Date
Msg-id 20181107.092245.117036769.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT
Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT
List pgsql-bugs
At Tue, 06 Nov 2018 21:07:37 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in
<5BE18409.2070004@lab.ntt.co.jp>
> (2018/11/06 12:53), Kyotaro HORIGUCHI wrote:
> > At Fri, 02 Nov 2018 22:05:36 +0900, Etsuro
> > Fujita<fujita.etsuro@lab.ntt.co.jp> wrote
> > in<5BDC4BA0.7050106@lab.ntt.co.jp>
> >> (2018/10/29 15:58), Kyotaro HORIGUCHI wrote:
> >>> At Tue, 23 Oct 2018 13:21:31 +0100, Tom Lane<tgl@sss.pgh.pa.us>  wrote
> >>> in<18397.1540297291@sss.pgh.pa.us>
> >>>> It's just a POC because there are some things that need more thought
> >>>> than I've given them:
> >>>>
> >>>> 1. Is it OK to revert SIGPIPE to default processing for *all* programs
> >>>> launched through OpenPipeStream?  If not, what infrastructure do we
> >>>> need to add to control that?  In particular, is it sane to revert
> >>>> SIGPIPE for a pipe program that we will write to not read from?
> >>>> (I thought probably yes, because that is the normal Unix behavior,
> >>>> but it could be debated.)
> 
> >> ISTM that it would be OK to inherit SIG_DFL in both cases, because I
> >> think it would be the responsibility of the called program to handle
> >> SIGPIPE properly (if necessary) in both cases.  Maybe I'm missing
> >> something, though.
> >
> > So, I think we *should* (not just OK to) restore SIGPIPE to
> > SIG_DFL in any case here to prevent undetermined situation for
> > the program.
> 
> OK
> 
> >>>> 2. Likewise, is it always OK for ClosePipeToProgram to ignore a
> >>>> SIGPIPE failure?  (For ordinary COPY, maybe it shouldn't, since
> >>>> we don't intend to terminate that early.)
> >>>
> >>> Since the SIGPIPE may come from another pipe, I think we
> >>> shouldn't generally.
> >>
> >> Agreed; if ClosePipeToProgram ignores that failure, we would fail to
> >> get a better error message in CopySendEndOfRow if the called program
> >> (inheriting SIG_DFL for SIGPIPE) was terminated on SIGPIPE:
> >>
> >>                  if (cstate->is_program)
> >>                  {
> >>                      if (errno == EPIPE)
> >>                      {
> >>                          /*
> >>                           * The pipe will be closed automatically on error
> >>                           * at
> >>                           * the end of transaction, but we might get a
> >>                           * better
> >>                           * error message from the subprocess' exit code
> >>                           * than
> >>                           * just "Broken Pipe"
> >>                           */
> >>                          ClosePipeToProgram(cstate);
> >>
> >>                          /*
> >>                           * If ClosePipeToProgram() didn't throw an error,
> >>                           * the
> >>                           * program terminated normally, but closed the
> >>                           * pipe
> >>                           * first. Restore errno, and throw an error.
> >>                           */
> >>                          errno = EPIPE;
> >>                      }
> >>                      ereport(ERROR,
> >>                              (errcode_for_file_access(),
> >>                               errmsg("could not write to COPY program:
> >>                               %m")));
> >>                  }
> >
> > Mmm, that's EPIPE of fwrite, not SIGNALLED&SIGPIPE of called
> > program's exit status. So it is irrelevant to called program's
> > SIGPIPE setup. It requires SIGPIPE to be kept to SIG_IGN on the
> > backend side.
> 
> My explanation might not be enough.  Let me explain.  If the called
> program that inherited SIG_DFL for SIGPIPE terminated on SIGPIPE for
> some reason, ClosePipeToProgram *as-is* would create an error message
> from that program's exit code.  But if we modify ClosePipeToProgram
> like the original POC patch, that function would not create that
> message for that termination.  To avoid that, I think it would be
> better for ClosePipeToProgram to ignore the SIGPIPE failure only in
> the case where the caller is a COPY FROM PROGRAM that is allowed to
> terminate early. (Maybe we could specify that as a new argument for
> BeginCopyFrom.)

I understood. Thanks for the explanation. I agree to that.

> >>> But iff we are explicitly stop reading from
> >>> the pipe before detecting an error, it can be ignored since we
> >>> aren't interested in further failure.
> >>
> >> You mean that we should ignore other failures of the called program
> >> when we stop reading from the pipe early?
> >
> > Yes, we have received sufficient data from the pipe then closed
> > it successfully. The program may want write more but we don't
> > want it. We ragher should ignore SIGPIPE exit code of the program
                rather
> > since closing our writing end of a pipe is likely to cause it and
> 
> I think so too.
> 
> > even if it comes from another pipe, we can assume that the
> > SIGPIPE immediately stopped the program before returning any
> > garbage to us.
> 
> Sorry, I don't understand this.

Mmm. It looks confusing, sorry. In other words:

We don't care the reason for program's ending if we received
enough data from the program and we actively (= before receiving
EOF) stop reading. On the other hand if SIGPIPE (on another pipe)
or something fatal happened on the program before we end reading,
we receive EOF just after that and we are interested in the cause
of the program's death.

# Does the above make sense?

In the sense of "We don't care the reason", negligible reasons
are necessariry restricted to SIGPIPE, evan SIGSEGV could be
theoretically ignored safely. "theoretically" here means it is
another issue whether we rely on the output from a program which
causes SEGV (or any reason other than SIGPIPE, which we caused).


> >>>> 4. Are there any other signals we ought to be reverting to default
> >>>> behavior before launching a COPY TO/FROM PROGRAM?
> >>>
> >>> All handlers other than SIG_DEF and IGN are reset on exec().
> >>> Only SIGUSR2 seems to be set to SIG_IGN, but it's not likely to
> >>> harm anything. Perhaps it would be safer against future changes
> >>> if we explicitly reset all changed actions to SIG_DEF, but it
> >>> might be too-much..
> >>
> >> Not sure, but reverting SIGPIPE to default would be enough as a fix
> >> for the original issue, if we go with the POC patch.
> >
> > Agreed. I wonder why there's no API that resets all handlers to
> > SIG_DFL at once or some flag telling to exec() that it should
> > start with default signal handler set.
> 
> Such API would be an improvement, but IMO I think that would go beyond
> the scope of this fix.

Ah, I wanted OS or standard library to provide it. Not us. Anyway
it is not relevant to the topic here.

> > Finally in the attached PoC, I set cstate->eof_reached on failure
> > of NextCopyFromRawFields then if eof_reached we don't ingore
> > SIGPIPE. For a program like
> > "puts("test1");fflush(stdout);kill(getpid(), SIGPIPE);", I had
> > the following behavior. (I got an error without the fflush() but
> > it is inevitable).
> >
> > =# select * from ft2 limit 1;
> >     a
> > -------
> >   test1
> >
> > =# select * from ft2 limit 2;
> > ERROR:  program "/home/horiguti/work/exprog" failed
> > DETAIL:  child process was terminated by signal 13: Broken pipe
> >
> > For the original case:
> >
> > =# select * from ft1 limit 1;
> >    a
> > ------
> >   test
> > =# select * from ft1 limit 2;
> >    a
> > ------
> >   test
> > (1 row)
> >
> >
> > I didn't confirmed whether it is complete.
> 
> Sorry, I don't understand this fully, but the reason to add the
> eof_reached stuff is to avoid ignoring the SIGPIPE failure in normal
> COPY FROM PROGRAM cases?

Yes, if we have received EOF, the program ended before we
finished reading from the pipe thus any error should be
reported. Elsewise we don't care at least SIGPIPE. In the
attached patch all kind of errors are ignored when pipe is closed
from reading-end on backends.

As the result it doesn't report an error for SELECT * FROM ft2
LIMIT 1 on "main(void){puts("test1"); return 1;}".

=#  select * from ft limit 1;
   a   
-------
 test1
(1 row)

limit 2 reports the error.

=# select * from ft limit 2;
ERROR:  program "/home/horiguti/work/exprog" failed
DETAIL:  child process exited with exit code 1

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index b58a74f4e3..c7728f6d6e 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -219,6 +219,12 @@ typedef struct CopyStateData
     char       *raw_buf;
     int            raw_buf_index;    /* next byte to process */
     int            raw_buf_len;    /* total # of bytes stored */
+
+    /*
+     * We ignore errors signaled from called proram in ClosePipeToProgram()
+     * when we close our side of pipe before we receive EOF.
+     */
+    bool        eof_reached;
 } CopyStateData;
 
 /* DestReceiver for COPY (query) TO */
@@ -1726,12 +1732,19 @@ ClosePipeToProgram(CopyState cstate)
         ereport(ERROR,
                 (errcode_for_file_access(),
                  errmsg("could not close pipe to external command: %m")));
-    else if (pclose_rc != 0)
+    else if (cstate->eof_reached && pclose_rc != 0)
+    {
+        /*
+         * If the called program terminated on error after we closed the pipe
+         * on our reading end, assume it's OK; we must have chosen to stop
+         * reading its output early.
+         */
         ereport(ERROR,
                 (errcode(ERRCODE_EXTERNAL_ROUTINE_EXCEPTION),
                  errmsg("program \"%s\" failed",
                         cstate->filename),
                  errdetail_internal("%s", wait_result_to_str(pclose_rc))));
+    }
 }
 
 /*
@@ -3525,7 +3538,10 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 
         /* read raw fields in the next line */
         if (!NextCopyFromRawFields(cstate, &field_strings, &fldct))
+        {
+            cstate->eof_reached = true;
             return false;
+        }
 
         /* check for overflowing fields */
         if (nfields > 0 && fldct > nfields)
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 8dd51f1767..6cec85e2c1 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -2430,11 +2430,16 @@ OpenTransientFilePerm(const char *fileName, int fileFlags, mode_t fileMode)
  * Routines that want to initiate a pipe stream should use OpenPipeStream
  * rather than plain popen().  This lets fd.c deal with freeing FDs if
  * necessary.  When done, call ClosePipeStream rather than pclose.
+ *
+ * This function also ensures that the popen'd program is run with default
+ * SIGPIPE processing, rather than the SIG_IGN setting the backend normally
+ * uses.  This ensures desirable response to, eg, closing a read pipe early.
  */
 FILE *
 OpenPipeStream(const char *command, const char *mode)
 {
     FILE       *file;
+    int            save_errno;
 
     DO_DB(elog(LOG, "OpenPipeStream: Allocated %d (%s)",
                numAllocatedDescs, command));
@@ -2452,8 +2457,13 @@ OpenPipeStream(const char *command, const char *mode)
 TryAgain:
     fflush(stdout);
     fflush(stderr);
+    pqsignal(SIGPIPE, SIG_DFL);
     errno = 0;
-    if ((file = popen(command, mode)) != NULL)
+    file = popen(command, mode);
+    save_errno = errno;
+    pqsignal(SIGPIPE, SIG_IGN);
+    errno = save_errno;
+    if (file != NULL)
     {
         AllocateDesc *desc = &allocatedDescs[numAllocatedDescs];
 
@@ -2466,8 +2476,6 @@ TryAgain:
 
     if (errno == EMFILE || errno == ENFILE)
     {
-        int            save_errno = errno;
-
         ereport(LOG,
                 (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
                  errmsg("out of file descriptors: %m; release and retry")));

pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #15488: Unexpected behaviour of to_tsverctor and ts_query
Next
From: "M Zav."
Date:
Subject: postgres_fdw