Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT - Mailing list pgsql-bugs

From Etsuro Fujita
Subject Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT
Date
Msg-id 5BE552ED.4040304@lab.ntt.co.jp
In response to Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT
Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT
List pgsql-bugs
(2018/11/09 14:39), Kyotaro HORIGUCHI wrote:
> At Thu, 08 Nov 2018 21:52:31 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote
> in <5BE4318F.4040002@lab.ntt.co.jp>
>> (2018/11/08 10:50), Thomas Munro wrote:
>>> I take back what I said earlier about false positives from other
>>> pipes.  I think it's only traditional Unix programs designed for use
>>> in pipelines and naive programs that let SIGPIPE terminate the
>>> process.   The accepted answer here gives a good way to think about
>>> it:
>>>
>>> https://stackoverflow.com/questions/8369506/why-does-sigpipe-exist
>>
>> Thanks for the information!
>>
>>> A program sophisticated enough to be writing to other pipes is no
>>> longer in that category and should be setting up signal dispositions
>>> itself, so I agree that we should enable the default disposition and
>>> ignore WTERMSIG(exit_code) == SIGPIPE, as proposed.  That is pretty
>>> close to the intended purpose of that signal AFAICS.
>>
>> Great!
>>
>>>>> In the sense of "We don't care the reason", negligible reasons
>>>>> are necessarily restricted to SIGPIPE, even SIGSEGV could be
>>>>> theoretically ignored safely. "theoretically" here means it is
>>>>> another issue whether we rely on the output from a program which
>>>>> causes SEGV (or any reason other than SIGPIPE, which we caused).
>>>>
>>>> For the SIGSEGV case, I think it would be better that we don't rely on
>>>> the output data, IMO, because I think there might be a possibility
>>>> that the program has generated that data incorrectly/unexpectedly.
>>>
>>> +1
>>>
>>> I don't think we should ignore termination by signals other than
>>> SIGPIPE: that could hide serious problems from users.  I want to know
>>> if my program is crashing with SIGBUS, SIGTERM, SIGFPE etc, even if it
>>> happens after we read enough data; there is a major problem that a
>>> human needs to investigate!
>>
>> I think so too.
>
> Ok, I can live with that with no problem.

OK
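
To make the rule we seem to agree on concrete, here is a minimal sketch in C. It is not the actual patch: killed_by_sigpipe() and child_wait_status() are hypothetical names made up for illustration, and the sketch only covers the signal case (termination by SIGPIPE is treated as expected, termination by any other signal is not). When to apply the check, and how to treat a non-zero exit code, is left out on purpose.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): true only if the raw wait
 * status says the child was terminated by SIGPIPE.
 */
static bool
killed_by_sigpipe(int wait_status)
{
    return WIFSIGNALED(wait_status) && WTERMSIG(wait_status) == SIGPIPE;
}

/*
 * Demo-only helper: fork a child that either raises "sig" with its default
 * disposition (sig != 0) or exits with "exit_code" (sig == 0), and return
 * the raw wait status reported by waitpid().
 */
static int
child_wait_status(int sig, int exit_code)
{
    int     status = 0;
    pid_t   waited;
    pid_t   pid = fork();

    assert(pid >= 0);           /* demo-only sanity check */
    if (pid == 0)
    {
        if (sig != 0)
        {
            signal(sig, SIG_DFL);   /* make sure the default action applies */
            raise(sig);
        }
        _exit(exit_code);
    }

    waited = waitpid(pid, &status, 0);
    assert(waited == pid);      /* demo-only sanity check */
    (void) waited;
    return status;
}
```

With this, killed_by_sigpipe(child_wait_status(SIGPIPE, 0)) is true, while a child killed by SIGTERM or one that simply exits with code 1 is not classified as ignorable.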

>>>>> As the result it doesn't report an error for SELECT * FROM ft2
>>>>> LIMIT 1 on "main(void){puts("test1"); return 1;}".
>>>>>
>>>>> =#  select * from ft limit 1;
>>>>>       a
>>>>> -------
>>>>>     test1
>>>>> (1 row)
>>>>>
>>>>> limit 2 reports the error.
>>>>>
>>>>> =# select * from ft limit 2;
>>>>> ERROR:  program "/home/horiguti/work/exprog" failed
>>>>> DETAIL:  child process exited with exit code 1
>>>>
>>>> I think this would be contrary to users' expectations: if the SELECT
>>>> command works for limit 1, they would expect that the command would
>>>> work for limit 2 as well.  So, I think it would be better to error out
>>>> that command for limit 1 as well, as-is.
>>>
>>> I think it's correct that LIMIT 1 gives no error but LIMIT 2 gives an
>>> error.  For LIMIT 1, we got all the rows we wanted, and then we closed
>>> the pipe.  If we got a non-zero non-signal exit code, or a signal exit
>>> code and it was SIGPIPE (not any other signal!), then we should
>>> consider that to be expected.
>>
>> Maybe I'm missing something, but the non-zero non-signal exit code
>> means that there was something wrong with the called program, so I
>> think a human had better investigate that as well IMO, which would
>> probably be a minor problem, though.  Too restrictive?
>
> I think Thomas is just saying that reading more lines can develop
> problems. According to the current discussion, we should error
> out if we had SEGV with limit 1.

Ah, I misread that.  Sorry for the noise.

>>> On Wed, Nov 7, 2018 at 4:44 PM Etsuro Fujita
>>> <fujita.etsuro@lab.ntt.co.jp> wrote:
>>>> (2018/11/06 19:50), Thomas Munro wrote:
>>>>> On my FreeBSD system, I compared the output of procstat -i (= show
>>>>> signal disposition) for two "sleep 60" processes, one invoked from the
>>>>> shell and the other from COPY ... FROM PROGRAM.  The differences were:
>>>>> PIPE, TTIN, TTOU and USR2.  For the first and last of those, the
>>>>> default action would be to terminate the process, but the COPY PROGRAM
>>>>> child ignored them; for TTIN and TTOU, the default action would be to
>>>>> stop the process, but again they are ignored.  Why do bgwriter.c,
>>>>> startup.c, ... set SIGTTIN and SIGTTOU back to SIG_DFL, but not
>>>>> regular backends?
>>>>
>>>> So, we should revert SIGUSR2 as well to default processing?
>>>
>>> I don't think it matters in practice, but it might be nice to restore
>>> that just for consistency.
>>
>> Agreed.
>>
>>> I'm not sure what to think about the TTIN,
>>> TTOU stuff; I don't understand job control well right now but I don't
>>> think it really applies to programs run by a PostgreSQL backend, so if
>>> we restore those it'd probably again be only for consistency.  Then
>>> again, there may be a reason someone decided to ignore those in the
>>> postmaster + regular backends but not the various auxiliary processes.
>>> Anyone?
>>
>> I don't have any idea about that.
>
> In my understanding processes not connected to a
> terminal(tty/pts) cannot receive TTIN/TTOU (unless someone sent
> it artificially).  Since child processes are detached by setsid()
> (on Linux), programs called in that way also won't have a
> controlling terminal at the start time and I suppose they have no
> means to connect to one since they are no longer on the same
> session with postmaster.

For TTIN and TTOU, we would first need to clarify the reason for the
inconsistency Thomas pointed out.  I'm wondering if we should leave the
TTIN/TTOU stuff for future work.
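
For the part that is not in question (SIGPIPE, plus SIGUSR2 for consistency), the effect of restoring the default disposition in the child can be sketched as below. This is only an illustration in C, not the proposed change: run_writer_and_close_early() is a hypothetical name, the parent ignoring SIGPIPE stands in for a backend, and the child writing "test1" lines stands in for the called program. TTIN/TTOU are deliberately left untouched.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo (not PostgreSQL code).  The parent ignores SIGPIPE,
 * forks a child that keeps writing lines to a pipe, reads a single byte
 * and closes its end early, much like a scan stopped by LIMIT.  If
 * "restore_defaults" is non-zero the child resets SIGPIPE and SIGUSR2 to
 * SIG_DFL before writing.  Returns the child's raw wait status, or -1 if
 * the initial read failed.
 */
static int
run_writer_and_close_early(int restore_defaults)
{
    int     fds[2];
    int     status = 0;
    int     rc;
    char    c;
    pid_t   pid;
    pid_t   waited;

    signal(SIGPIPE, SIG_IGN);   /* inherited by the child across fork() */

    rc = pipe(fds);
    assert(rc == 0);            /* demo-only sanity check */
    (void) rc;

    pid = fork();
    assert(pid >= 0);           /* demo-only sanity check */

    if (pid == 0)
    {
        close(fds[0]);
        if (restore_defaults)
        {
            signal(SIGPIPE, SIG_DFL);
            signal(SIGUSR2, SIG_DFL);
        }
        for (;;)
        {
            /* With SIGPIPE ignored, write() fails with EPIPE instead. */
            if (write(fds[1], "test1\n", 6) < 0)
                _exit(1);
        }
    }

    close(fds[1]);
    if (read(fds[0], &c, 1) != 1)
    {
        close(fds[0]);
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }
    close(fds[0]);              /* stop reading early */

    waited = waitpid(pid, &status, 0);
    assert(waited == pid);      /* demo-only sanity check */
    (void) waited;
    return status;
}
```

With restore_defaults set, the child is terminated by SIGPIPE once the reader goes away; without it, the child sees a failed write() and, in this demo, exits with code 1.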

Best regards,
Etsuro Fujita

