Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT - Mailing list pgsql-bugs
From | Etsuro Fujita |
---|---|
Subject | Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT |
Date | |
Msg-id | 5BE552ED.4040304@lab.ntt.co.jp |
In response to | Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses |
Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT
Re: BUG #15449: file_fdw using program cause exit code error whenusing LIMIT |
List | pgsql-bugs |
(2018/11/09 14:39), Kyotaro HORIGUCHI wrote:
> At Thu, 08 Nov 2018 21:52:31 +0900, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote in <5BE4318F.4040002@lab.ntt.co.jp>
>> (2018/11/08 10:50), Thomas Munro wrote:
>>> I take back what I said earlier about false positives from other
>>> pipes. I think it's only traditional Unix programs designed for use
>>> in pipelines and naive programs that let SIGPIPE terminate the
>>> process. The accepted answer here gives a good way to think about
>>> it:
>>>
>>> https://stackoverflow.com/questions/8369506/why-does-sigpipe-exist
>>
>> Thanks for the information!
>>
>>> A program sophisticated enough to be writing to other pipes is no
>>> longer in that category and should be setting up signal dispositions
>>> itself, so I agree that we should enable the default disposition and
>>> ignore WTERMSIG(exit_code) == SIGPIPE, as proposed. That is pretty
>>> close to the intended purpose of that signal AFAICS.
>>
>> Great!
>>
>>>>> In the sense of "We don't care the reason", negligible reasons
>>>>> are necessarily restricted to SIGPIPE, even SIGSEGV could be
>>>>> theoretically ignored safely. "theoretically" here means it is
>>>>> another issue whether we rely on the output from a program which
>>>>> causes SEGV (or any reason other than SIGPIPE, which we caused).
>>>>
>>>> For the SIGSEGV case, I think it would be better that we don't rely on
>>>> the output data, IMO, because I think there might be a possibility that
>>>> the program have generated that data incorrectly/unexpectedly.
>>>
>>> +1
>>>
>>> I don't think we should ignore termination by signals other than
>>> SIGPIPE: that could hide serious problems from users. I want to know
>>> if my program is crashing with SIGBUS, SIGTERM, SIGFPE etc, even if it
>>> happens after we read enough data; there is a major problem that a
>>> human needs to investigate!
>>
>> I think so too.
>
> Ok, I can live with that with no problem.

OK

>>>>> As the result it doesn't report an error for SELECT * FROM ft2
>>>>> LIMIT 1 on "main(void){puts("test1"); return 1;}".
>>>>>
>>>>> =# select * from ft limit 1;
>>>>>    a
>>>>> -------
>>>>>  test1
>>>>> (1 row)
>>>>>
>>>>> limit 2 reports the error.
>>>>>
>>>>> =# select * from ft limit 2;
>>>>> ERROR: program "/home/horiguti/work/exprog" failed
>>>>> DETAIL: child process exited with exit code 1
>>>>
>>>> I think this would be contrary to users expectations: if the SELECT
>>>> command works for limit 1, they would expect that the command would
>>>> work for limit 2 as well. So, I think it would be better to error out
>>>> that command for limit 1 as well, as-is.
>>>
>>> I think it's correct that LIMIT 1 gives no error but LIMIT 2 gives an
>>> error. For LIMIT 1, we got all the rows we wanted, and then we closed
>>> the pipe. If we got a non-zero non-signal exit code, or a signal exit
>>> code and it was SIGPIPE (not any other signal!), then we should
>>> consider that to be expected.
>>
>> Maybe I'm missing something, but the non-zero non-signal exit code
>> means that there was something wrong with the called program, so I
>> think a human had better investigate that as well IMO, which would
>> probably be a minor problem, though. Too restrictive?
>
> I think Thomas just saying that reading more lines can develop
> problems. According to the current discussion, we should error
> out if we had SEGV when limit 1.

Ah, I misread that. Sorry for the noise.

>>> On Wed, Nov 7, 2018 at 4:44 PM Etsuro Fujita
>>> <fujita.etsuro@lab.ntt.co.jp> wrote:
>>>> (2018/11/06 19:50), Thomas Munro wrote:
>>>>> On my FreeBSD system, I compared the output of procstat -i (= show
>>>>> signal disposition) for two "sleep 60" processes, one invoked from the
>>>>> shell and the other from COPY ... FROM PROGRAM. The differences were:
>>>>> PIPE, TTIN, TTOU and USR2. For the first and last of those, the
>>>>> default action would be to terminate the process, but the COPY PROGRAM
>>>>> child ignored them; for TTIN and TTOU, the default action would be to
>>>>> stop the process, but again they are ignored. Why do bgwriter.c,
>>>>> startup.c, ... set SIGTTIN and SIGTTOU back to SIG_DFL, but not
>>>>> regular backends?
>>>>
>>>> So, we should revert SIGUSR2 as well to default processing?
>>>
>>> I don't think it matters in practice, but it might be nice to restore
>>> that just for consistency.
>>
>> Agreed.
>>
>>> I'm not sure what to think about the TTIN,
>>> TTOU stuff; I don't understand job control well right now but I don't
>>> think it really applies to programs run by a PostgreSQL backend, so if
>>> we restore those it'd probably again be only for consistency. Then
>>> again, there may be a reason someone decided to ignore those in the
>>> postmaster + regular backends but not the various auxiliary processes.
>>> Anyone?
>>
>> I don't have any idea about that.
>
> In my understanding processes not connected to a
> terminal(tty/pts) cannot receive TTIN/TTOU (unless someone sent
> it artificially). Since child processes are detached by setsid()
> (on Linux), programs called in that way also won't have a
> controlling terminal at the start time and I suppose they have no
> means to connect to one since they are no longer on the same
> session with postmaster.

For TTIN and TTOU, we would first need to make clear the reason for the
inconsistency Thomas pointed out. I'm wondering if we should leave the
TTIN/TTOU stuff for future work.

Best regards,
Etsuro Fujita
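
The rule discussed in this message (run the child with the default SIGPIPE disposition, and treat termination by SIGPIPE as expected only when the reader closed the pipe early) can be sketched in C roughly as follows. This is an illustrative sketch, not the patch under discussion: exit_status_is_error() and run_writer_and_close_early() are hypothetical names, and treating a non-zero exit code as an error even after an early close is an assumption, since the thread had not settled that point at this stage.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: decide whether a wait status should be reported.
 * closed_early is true when the reader closed the pipe before EOF
 * (the LIMIT 1 case in the thread).
 */
static bool
exit_status_is_error(int status, bool closed_early)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status) != 0;    /* assumption: still an error */
    if (WIFSIGNALED(status))
        return !(closed_early && WTERMSIG(status) == SIGPIPE);
    return true;                            /* anything else is unexpected */
}

/*
 * Hypothetical demo: fork a writer that emits rows forever, read one byte,
 * close the pipe, and collect the child's wait status.
 * Returns 0 on success, -1 on failure.
 */
static int
run_writer_and_close_early(int *status)
{
    int     fds[2];
    pid_t   pid;
    char    buf[1];
    ssize_t nread;

    assert(status != NULL);
    if (pipe(fds) != 0)
        return -1;
    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0)
    {
        /* Child: restore the default disposition before producing output. */
        signal(SIGPIPE, SIG_DFL);
        close(fds[0]);
        for (;;)
        {
            /* Reached only if SIGPIPE did not terminate us (EPIPE etc.). */
            if (write(fds[1], "test1\n", 6) < 0)
                _exit(1);
        }
    }
    /* Parent: take what we need, then close the read end early. */
    close(fds[1]);
    nread = read(fds[0], buf, sizeof(buf));
    close(fds[0]);
    if (waitpid(pid, status, 0) != pid || nread < 0)
        return -1;
    return 0;
}
```

Calling run_writer_and_close_early(&status) and passing the result to exit_status_is_error(status, true) models the LIMIT 1 case; passing false models a read that ran to EOF, where the same SIGPIPE termination would still be reported.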