Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy |
Date | |
Msg-id | 512D08DD.8050109@vmware.com Whole thread Raw |
In response to | Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy ("Etsuro Fujita" <fujita.etsuro@lab.ntt.co.jp>) |
Responses |
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
|
List | pgsql-hackers |
On 22.02.2013 12:43, Etsuro Fujita wrote: >>>>>> 1. "Broken pipe" is not handled in case of psql "\copy" command; >>>>>> Issue are as follows: >>>>>> Following are verified on SuSE-Linux 10.2. >>>>>> 1) psql is exiting when "\COPY xxx TO" command is issued >>>>>> and >>>> command/script is not found >>>>>> When popen is called in write mode it is creating >>>>>> valid >>>> file descriptor and when it tries to write to file "Broken pipe" >>> error >>>> is> coming which is not handled. >>>>>> psql# \copy pgbench_accounts TO PROGRAM >>>> '../compress.sh pgbench_accounts4.txt' >>>>>> 2) When "\copy" command is in progress then >>> program/command >>>>>> is >>>> killed/"crashed due to any problem" >>>>>> psql is exiting. >>>> >>>>> This is a headache. I have no idea how to solve this. >>>> >>>> I think we can keep it for committer to take a call on this issue. >>> >>> Agreed. Fortunately this one is easy. We just need to ignore SIGPIPE, by calling pqsignal(SIGPIPE, SIG_IGN) before popen(). We already do the same in psql when the query output is piped to a pager, in PageOutput. >>>> 5. Following in copy.sgml can be changed to make more meaningful as >>>> the first line looks little adhoc. >>>> +<para> >>>> + The command that input comes from or that output goes to. >>>> + The command for COPY FROM, which input comes from, must write >>>> + its >>>> output >>>> + to standard output. The command for COPY TO, which output >>> goes >>>> + to, >>>> must >>>> + read its input from standard input. >>>> +</para> >>> >>> I've struggled to make the document more meaningful. >> >> To be honest, I am not sure whether introducing pre, post processor >> terminology is right or not, >> But again I shall let committer decide about this point. > > Agreed. I changed the terminology to use terms like "the command specified with PROGRAM", instead of talking about pre- and post-processsors. >> I have one small another doubt that in function parse_slash_copy, you >> avoided expand tilde >> for program case, which I am not sure is the right thing or not. > > Sorry, I'm not sure that, too. I'd like to leave this for committers. That seems correct. The shell will do tilde expansion with popen(), we don't need to do it ourselves. There's one oddity in the psql \copy syntax. This is actually present in earlier versions too, but I think we should fix it now that we extend the syntax: -- Writes to standard output. There's no way to write to a file called "stdout". \copy foo to 'stdout' I think we should change the interpretation of the above so that it writes to a file called "stdout". It's possible that there's an application out there that relies on that to write to stdout, but it's not hard to fix the application. A small note in the release notes might be in order. Also, I think we should require the command to be quoted in \copy. Ie. let's forbid this: \copy pgbench_accounts to program /bin/cat>/dev/null and require that it's written as: \copy pgbench_accounts to program '/bin/cat>/dev/null' We don't require quoting for filenames, which I find a bit weird too, but it seems particularly confusing for a shell command. Attached is a new version of this patch that I almost committed, but one thing caught my eye at the last minute: The error reporting is not very user-friendly. If the program fails after reading/writing some rows, the reason is printed to the log, but not the user: postgres=# copy foo to program '/tmp/midway-crash'; ERROR: could not execute command "/tmp/midway-crash" the log has more details: LOG: child process exited with exit code 10 STATEMENT: copy foo to program '/tmp/midway-crash'; ERROR: could not execute command "/tmp/midway-crash" STATEMENT: copy foo to program '/tmp/midway-crash'; I think we should arrange for the "child process exited with exit code 10" to be printed as errdetail to the client. Similarly, with psql \copy command, the "child process exited with exit code 10" command shouldn't be printed straight to stderr, but should go through psql_error. I'll try to refactor pclose_check tomorrow to do that. Meanwhile, please take a look at the attached if you have the time. I rewrote much of the docs changes, and some comments. - Heikki
Attachment
pgsql-hackers by date: