Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
Date
Msg-id 006101ce14b4$36525fc0$a2f71f40$@kapila@huawei.com
Whole thread Raw
In response to Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
List pgsql-hackers
On Wednesday, February 27, 2013 12:41 AM Heikki Linnakangas wrote:
> On 22.02.2013 12:43, Etsuro Fujita wrote:
> >>>>>> 1. "Broken pipe" is not handled in case of psql "\copy" command;
> >>>>>>      Issue are as follows:
> >>>>>>          Following are verified on SuSE-Linux 10.2.
> >>>>>>          1) psql is exiting when "\COPY xxx TO" command is
> issued
> >>>>>> and
> >>>> command/script is not found
> >>>>>>                   When popen is called in write mode it is
> >>>>>> creating valid
> >>>> file descriptor and when it tries to write to file "Broken pipe"
> >>> error
> >>>> is>  coming which is not handled.
> >>>>>>                          psql# \copy pgbench_accounts TO PROGRAM
> >>>> '../compress.sh pgbench_accounts4.txt'
> >>>>>>          2) When "\copy" command is in progress then
> >>> program/command
> >>>>>> is
> >>>> killed/"crashed due to any problem"
> >>>>>>             psql is exiting.
> >>>>
> >>>>> This is a headache.  I have no idea how to solve this.
> >>>>
> >>>> I think we can keep it for committer to take a call on this issue.
> >>>
> >>> Agreed.
> 
> Fortunately this one is easy. We just need to ignore SIGPIPE, by
> calling pqsignal(SIGPIPE, SIG_IGN) before popen(). We already do the
> same in psql when the query output is piped to a pager, in PageOutput.

So are you planning to call the same in do_copy as well; in current patch
it is not there.

> >>>> 5. Following in copy.sgml can be changed to make more meaningful
> as
> >>>> the first line looks little adhoc.
> >>>> +<para>
> >>>> +      The command that input comes from or that output goes to.
> >>>> +      The command for COPY FROM, which input comes from, must
> >>>> +write  its
> >>>> output
> >>>> +      to standard output.  The command for COPY TO, which output
> >>> goes
> >>>> + to,
> >>>> must
> >>>> +      read its input from standard input.
> >>>> +</para>
> >>>
> >>> I've struggled to make the document more meaningful.
> >>
> >> To be honest, I am not sure whether introducing pre, post processor
> >> terminology is right or not, But again I shall let committer decide
> >> about this point.
> >
> > Agreed.
> 
> I changed the terminology to use terms like "the command specified with
> PROGRAM", instead of talking about pre- and post-processsors.
> 
> >> I have one small another doubt that in function parse_slash_copy,
> you
> >> avoided expand tilde for program case, which I am not sure is the
> >> right thing or not.
> >
> > Sorry, I'm not sure that, too.  I'd like to leave this for
> committers.
> 
> That seems correct. The shell will do tilde expansion with popen(), we
> don't need to do it ourselves.
> 
> There's one oddity in the psql \copy syntax. This is actually present
> in earlier versions too, but I think we should fix it now that we
> extend the syntax:
> 
>   -- Writes to standard output. There's no way to write to a file
> called "stdout".
>   \copy foo to 'stdout'
> 
> I think we should change the interpretation of the above so that it
> writes to a file called "stdout". It's possible that there's an
> application out there that relies on that to write to stdout, but it's
> not hard to fix the application. A small note in the release notes
> might be in order.
> 
> Also, I think we should require the command to be quoted in \copy. Ie.
> let's forbid this:
> 
> \copy pgbench_accounts to program /bin/cat>/dev/null
> 
> and require that it's written as:
> 
> \copy pgbench_accounts to program '/bin/cat>/dev/null'
> 
> We don't require quoting for filenames, which I find a bit weird too,
> but it seems particularly confusing for a shell command.
> 
> Attached is a new version of this patch that I almost committed, but
> one thing caught my eye at the last minute: The error reporting is not
> very user-friendly. If the program fails after reading/writing some
> rows, the reason is printed to the log, but not the user:
> 
> postgres=# copy foo to program '/tmp/midway-crash';
> ERROR:  could not execute command "/tmp/midway-crash"
> 
> the log has more details:
> 
> LOG:  child process exited with exit code 10
> STATEMENT:  copy foo to program '/tmp/midway-crash';
> ERROR:  could not execute command "/tmp/midway-crash"
> STATEMENT:  copy foo to program '/tmp/midway-crash';
> 
> I think we should arrange for the "child process exited with exit code
> 10" to be printed as errdetail to the client. Similarly, with psql
> \copy command, the "child process exited with exit code 10" command
> shouldn't be printed straight to stderr, but should go through
> psql_error.
> 
> I'll try to refactor pclose_check tomorrow to do that. Meanwhile,
> please take a look at the attached if you have the time. I rewrote much
> of the docs changes, and some comments.

If there is semicolon at end, it takes it into account for creating
filename.
\copy foo to c:\data\foo.out;
This problem was fixed in previous patch, but I think handling of quotes has
changed this behavior.

With Regards,
Amit Kapila.




pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: bugfix: --echo-hidden is not supported by \sf statements
Next
From: Miroslav Šimulčík
Date:
Subject: Check if trigger was fired deferred