Thread: Review : Add hooks for pre- and post-processor executables for COPY and \copy

Review : Add hooks for pre- and post-processor executables for COPY and \copy

From
Amit Kapila
Date:
<div class="WordSection1"><p class="MsoNormal"><tt><b><span
style="font-size:10.0pt;font-family:"Arial","sans-serif"">Basicstuff:</span></b></tt><span
style="font-size:10.0pt;font-family:"Arial","sans-serif""><br/><tt><span
style="font-family:"Arial","sans-serif"">------------</span></tt><br/><tt><span
style="font-family:"Arial","sans-serif"">       - Rebase of Patch is required.</span></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">       - Compiles cleanly without any errors/warnings</span></tt><br
/><tt><spanstyle="font-family:"Arial","sans-serif"">        - Regression tests pass.</span></tt><br /><br
/><tt><b><spanstyle="font-family:"Arial","sans-serif"">What it does:</span></b></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">---------------------</span></tt><br/><tt><span
style="font-family:"Arial","sans-serif"">       This patch is useful when COPY command input/output are stored in
compressionformat or in any command/script uses these output/input in any means; without generating intermediate
temporaryfiles.</span></tt><br /><tt><span style="font-family:"Arial","sans-serif"">        This feature can be used in
serverside using "COPY statement" by administrator. Or can be used in psql internal "\copy" command by any
user.</span></tt><br/><br /><br /><tt><b><span style="font-family:"Arial","sans-serif"">Code Review
comments:</span></b></tt><br/><tt><span style="font-family:"Arial","sans-serif"">---------------------</span></tt><br
/>    <br/><tt><span style="font-family:"Arial","sans-serif"">1. Modify the comment in function header of:
parse_slash_copy</span></tt>(needs to modify for new syntax)<br /><tt><span style="font-family:"Arial","sans-serif"">2.
Commentsfor functions OpenPipeStream & ClosePipeStream are missing. <b>        </b></span></tt></span><p
class="MsoNormal"><tt><spanstyle="font-size:10.0pt;font-family:"Arial","sans-serif"">3. Any Script errors are not
directlyvisible to user; If there problems in script no way to cleanup.</span></tt><span
style="font-size:10.0pt;font-family:"Arial","sans-serif""><tt><span
style="font-family:"Arial","sans-serif"">  </span></tt></span><pclass="MsoNormal"><tt><span
style="font-size:10.0pt;font-family:"Arial","sans-serif"">  Shouldn’t this be mentioned in User
Manual.</span></tt><spanstyle="font-size:10.0pt;font-family:"Arial","sans-serif""><br /><br /><tt><b><span
style="font-family:"Arial","sans-serif"">Testcase issues:</span></b></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">------------------</span></tt><br/><tt><span
style="font-family:"Arial","sans-serif"">1."Broken pipe" is not handled in case of psql "\copy" command;</span></tt><br
/><tt><spanstyle="font-family:"Arial","sans-serif"">    Issue are as follows:</span></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">       Following are verified on SuSE-Linux 10.2.</span></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">       1) psql is exiting when "\COPY xxx TO" command is issued and
command/scriptis not found</span></tt> </span><p class="MsoNormal"><tt><span
style="font-size:10.0pt;font-family:"Arial","sans-serif"">                When popen is called in write mode it is
creatingvalid file descriptor and when it tries to write to file "Broken pipe" error is coming which is not handled.
</span></tt><spanstyle="font-size:10.0pt;font-family:"Arial","sans-serif""><br /><tt><span
style="font-family:"Arial","sans-serif"">                       psql# \copy pgbench_accounts TO PROGRAM '../compress.sh
pgbench_accounts4.txt'</span></tt><br/><tt><span style="font-family:"Arial","sans-serif"">        2) When "\copy"
commandis in progress then program/command is killed/"crashed due to any problem" </span></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">          psql is exiting.</span></tt><br /><br /><tt><b><span
style="font-family:"Arial","sans-serif"">Scriptused in testcases:</span></b></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">------------------</span></tt><br/><tt><span
style="font-family:"Arial","sans-serif"">1.compress.sh</span></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">       echo 'cat > $1' > compress.sh</span></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">       echo 'bzip2 -z $1' >> compress.sh</span></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">       chmod +x compress.sh</span></tt><br /><br /><tt><span
style="font-family:"Arial","sans-serif"">2.decompress.sh</span></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">       echo 'bzip2 -d -c -k $*' > decompress.sh</span></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">       chmod +x decompress.sh</span></tt><br /><br /></span><p
class="MsoNormal"><spanstyle="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial","sans-serif"">Testcasesexecuted are attached with this mail.</span><p
class="MsoNormal"><spanstyle="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial","sans-serif"">WithRegards,</span><p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial","sans-serif"">AmitKapila.</span><p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><pclass="MsoNormal"><span
style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span></div>

Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy

From
"Etsuro Fujita"
Date:
<div class="WordSection1"><p class="MsoNormal"><span lang="EN-US"
style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D">HiAmit,</span><p class="MsoNormal"><span
lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D"> </span><p class="MsoNormal"><span
lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D">Thank you for your review.  I’ve
rebasedand updated the patch.  </span><span lang="EN-US"
style="font-size:10.0pt;font-family:"Arial","sans-serif"">Pleasefind attached the patch.</span><span lang="EN-US"
style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D"></span><pclass="MsoNormal"><span lang="EN-US"
style="font-size:10.0pt;font-family:"Arial","sans-serif""><br/><tt><b><span
style="font-family:"Arial","sans-serif"">>Code Review comments:</span></b></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">>---------------------</span></tt><br />>     <br /><tt><span
style="font-family:"Arial","sans-serif"">>1. Modify the comment in function header of: parse_slash_copy</span></tt>
(needsto modify for new syntax)<br /><br /><tt><span style="font-family:"Arial","sans-serif""></span></tt></span><p
class="MsoNormal"><tt><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">Done.</span></tt><p
class="MsoNormal"><tt><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span></tt><p
class="MsoNormal"><tt><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">> 2. Comments for
functionsOpenPipeStream & ClosePipeStream are missing. <b>        </b></span></tt><p class="MsoNormal"><tt><span
lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span></tt><p class="MsoNormal"><tt><span
lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif"">Done.</span></tt><p class="MsoNormal"><tt><span
lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span></tt><p class="MsoNormal"><tt><span
lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif"">> 3. Any Script errors are not directly
visibleto user; If there problems in script no way to cleanup.</span></tt><span lang="EN-US"
style="font-size:10.0pt;font-family:"Arial","sans-serif""><tt><span
style="font-family:"Arial","sans-serif"">  </span></tt></span><pclass="MsoNormal"><tt><span lang="EN-US"
style="font-size:10.0pt;font-family:"Arial","sans-serif"">>   Shouldn’t this be mentioned in User
Manual.</span></tt><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D"></span><p
class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><p
class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">Done.  Please see the
documentationnote on the \copy instruction in psql-ref.sgml.</span><p class="MsoNormal"><span lang="EN-US"
style="font-size:10.0pt;font-family:"Arial","sans-serif""><br/><tt><b><span
style="font-family:"Arial","sans-serif"">>Test case issues:</span></b></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">>------------------</span></tt><br /><tt><span
style="font-family:"Arial","sans-serif"">>1. "Broken pipe" is not handled in case of psql "\copy"
command;</span></tt><br/><tt><span style="font-family:"Arial","sans-serif"">>     Issue are as
follows:</span></tt><br/><tt><span style="font-family:"Arial","sans-serif"">>         Following are verified on
SuSE-Linux10.2.</span></tt><br /><tt><span style="font-family:"Arial","sans-serif"">>         1) psql is exiting
when"\COPY xxx TO" command is issued and command/script is not found</span></tt> </span><span lang="EN-US"></span><p
class="MsoNormal"style="margin-bottom:12.0pt"><tt><span lang="EN-US"
style="font-size:10.0pt;font-family:"Arial","sans-serif"">>                When popen is called in write mode it is
creatingvalid file descriptor and when it tries to write to file "Broken pipe" error is > coming which is not
handled.</span></tt><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""><br /><tt><span
style="font-family:"Arial","sans-serif"">>                        psql# \copy pgbench_accounts TO PROGRAM
'../compress.shpgbench_accounts4.txt'</span></tt><br /><tt><span style="font-family:"Arial","sans-serif"">>        
2)When "\copy" command is in progress then program/command is killed/"crashed due to any problem" </span></tt><br
/><tt><spanstyle="font-family:"Arial","sans-serif"">>            psql is exiting.</span></tt></span><p
class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">This is a headache.  I
haveno idea how to solve this.</span><p class="MsoNormal"><span lang="EN-US"
style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><pclass="MsoNormal"><span lang="EN-US"
style="font-size:10.0pt;font-family:"Arial","sans-serif"">Sorryfor the long delay in responding.</span><p
class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><p
class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">Best regards,</span><p
class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">Etsuro
Fujita</span></div>
On Wednesday, January 23, 2013 5:36 PM Etsuro Fujita wrote:
> Hi Amit,

> Thank you for your review.  I’ve rebased and updated the patch.  Please
find attached the patch.

>> Test case issues:
>> ------------------
>> 1. "Broken pipe" is not handled in case of psql "\copy" command;
>>     Issue are as follows:
>>         Following are verified on SuSE-Linux 10.2.
>>         1) psql is exiting when "\COPY xxx TO" command is issued and
command/script is not found
>>                 When popen is called in write mode it is creating valid
file descriptor and when it tries to write to file "Broken pipe" error is >
coming which is not handled.
>>                         psql# \copy pgbench_accounts TO PROGRAM
'../compress.sh pgbench_accounts4.txt'
>>         2) When "\copy" command is in progress then program/command is
killed/"crashed due to any problem"
>>            psql is exiting.

>This is a headache.  I have no idea how to solve this.

I think we can keep it for committer to take a call on this issue.
The other changes done by you in revised patch are fine.

I have found few more minor issues as below:

1. The comment above do_copy can be modified to address the new
functionality it can handle.
/* * Execute a \copy command (frontend copy). We have to open a file, then * submit a COPY query to the backend and
eitherfeed it data from the * file or route its response into the file. */  
bool
do_copy(const char *args)


2.
@@ -256,8 +273,14 @@ do_copy(const char *args)
+        if (options->file == NULL && options->program)
+        {
+                psql_error("program is not supported to stdout/pstdout or
from stdin/pstdin\n");
+                return false;
+        }

should call free_copy_options(options); before return false;

3. \copy command doesn't need semicolon at end, however it was working
previous to your patch, but   now it is giving error.
postgres=# \copy t1 from 'e:\pg_git_code\Data\t1_Data.txt';
e:/pg_git_code/Data/t1_Data.txt';: No such file or directory
e:/pg_git_code/Data/t1_Data.txt';: No such file or directory

4. Please check if OpenPipeStream() it needs to call   if (ReleaseLruFile()),
5. Following in copy.sgml can be changed to make more meaningful as the
first line looks little adhoc.
+     <para>
+      The command that input comes from or that output goes to.
+      The command for COPY FROM, which input comes from, must write its
output
+      to standard output.  The command for COPY TO, which output goes to,
must
+      read its input from standard input.
+     </para>


6. Can we have one example of this new syntax, it can make it more
meaningful.
With Regards,
Amit Kapila.




Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy

From
"Etsuro Fujita"
Date:
Hi Amit,

Thank you for the review.

> From: Amit Kapila [mailto:amit.kapila@huawei.com]

> >> Test case issues:
> >> ------------------
> >> 1. "Broken pipe" is not handled in case of psql "\copy" command;
> >>     Issue are as follows:
> >>         Following are verified on SuSE-Linux 10.2.
> >>         1) psql is exiting when "\COPY xxx TO" command is issued and
> command/script is not found
> >>                 When popen is called in write mode it is creating valid
> file descriptor and when it tries to write to file "Broken pipe" error is >
> coming which is not handled.
> >>                         psql# \copy pgbench_accounts TO PROGRAM
> '../compress.sh pgbench_accounts4.txt'
> >>         2) When "\copy" command is in progress then program/command is
> killed/"crashed due to any problem"
> >>            psql is exiting.
>
> >This is a headache.  I have no idea how to solve this.
>
> I think we can keep it for committer to take a call on this issue.

Agreed.

> I have found few more minor issues as below:
>
> 1. The comment above do_copy can be modified to address the new
> functionality it can handle.
> /*
>  * Execute a \copy command (frontend copy). We have to open a file, then
>  * submit a COPY query to the backend and either feed it data from the
>  * file or route its response into the file.
>  */
> bool
> do_copy(const char *args)

Done.

> 2.
> @@ -256,8 +273,14 @@ do_copy(const char *args)
> +        if (options->file == NULL && options->program)
> +        {
> +                psql_error("program is not supported to stdout/pstdout or
> from stdin/pstdin\n");
> +                return false;
> +        }
>
> should call free_copy_options(options); before return false;

Good catch!  Done.

> 3. \copy command doesn't need semicolon at end, however it was working
> previous to your patch, but
>    now it is giving error.
> postgres=# \copy t1 from 'e:\pg_git_code\Data\t1_Data.txt';
> e:/pg_git_code/Data/t1_Data.txt';: No such file or directory
> e:/pg_git_code/Data/t1_Data.txt';: No such file or directory

Sorry, I've fixed the bug.

> 4. Please check if OpenPipeStream() it needs to call
>    if (ReleaseLruFile()),

OpenPipeStream() calls ReleaseLruFile() by itself if necessary.

> 5. Following in copy.sgml can be changed to make more meaningful as the
> first line looks little adhoc.
> +     <para>
> +      The command that input comes from or that output goes to.
> +      The command for COPY FROM, which input comes from, must write its
> output
> +      to standard output.  The command for COPY TO, which output goes to,
> must
> +      read its input from standard input.
> +     </para>

I've struggled to make the document more meaningful.

> 6. Can we have one example of this new syntax, it can make it more
> meaningful.

Done.

Sorry for the long delay.

Best regards,
Etsuro Fujita

> With Regards,
> Amit Kapila.
>
>


Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy

From
"Etsuro Fujita"
Date:
Hi Amit,

Thank you for your careful review!

> -----Original Message-----
> From: Amit Kapila [mailto:amit.kapila@huawei.com]
> Sent: Friday, February 22, 2013 7:18 PM
> To: 'Etsuro Fujita'; 'pgsql-hackers'
> Subject: RE: [HACKERS] Review : Add hooks for pre- and post-processor
> executables for COPY and \copy
>
> On Wednesday, February 20, 2013 5:25 PM Etsuro Fujita wrote:
> > Hi Amit,
> >
> > Thank you for the review.
>
> Etsuro-san, you are welcome.
>
> > > From: Amit Kapila [mailto:amit.kapila@huawei.com]
> >
> > > >> Test case issues:
> > > >> ------------------
> > > >> 1. "Broken pipe" is not handled in case of psql "\copy" command;
> > > >>     Issue are as follows:
> > > >>         Following are verified on SuSE-Linux 10.2.
> > > >>         1) psql is exiting when "\COPY xxx TO" command is issued
> > > >> and
> > > command/script is not found
> > > >>                 When popen is called in write mode it is creating
> > > >>valid
> > > file descriptor and when it tries to write to file "Broken pipe"
> > error
> > > is > coming which is not handled.
> > > >>                         psql# \copy pgbench_accounts TO PROGRAM
> > > '../compress.sh pgbench_accounts4.txt'
> > > >>         2) When "\copy" command is in progress then
> > program/command
> > > >> is
> > > killed/"crashed due to any problem"
> > > >>            psql is exiting.
> > >
> > > >This is a headache.  I have no idea how to solve this.
> > >
> > > I think we can keep it for committer to take a call on this issue.
> >
> > Agreed.
> >
> > > I have found few more minor issues as below:
> > >
> > > 1. The comment above do_copy can be modified to address the new
> > > functionality it can handle.
> > > /*
> > >  * Execute a \copy command (frontend copy). We have to open a file,
> > > then
> > >  * submit a COPY query to the backend and either feed it data from
> > the
> > >  * file or route its response into the file.
> > >  */
> > > bool
> > > do_copy(const char *args)
> >
> > Done.
> >
> > > 2.
> > > @@ -256,8 +273,14 @@ do_copy(const char *args)
> > > +        if (options->file == NULL && options->program)
> > > +        {
> > > +                psql_error("program is not supported to
> > > + stdout/pstdout or
> > > from stdin/pstdin\n");
> > > +                return false;
> > > +        }
> > >
> > > should call free_copy_options(options); before return false;
> >
> > Good catch!  Done.
> >
> > > 3. \copy command doesn't need semicolon at end, however it was
> > working
> > > previous to your patch, but
> > >    now it is giving error.
> > > postgres=# \copy t1 from 'e:\pg_git_code\Data\t1_Data.txt';
> > > e:/pg_git_code/Data/t1_Data.txt';: No such file or directory
> > > e:/pg_git_code/Data/t1_Data.txt';: No such file or directory
> >
> > Sorry, I've fixed the bug.
> >
> > > 4. Please check if OpenPipeStream() it needs to call
> > >    if (ReleaseLruFile()),
> >
> > OpenPipeStream() calls ReleaseLruFile() by itself if necessary.
>
> I have asked this thinking that ReleaseLruFile() may not be useful for
> OpenPipeStream,
> As I was not sure how the new file descriptors get allocated for popen.
> But now again reading popen specs, I got the point that it can be useful.
>
> > > 5. Following in copy.sgml can be changed to make more meaningful as
> > > the first line looks little adhoc.
> > > +     <para>
> > > +      The command that input comes from or that output goes to.
> > > +      The command for COPY FROM, which input comes from, must write
> > > + its
> > > output
> > > +      to standard output.  The command for COPY TO, which output
> > goes
> > > + to,
> > > must
> > > +      read its input from standard input.
> > > +     </para>
> >
> > I've struggled to make the document more meaningful.
>
> To be honest, I am not sure whether introducing pre, post processor
> terminology is right or not,
> But again I shall let committer decide about this point.

Agreed.

> > > 6. Can we have one example of this new syntax, it can make it more
> > > meaningful.
> >
> > Done.
> >
> > Sorry for the long delay.
>
> All the reported issues are handled in the new patch.
>
> I have one small another doubt that in function parse_slash_copy, you
> avoided expand tilde
> for program case, which I am not sure is the right thing or not.

Sorry, I'm not sure that, too.  I'd like to leave this for committers.

> I am marking this patch as Ready For Committer.

Thanks!

Best regards,
Etsuro Fujita

> Notes For Committer
> -----------------------
> 1. "Broken pipe" is not handled in case of psql "\copy" command;
>    This is currently documented
> 2. Documentation needs to be checked, especially with focus whether
> introducing pre, post processor terminology is
>    Okay.
> 3. In function parse_slash_copy, expand tilde is avaoided, is it okay?
>
>
> With Regards,
> Amit Kapila.




On Wednesday, February 20, 2013 5:25 PM Etsuro Fujita wrote:
> Hi Amit,
>
> Thank you for the review.

Etsuro-san, you are welcome.
> > From: Amit Kapila [mailto:amit.kapila@huawei.com]
>
> > >> Test case issues:
> > >> ------------------
> > >> 1. "Broken pipe" is not handled in case of psql "\copy" command;
> > >>     Issue are as follows:
> > >>         Following are verified on SuSE-Linux 10.2.
> > >>         1) psql is exiting when "\COPY xxx TO" command is issued
> > >> and
> > command/script is not found
> > >>                 When popen is called in write mode it is creating
> > >>valid
> > file descriptor and when it tries to write to file "Broken pipe"
> error
> > is > coming which is not handled.
> > >>                         psql# \copy pgbench_accounts TO PROGRAM
> > '../compress.sh pgbench_accounts4.txt'
> > >>         2) When "\copy" command is in progress then
> program/command
> > >> is
> > killed/"crashed due to any problem"
> > >>            psql is exiting.
> >
> > >This is a headache.  I have no idea how to solve this.
> >
> > I think we can keep it for committer to take a call on this issue.
>
> Agreed.
>
> > I have found few more minor issues as below:
> >
> > 1. The comment above do_copy can be modified to address the new
> > functionality it can handle.
> > /*
> >  * Execute a \copy command (frontend copy). We have to open a file,
> > then
> >  * submit a COPY query to the backend and either feed it data from
> the
> >  * file or route its response into the file.
> >  */
> > bool
> > do_copy(const char *args)
>
> Done.
>
> > 2.
> > @@ -256,8 +273,14 @@ do_copy(const char *args)
> > +        if (options->file == NULL && options->program)
> > +        {
> > +                psql_error("program is not supported to
> > + stdout/pstdout or
> > from stdin/pstdin\n");
> > +                return false;
> > +        }
> >
> > should call free_copy_options(options); before return false;
>
> Good catch!  Done.
>
> > 3. \copy command doesn't need semicolon at end, however it was
> working
> > previous to your patch, but
> >    now it is giving error.
> > postgres=# \copy t1 from 'e:\pg_git_code\Data\t1_Data.txt';
> > e:/pg_git_code/Data/t1_Data.txt';: No such file or directory
> > e:/pg_git_code/Data/t1_Data.txt';: No such file or directory
>
> Sorry, I've fixed the bug.
>
> > 4. Please check if OpenPipeStream() it needs to call
> >    if (ReleaseLruFile()),
>
> OpenPipeStream() calls ReleaseLruFile() by itself if necessary.

I have asked this thinking that ReleaseLruFile() may not be useful for
OpenPipeStream,
As I was not sure how the new file descriptors get allocated for popen.
But now again reading popen specs, I got the point that it can be useful.

> > 5. Following in copy.sgml can be changed to make more meaningful as
> > the first line looks little adhoc.
> > +     <para>
> > +      The command that input comes from or that output goes to.
> > +      The command for COPY FROM, which input comes from, must write
> > + its
> > output
> > +      to standard output.  The command for COPY TO, which output
> goes
> > + to,
> > must
> > +      read its input from standard input.
> > +     </para>
>
> I've struggled to make the document more meaningful.

To be honest, I am not sure whether introducing pre, post processor
terminology is right or not,
But again I shall let committer decide about this point.

> > 6. Can we have one example of this new syntax, it can make it more
> > meaningful.
>
> Done.
>
> Sorry for the long delay.

All the reported issues are handled in the new patch.

I have one small another doubt that in function parse_slash_copy, you
avoided expand tilde
for program case, which I am not sure is the right thing or not.


I am marking this patch as Ready For Committer.


Notes For Committer
-----------------------
1. "Broken pipe" is not handled in case of psql "\copy" command;  This is currently documented
2. Documentation needs to be checked, especially with focus whether
introducing pre, post processor terminology is  Okay.
3. In function parse_slash_copy, expand tilde is avaoided, is it okay?


With Regards,
Amit Kapila.




Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy

From
Heikki Linnakangas
Date:
On 22.02.2013 12:43, Etsuro Fujita wrote:
>>>>>> 1. "Broken pipe" is not handled in case of psql "\copy" command;
>>>>>>      Issue are as follows:
>>>>>>          Following are verified on SuSE-Linux 10.2.
>>>>>>          1) psql is exiting when "\COPY xxx TO" command is issued
>>>>>> and
>>>> command/script is not found
>>>>>>                   When popen is called in write mode it is creating
>>>>>> valid
>>>> file descriptor and when it tries to write to file "Broken pipe"
>>> error
>>>> is>  coming which is not handled.
>>>>>>                          psql# \copy pgbench_accounts TO PROGRAM
>>>> '../compress.sh pgbench_accounts4.txt'
>>>>>>          2) When "\copy" command is in progress then
>>> program/command
>>>>>> is
>>>> killed/"crashed due to any problem"
>>>>>>             psql is exiting.
>>>>
>>>>> This is a headache.  I have no idea how to solve this.
>>>>
>>>> I think we can keep it for committer to take a call on this issue.
>>>
>>> Agreed.

Fortunately this one is easy. We just need to ignore SIGPIPE, by calling
pqsignal(SIGPIPE, SIG_IGN) before popen(). We already do the same in
psql when the query output is piped to a pager, in PageOutput.

>>>> 5. Following in copy.sgml can be changed to make more meaningful as
>>>> the first line looks little adhoc.
>>>> +<para>
>>>> +      The command that input comes from or that output goes to.
>>>> +      The command for COPY FROM, which input comes from, must write
>>>> + its
>>>> output
>>>> +      to standard output.  The command for COPY TO, which output
>>> goes
>>>> + to,
>>>> must
>>>> +      read its input from standard input.
>>>> +</para>
>>>
>>> I've struggled to make the document more meaningful.
>>
>> To be honest, I am not sure whether introducing pre, post processor
>> terminology is right or not,
>> But again I shall let committer decide about this point.
>
> Agreed.

I changed the terminology to use terms like "the command specified with
PROGRAM", instead of talking about pre- and post-processsors.

>> I have one small another doubt that in function parse_slash_copy, you
>> avoided expand tilde
>> for program case, which I am not sure is the right thing or not.
>
> Sorry, I'm not sure that, too.  I'd like to leave this for committers.

That seems correct. The shell will do tilde expansion with popen(), we
don't need to do it ourselves.

There's one oddity in the psql \copy syntax. This is actually present in
earlier versions too, but I think we should fix it now that we extend
the syntax:

  -- Writes to standard output. There's no way to write to a file called
"stdout".
  \copy foo to 'stdout'

I think we should change the interpretation of the above so that it
writes to a file called "stdout". It's possible that there's an
application out there that relies on that to write to stdout, but it's
not hard to fix the application. A small note in the release notes might
be in order.

Also, I think we should require the command to be quoted in \copy. Ie.
let's forbid this:

\copy pgbench_accounts to program /bin/cat>/dev/null

and require that it's written as:

\copy pgbench_accounts to program '/bin/cat>/dev/null'

We don't require quoting for filenames, which I find a bit weird too,
but it seems particularly confusing for a shell command.

Attached is a new version of this patch that I almost committed, but one
thing caught my eye at the last minute: The error reporting is not very
user-friendly. If the program fails after reading/writing some rows, the
reason is printed to the log, but not the user:

postgres=# copy foo to program '/tmp/midway-crash';
ERROR:  could not execute command "/tmp/midway-crash"

the log has more details:

LOG:  child process exited with exit code 10
STATEMENT:  copy foo to program '/tmp/midway-crash';
ERROR:  could not execute command "/tmp/midway-crash"
STATEMENT:  copy foo to program '/tmp/midway-crash';

I think we should arrange for the "child process exited with exit code
10" to be printed as errdetail to the client. Similarly, with psql \copy
command, the "child process exited with exit code 10" command shouldn't
be printed straight to stderr, but should go through psql_error.

I'll try to refactor pclose_check tomorrow to do that. Meanwhile, please
take a look at the attached if you have the time. I rewrote much of the
docs changes, and some comments.

- Heikki

Attachment
On Wednesday, February 27, 2013 12:41 AM Heikki Linnakangas wrote:
> On 22.02.2013 12:43, Etsuro Fujita wrote:
> >>>>>> 1. "Broken pipe" is not handled in case of psql "\copy" command;
> >>>>>>      Issue are as follows:
> >>>>>>          Following are verified on SuSE-Linux 10.2.
> >>>>>>          1) psql is exiting when "\COPY xxx TO" command is
> issued
> >>>>>> and
> >>>> command/script is not found
> >>>>>>                   When popen is called in write mode it is
> >>>>>> creating valid
> >>>> file descriptor and when it tries to write to file "Broken pipe"
> >>> error
> >>>> is>  coming which is not handled.
> >>>>>>                          psql# \copy pgbench_accounts TO PROGRAM
> >>>> '../compress.sh pgbench_accounts4.txt'
> >>>>>>          2) When "\copy" command is in progress then
> >>> program/command
> >>>>>> is
> >>>> killed/"crashed due to any problem"
> >>>>>>             psql is exiting.
> >>>>
> >>>>> This is a headache.  I have no idea how to solve this.
> >>>>
> >>>> I think we can keep it for committer to take a call on this issue.
> >>>
> >>> Agreed.
> 
> Fortunately this one is easy. We just need to ignore SIGPIPE, by
> calling pqsignal(SIGPIPE, SIG_IGN) before popen(). We already do the
> same in psql when the query output is piped to a pager, in PageOutput.

So are you planning to call the same in do_copy as well; in current patch
it is not there.

> >>>> 5. Following in copy.sgml can be changed to make more meaningful
> as
> >>>> the first line looks little adhoc.
> >>>> +<para>
> >>>> +      The command that input comes from or that output goes to.
> >>>> +      The command for COPY FROM, which input comes from, must
> >>>> +write  its
> >>>> output
> >>>> +      to standard output.  The command for COPY TO, which output
> >>> goes
> >>>> + to,
> >>>> must
> >>>> +      read its input from standard input.
> >>>> +</para>
> >>>
> >>> I've struggled to make the document more meaningful.
> >>
> >> To be honest, I am not sure whether introducing pre, post processor
> >> terminology is right or not, But again I shall let committer decide
> >> about this point.
> >
> > Agreed.
> 
> I changed the terminology to use terms like "the command specified with
> PROGRAM", instead of talking about pre- and post-processsors.
> 
> >> I have one small another doubt that in function parse_slash_copy,
> you
> >> avoided expand tilde for program case, which I am not sure is the
> >> right thing or not.
> >
> > Sorry, I'm not sure that, too.  I'd like to leave this for
> committers.
> 
> That seems correct. The shell will do tilde expansion with popen(), we
> don't need to do it ourselves.
> 
> There's one oddity in the psql \copy syntax. This is actually present
> in earlier versions too, but I think we should fix it now that we
> extend the syntax:
> 
>   -- Writes to standard output. There's no way to write to a file
> called "stdout".
>   \copy foo to 'stdout'
> 
> I think we should change the interpretation of the above so that it
> writes to a file called "stdout". It's possible that there's an
> application out there that relies on that to write to stdout, but it's
> not hard to fix the application. A small note in the release notes
> might be in order.
> 
> Also, I think we should require the command to be quoted in \copy. Ie.
> let's forbid this:
> 
> \copy pgbench_accounts to program /bin/cat>/dev/null
> 
> and require that it's written as:
> 
> \copy pgbench_accounts to program '/bin/cat>/dev/null'
> 
> We don't require quoting for filenames, which I find a bit weird too,
> but it seems particularly confusing for a shell command.
> 
> Attached is a new version of this patch that I almost committed, but
> one thing caught my eye at the last minute: The error reporting is not
> very user-friendly. If the program fails after reading/writing some
> rows, the reason is printed to the log, but not the user:
> 
> postgres=# copy foo to program '/tmp/midway-crash';
> ERROR:  could not execute command "/tmp/midway-crash"
> 
> the log has more details:
> 
> LOG:  child process exited with exit code 10
> STATEMENT:  copy foo to program '/tmp/midway-crash';
> ERROR:  could not execute command "/tmp/midway-crash"
> STATEMENT:  copy foo to program '/tmp/midway-crash';
> 
> I think we should arrange for the "child process exited with exit code
> 10" to be printed as errdetail to the client. Similarly, with psql
> \copy command, the "child process exited with exit code 10" command
> shouldn't be printed straight to stderr, but should go through
> psql_error.
> 
> I'll try to refactor pclose_check tomorrow to do that. Meanwhile,
> please take a look at the attached if you have the time. I rewrote much
> of the docs changes, and some comments.

If there is semicolon at end, it takes it into account for creating
filename.
\copy foo to c:\data\foo.out;
This problem was fixed in previous patch, but I think handling of quotes has
changed this behavior.

With Regards,
Amit Kapila.




Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy

From
"Etsuro Fujita"
Date:
Hi,

Thank you for the work!

> From: Amit Kapila [mailto:amit.kapila@huawei.com]

> On Wednesday, February 27, 2013 12:41 AM Heikki Linnakangas wrote:

> > There's one oddity in the psql \copy syntax. This is actually present
> > in earlier versions too, but I think we should fix it now that we
> > extend the syntax:
> >
> >   -- Writes to standard output. There's no way to write to a file
> > called "stdout".
> >   \copy foo to 'stdout'
> >
> > I think we should change the interpretation of the above so that it
> > writes to a file called "stdout". It's possible that there's an
> > application out there that relies on that to write to stdout, but it's
> > not hard to fix the application. A small note in the release notes
> > might be in order.
> >
> > Also, I think we should require the command to be quoted in \copy. Ie.
> > let's forbid this:
> >
> > \copy pgbench_accounts to program /bin/cat>/dev/null
> >
> > and require that it's written as:
> >
> > \copy pgbench_accounts to program '/bin/cat>/dev/null'
> >
> > We don't require quoting for filenames, which I find a bit weird too,
> > but it seems particularly confusing for a shell command.

Agreed.  ISTM that the comment on parse_slash_copy() needs to be changed:
 * table name can be double-quoted and can have a schema part. * column names can be double-quoted. * filename can be
single-quotedlike SQL literals. * command can be single-quoted like SQL literals.
 

The last line must be:
 * command must be single-quoted like SQL literals.

> > Attached is a new version of this patch that I almost committed, but
> > one thing caught my eye at the last minute: The error reporting is not
> > very user-friendly. If the program fails after reading/writing some
> > rows, the reason is printed to the log, but not the user:
> >
> > postgres=# copy foo to program '/tmp/midway-crash';
> > ERROR:  could not execute command "/tmp/midway-crash"
> >
> > the log has more details:
> >
> > LOG:  child process exited with exit code 10
> > STATEMENT:  copy foo to program '/tmp/midway-crash';
> > ERROR:  could not execute command "/tmp/midway-crash"
> > STATEMENT:  copy foo to program '/tmp/midway-crash';
> >
> > I think we should arrange for the "child process exited with exit code
> > 10" to be printed as errdetail to the client. Similarly, with psql
> > \copy command, the "child process exited with exit code 10" command
> > shouldn't be printed straight to stderr, but should go through
> > psql_error.
> >
> > I'll try to refactor pclose_check tomorrow to do that. Meanwhile,
> > please take a look at the attached if you have the time. I rewrote much
> > of the docs changes, and some comments.

Looks fine to me.

> If there is semicolon at end, it takes it into account for creating
> filename.
> \copy foo to c:\data\foo.out;
> This problem was fixed in previous patch, but I think handling of quotes has
> changed this behavior.

ISTM that the following part at parse_slash_copy() needs to be changed:
   /* { 'filename' | PROGRAM 'command' | STDIN | STDOUT | PSTDIN | PSTDOUT } */   token = strtokx(NULL, whitespace,
NULL,"'",                   0, false, false, pset.encoding);   if (!token)       goto error;
 

strtokx() in the above should be called in the following way:
   token = strtokx(NULL, whitespace, ";", "'",                   0, false, false, pset.encoding);

Thanks,

Best regards,
Etsuro Fujita




Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy

From
Heikki Linnakangas
Date:
On 27.02.2013 08:32, Amit Kapila wrote:
> If there is semicolon at end, it takes it into account for creating
> filename.
> \copy foo to c:\data\foo.out;
> This problem was fixed in previous patch, but I think handling of quotes has
> changed this behavior.

This is existing behavior, that creates a file called "foo.out;" on 
previous psql versions as well. I agree it's almost certainly not what 
the user intended, but that's a separate patch.

- Heikki



Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy

From
Heikki Linnakangas
Date:
On 27.02.2013 11:38, Etsuro Fujita wrote:
> Agreed.  ISTM that the comment on parse_slash_copy() needs to be changed:
>
>    * table name can be double-quoted and can have a schema part.
>    * column names can be double-quoted.
>    * filename can be single-quoted like SQL literals.
>    * command can be single-quoted like SQL literals.
>
> The last line must be:
>
>    * command must be single-quoted like SQL literals.

Fixed that.

>>> Attached is a new version of this patch that I almost committed, but
>>> one thing caught my eye at the last minute: The error reporting is not
>>> very user-friendly. If the program fails after reading/writing some
>>> rows, the reason is printed to the log, but not the user:
>>>
>>> postgres=# copy foo to program '/tmp/midway-crash';
>>> ERROR:  could not execute command "/tmp/midway-crash"
>>>
>>> the log has more details:
>>>
>>> LOG:  child process exited with exit code 10
>>> STATEMENT:  copy foo to program '/tmp/midway-crash';
>>> ERROR:  could not execute command "/tmp/midway-crash"
>>> STATEMENT:  copy foo to program '/tmp/midway-crash';
>>>
>>> I think we should arrange for the "child process exited with exit code
>>> 10" to be printed as errdetail to the client. Similarly, with psql
>>> \copy command, the "child process exited with exit code 10" command
>>> shouldn't be printed straight to stderr, but should go through
>>> psql_error.
>>>
>>> I'll try to refactor pclose_check tomorrow to do that. Meanwhile,
>>> please take a look at the attached if you have the time. I rewrote much
>>> of the docs changes, and some comments.
>
> Looks fine to me.

Ok, committed with the changes to the error handling. I refactored the 
logic to construct a human-readable string from pclose_check() to a new 
function, and used that.

Thanks for the patch, and thanks Amit for the review!

- Heikki