Thread: Review : Add hooks for pre- and post-processor executables for COPY and \copy
Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
Amit Kapila
Date:
<div class="WordSection1"><p class="MsoNormal"><tt><b><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">Basicstuff:</span></b></tt><span style="font-size:10.0pt;font-family:"Arial","sans-serif""><br/><tt><span style="font-family:"Arial","sans-serif"">------------</span></tt><br/><tt><span style="font-family:"Arial","sans-serif""> - Rebase of Patch is required.</span></tt><br /><tt><span style="font-family:"Arial","sans-serif""> - Compiles cleanly without any errors/warnings</span></tt><br /><tt><spanstyle="font-family:"Arial","sans-serif""> - Regression tests pass.</span></tt><br /><br /><tt><b><spanstyle="font-family:"Arial","sans-serif"">What it does:</span></b></tt><br /><tt><span style="font-family:"Arial","sans-serif"">---------------------</span></tt><br/><tt><span style="font-family:"Arial","sans-serif""> This patch is useful when COPY command input/output are stored in compressionformat or in any command/script uses these output/input in any means; without generating intermediate temporaryfiles.</span></tt><br /><tt><span style="font-family:"Arial","sans-serif""> This feature can be used in serverside using "COPY statement" by administrator. Or can be used in psql internal "\copy" command by any user.</span></tt><br/><br /><br /><tt><b><span style="font-family:"Arial","sans-serif"">Code Review comments:</span></b></tt><br/><tt><span style="font-family:"Arial","sans-serif"">---------------------</span></tt><br /> <br/><tt><span style="font-family:"Arial","sans-serif"">1. Modify the comment in function header of: parse_slash_copy</span></tt>(needs to modify for new syntax)<br /><tt><span style="font-family:"Arial","sans-serif"">2. Commentsfor functions OpenPipeStream & ClosePipeStream are missing. <b> </b></span></tt></span><p class="MsoNormal"><tt><spanstyle="font-size:10.0pt;font-family:"Arial","sans-serif"">3. Any Script errors are not directlyvisible to user; If there problems in script no way to cleanup.</span></tt><span style="font-size:10.0pt;font-family:"Arial","sans-serif""><tt><span style="font-family:"Arial","sans-serif""> </span></tt></span><pclass="MsoNormal"><tt><span style="font-size:10.0pt;font-family:"Arial","sans-serif""> Shouldn’t this be mentioned in User Manual.</span></tt><spanstyle="font-size:10.0pt;font-family:"Arial","sans-serif""><br /><br /><tt><b><span style="font-family:"Arial","sans-serif"">Testcase issues:</span></b></tt><br /><tt><span style="font-family:"Arial","sans-serif"">------------------</span></tt><br/><tt><span style="font-family:"Arial","sans-serif"">1."Broken pipe" is not handled in case of psql "\copy" command;</span></tt><br /><tt><spanstyle="font-family:"Arial","sans-serif""> Issue are as follows:</span></tt><br /><tt><span style="font-family:"Arial","sans-serif""> Following are verified on SuSE-Linux 10.2.</span></tt><br /><tt><span style="font-family:"Arial","sans-serif""> 1) psql is exiting when "\COPY xxx TO" command is issued and command/scriptis not found</span></tt> </span><p class="MsoNormal"><tt><span style="font-size:10.0pt;font-family:"Arial","sans-serif""> When popen is called in write mode it is creatingvalid file descriptor and when it tries to write to file "Broken pipe" error is coming which is not handled. </span></tt><spanstyle="font-size:10.0pt;font-family:"Arial","sans-serif""><br /><tt><span style="font-family:"Arial","sans-serif""> psql# \copy pgbench_accounts TO PROGRAM '../compress.sh pgbench_accounts4.txt'</span></tt><br/><tt><span style="font-family:"Arial","sans-serif""> 2) When "\copy" commandis in progress then program/command is killed/"crashed due to any problem" </span></tt><br /><tt><span style="font-family:"Arial","sans-serif""> psql is exiting.</span></tt><br /><br /><tt><b><span style="font-family:"Arial","sans-serif"">Scriptused in testcases:</span></b></tt><br /><tt><span style="font-family:"Arial","sans-serif"">------------------</span></tt><br/><tt><span style="font-family:"Arial","sans-serif"">1.compress.sh</span></tt><br /><tt><span style="font-family:"Arial","sans-serif""> echo 'cat > $1' > compress.sh</span></tt><br /><tt><span style="font-family:"Arial","sans-serif""> echo 'bzip2 -z $1' >> compress.sh</span></tt><br /><tt><span style="font-family:"Arial","sans-serif""> chmod +x compress.sh</span></tt><br /><br /><tt><span style="font-family:"Arial","sans-serif"">2.decompress.sh</span></tt><br /><tt><span style="font-family:"Arial","sans-serif""> echo 'bzip2 -d -c -k $*' > decompress.sh</span></tt><br /><tt><span style="font-family:"Arial","sans-serif""> chmod +x decompress.sh</span></tt><br /><br /></span><p class="MsoNormal"><spanstyle="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">Testcasesexecuted are attached with this mail.</span><p class="MsoNormal"><spanstyle="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">WithRegards,</span><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">AmitKapila.</span><p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><pclass="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span></div>
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
"Etsuro Fujita"
Date:
<div class="WordSection1"><p class="MsoNormal"><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D">HiAmit,</span><p class="MsoNormal"><span lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D"> </span><p class="MsoNormal"><span lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D">Thank you for your review. I’ve rebasedand updated the patch. </span><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">Pleasefind attached the patch.</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D"></span><pclass="MsoNormal"><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""><br/><tt><b><span style="font-family:"Arial","sans-serif"">>Code Review comments:</span></b></tt><br /><tt><span style="font-family:"Arial","sans-serif"">>---------------------</span></tt><br />> <br /><tt><span style="font-family:"Arial","sans-serif"">>1. Modify the comment in function header of: parse_slash_copy</span></tt> (needsto modify for new syntax)<br /><br /><tt><span style="font-family:"Arial","sans-serif""></span></tt></span><p class="MsoNormal"><tt><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">Done.</span></tt><p class="MsoNormal"><tt><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span></tt><p class="MsoNormal"><tt><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">> 2. Comments for functionsOpenPipeStream & ClosePipeStream are missing. <b> </b></span></tt><p class="MsoNormal"><tt><span lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span></tt><p class="MsoNormal"><tt><span lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif"">Done.</span></tt><p class="MsoNormal"><tt><span lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span></tt><p class="MsoNormal"><tt><span lang="EN-US"style="font-size:10.0pt;font-family:"Arial","sans-serif"">> 3. Any Script errors are not directly visibleto user; If there problems in script no way to cleanup.</span></tt><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""><tt><span style="font-family:"Arial","sans-serif""> </span></tt></span><pclass="MsoNormal"><tt><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">> Shouldn’t this be mentioned in User Manual.</span></tt><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif";color:#1F497D"></span><p class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><p class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">Done. Please see the documentationnote on the \copy instruction in psql-ref.sgml.</span><p class="MsoNormal"><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""><br/><tt><b><span style="font-family:"Arial","sans-serif"">>Test case issues:</span></b></tt><br /><tt><span style="font-family:"Arial","sans-serif"">>------------------</span></tt><br /><tt><span style="font-family:"Arial","sans-serif"">>1. "Broken pipe" is not handled in case of psql "\copy" command;</span></tt><br/><tt><span style="font-family:"Arial","sans-serif"">> Issue are as follows:</span></tt><br/><tt><span style="font-family:"Arial","sans-serif"">> Following are verified on SuSE-Linux10.2.</span></tt><br /><tt><span style="font-family:"Arial","sans-serif"">> 1) psql is exiting when"\COPY xxx TO" command is issued and command/script is not found</span></tt> </span><span lang="EN-US"></span><p class="MsoNormal"style="margin-bottom:12.0pt"><tt><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">> When popen is called in write mode it is creatingvalid file descriptor and when it tries to write to file "Broken pipe" error is > coming which is not handled.</span></tt><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""><br /><tt><span style="font-family:"Arial","sans-serif"">> psql# \copy pgbench_accounts TO PROGRAM '../compress.shpgbench_accounts4.txt'</span></tt><br /><tt><span style="font-family:"Arial","sans-serif"">> 2)When "\copy" command is in progress then program/command is killed/"crashed due to any problem" </span></tt><br /><tt><spanstyle="font-family:"Arial","sans-serif"">> psql is exiting.</span></tt></span><p class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">This is a headache. I haveno idea how to solve this.</span><p class="MsoNormal"><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><pclass="MsoNormal"><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">Sorryfor the long delay in responding.</span><p class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><p class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">Best regards,</span><p class="MsoNormal"><spanlang="EN-US" style="font-size:10.0pt;font-family:"Arial","sans-serif"">Etsuro Fujita</span></div>
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
Amit Kapila
Date:
On Wednesday, January 23, 2013 5:36 PM Etsuro Fujita wrote: > Hi Amit, > Thank you for your review. Ive rebased and updated the patch. Please find attached the patch. >> Test case issues: >> ------------------ >> 1. "Broken pipe" is not handled in case of psql "\copy" command; >> Issue are as follows: >> Following are verified on SuSE-Linux 10.2. >> 1) psql is exiting when "\COPY xxx TO" command is issued and command/script is not found >> When popen is called in write mode it is creating valid file descriptor and when it tries to write to file "Broken pipe" error is > coming which is not handled. >> psql# \copy pgbench_accounts TO PROGRAM '../compress.sh pgbench_accounts4.txt' >> 2) When "\copy" command is in progress then program/command is killed/"crashed due to any problem" >> psql is exiting. >This is a headache. I have no idea how to solve this. I think we can keep it for committer to take a call on this issue. The other changes done by you in revised patch are fine. I have found few more minor issues as below: 1. The comment above do_copy can be modified to address the new functionality it can handle. /* * Execute a \copy command (frontend copy). We have to open a file, then * submit a COPY query to the backend and eitherfeed it data from the * file or route its response into the file. */ bool do_copy(const char *args) 2. @@ -256,8 +273,14 @@ do_copy(const char *args) + if (options->file == NULL && options->program) + { + psql_error("program is not supported to stdout/pstdout or from stdin/pstdin\n"); + return false; + } should call free_copy_options(options); before return false; 3. \copy command doesn't need semicolon at end, however it was working previous to your patch, but now it is giving error. postgres=# \copy t1 from 'e:\pg_git_code\Data\t1_Data.txt'; e:/pg_git_code/Data/t1_Data.txt';: No such file or directory e:/pg_git_code/Data/t1_Data.txt';: No such file or directory 4. Please check if OpenPipeStream() it needs to call if (ReleaseLruFile()), 5. Following in copy.sgml can be changed to make more meaningful as the first line looks little adhoc. + <para> + The command that input comes from or that output goes to. + The command for COPY FROM, which input comes from, must write its output + to standard output. The command for COPY TO, which output goes to, must + read its input from standard input. + </para> 6. Can we have one example of this new syntax, it can make it more meaningful. With Regards, Amit Kapila.
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
"Etsuro Fujita"
Date:
Hi Amit, Thank you for the review. > From: Amit Kapila [mailto:amit.kapila@huawei.com] > >> Test case issues: > >> ------------------ > >> 1. "Broken pipe" is not handled in case of psql "\copy" command; > >> Issue are as follows: > >> Following are verified on SuSE-Linux 10.2. > >> 1) psql is exiting when "\COPY xxx TO" command is issued and > command/script is not found > >> When popen is called in write mode it is creating valid > file descriptor and when it tries to write to file "Broken pipe" error is > > coming which is not handled. > >> psql# \copy pgbench_accounts TO PROGRAM > '../compress.sh pgbench_accounts4.txt' > >> 2) When "\copy" command is in progress then program/command is > killed/"crashed due to any problem" > >> psql is exiting. > > >This is a headache. I have no idea how to solve this. > > I think we can keep it for committer to take a call on this issue. Agreed. > I have found few more minor issues as below: > > 1. The comment above do_copy can be modified to address the new > functionality it can handle. > /* > * Execute a \copy command (frontend copy). We have to open a file, then > * submit a COPY query to the backend and either feed it data from the > * file or route its response into the file. > */ > bool > do_copy(const char *args) Done. > 2. > @@ -256,8 +273,14 @@ do_copy(const char *args) > + if (options->file == NULL && options->program) > + { > + psql_error("program is not supported to stdout/pstdout or > from stdin/pstdin\n"); > + return false; > + } > > should call free_copy_options(options); before return false; Good catch! Done. > 3. \copy command doesn't need semicolon at end, however it was working > previous to your patch, but > now it is giving error. > postgres=# \copy t1 from 'e:\pg_git_code\Data\t1_Data.txt'; > e:/pg_git_code/Data/t1_Data.txt';: No such file or directory > e:/pg_git_code/Data/t1_Data.txt';: No such file or directory Sorry, I've fixed the bug. > 4. Please check if OpenPipeStream() it needs to call > if (ReleaseLruFile()), OpenPipeStream() calls ReleaseLruFile() by itself if necessary. > 5. Following in copy.sgml can be changed to make more meaningful as the > first line looks little adhoc. > + <para> > + The command that input comes from or that output goes to. > + The command for COPY FROM, which input comes from, must write its > output > + to standard output. The command for COPY TO, which output goes to, > must > + read its input from standard input. > + </para> I've struggled to make the document more meaningful. > 6. Can we have one example of this new syntax, it can make it more > meaningful. Done. Sorry for the long delay. Best regards, Etsuro Fujita > With Regards, > Amit Kapila. > >
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
"Etsuro Fujita"
Date:
Hi Amit, Thank you for your careful review! > -----Original Message----- > From: Amit Kapila [mailto:amit.kapila@huawei.com] > Sent: Friday, February 22, 2013 7:18 PM > To: 'Etsuro Fujita'; 'pgsql-hackers' > Subject: RE: [HACKERS] Review : Add hooks for pre- and post-processor > executables for COPY and \copy > > On Wednesday, February 20, 2013 5:25 PM Etsuro Fujita wrote: > > Hi Amit, > > > > Thank you for the review. > > Etsuro-san, you are welcome. > > > > From: Amit Kapila [mailto:amit.kapila@huawei.com] > > > > > >> Test case issues: > > > >> ------------------ > > > >> 1. "Broken pipe" is not handled in case of psql "\copy" command; > > > >> Issue are as follows: > > > >> Following are verified on SuSE-Linux 10.2. > > > >> 1) psql is exiting when "\COPY xxx TO" command is issued > > > >> and > > > command/script is not found > > > >> When popen is called in write mode it is creating > > > >>valid > > > file descriptor and when it tries to write to file "Broken pipe" > > error > > > is > coming which is not handled. > > > >> psql# \copy pgbench_accounts TO PROGRAM > > > '../compress.sh pgbench_accounts4.txt' > > > >> 2) When "\copy" command is in progress then > > program/command > > > >> is > > > killed/"crashed due to any problem" > > > >> psql is exiting. > > > > > > >This is a headache. I have no idea how to solve this. > > > > > > I think we can keep it for committer to take a call on this issue. > > > > Agreed. > > > > > I have found few more minor issues as below: > > > > > > 1. The comment above do_copy can be modified to address the new > > > functionality it can handle. > > > /* > > > * Execute a \copy command (frontend copy). We have to open a file, > > > then > > > * submit a COPY query to the backend and either feed it data from > > the > > > * file or route its response into the file. > > > */ > > > bool > > > do_copy(const char *args) > > > > Done. > > > > > 2. > > > @@ -256,8 +273,14 @@ do_copy(const char *args) > > > + if (options->file == NULL && options->program) > > > + { > > > + psql_error("program is not supported to > > > + stdout/pstdout or > > > from stdin/pstdin\n"); > > > + return false; > > > + } > > > > > > should call free_copy_options(options); before return false; > > > > Good catch! Done. > > > > > 3. \copy command doesn't need semicolon at end, however it was > > working > > > previous to your patch, but > > > now it is giving error. > > > postgres=# \copy t1 from 'e:\pg_git_code\Data\t1_Data.txt'; > > > e:/pg_git_code/Data/t1_Data.txt';: No such file or directory > > > e:/pg_git_code/Data/t1_Data.txt';: No such file or directory > > > > Sorry, I've fixed the bug. > > > > > 4. Please check if OpenPipeStream() it needs to call > > > if (ReleaseLruFile()), > > > > OpenPipeStream() calls ReleaseLruFile() by itself if necessary. > > I have asked this thinking that ReleaseLruFile() may not be useful for > OpenPipeStream, > As I was not sure how the new file descriptors get allocated for popen. > But now again reading popen specs, I got the point that it can be useful. > > > > 5. Following in copy.sgml can be changed to make more meaningful as > > > the first line looks little adhoc. > > > + <para> > > > + The command that input comes from or that output goes to. > > > + The command for COPY FROM, which input comes from, must write > > > + its > > > output > > > + to standard output. The command for COPY TO, which output > > goes > > > + to, > > > must > > > + read its input from standard input. > > > + </para> > > > > I've struggled to make the document more meaningful. > > To be honest, I am not sure whether introducing pre, post processor > terminology is right or not, > But again I shall let committer decide about this point. Agreed. > > > 6. Can we have one example of this new syntax, it can make it more > > > meaningful. > > > > Done. > > > > Sorry for the long delay. > > All the reported issues are handled in the new patch. > > I have one small another doubt that in function parse_slash_copy, you > avoided expand tilde > for program case, which I am not sure is the right thing or not. Sorry, I'm not sure that, too. I'd like to leave this for committers. > I am marking this patch as Ready For Committer. Thanks! Best regards, Etsuro Fujita > Notes For Committer > ----------------------- > 1. "Broken pipe" is not handled in case of psql "\copy" command; > This is currently documented > 2. Documentation needs to be checked, especially with focus whether > introducing pre, post processor terminology is > Okay. > 3. In function parse_slash_copy, expand tilde is avaoided, is it okay? > > > With Regards, > Amit Kapila.
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
Amit Kapila
Date:
On Wednesday, February 20, 2013 5:25 PM Etsuro Fujita wrote: > Hi Amit, > > Thank you for the review. Etsuro-san, you are welcome. > > From: Amit Kapila [mailto:amit.kapila@huawei.com] > > > >> Test case issues: > > >> ------------------ > > >> 1. "Broken pipe" is not handled in case of psql "\copy" command; > > >> Issue are as follows: > > >> Following are verified on SuSE-Linux 10.2. > > >> 1) psql is exiting when "\COPY xxx TO" command is issued > > >> and > > command/script is not found > > >> When popen is called in write mode it is creating > > >>valid > > file descriptor and when it tries to write to file "Broken pipe" > error > > is > coming which is not handled. > > >> psql# \copy pgbench_accounts TO PROGRAM > > '../compress.sh pgbench_accounts4.txt' > > >> 2) When "\copy" command is in progress then > program/command > > >> is > > killed/"crashed due to any problem" > > >> psql is exiting. > > > > >This is a headache. I have no idea how to solve this. > > > > I think we can keep it for committer to take a call on this issue. > > Agreed. > > > I have found few more minor issues as below: > > > > 1. The comment above do_copy can be modified to address the new > > functionality it can handle. > > /* > > * Execute a \copy command (frontend copy). We have to open a file, > > then > > * submit a COPY query to the backend and either feed it data from > the > > * file or route its response into the file. > > */ > > bool > > do_copy(const char *args) > > Done. > > > 2. > > @@ -256,8 +273,14 @@ do_copy(const char *args) > > + if (options->file == NULL && options->program) > > + { > > + psql_error("program is not supported to > > + stdout/pstdout or > > from stdin/pstdin\n"); > > + return false; > > + } > > > > should call free_copy_options(options); before return false; > > Good catch! Done. > > > 3. \copy command doesn't need semicolon at end, however it was > working > > previous to your patch, but > > now it is giving error. > > postgres=# \copy t1 from 'e:\pg_git_code\Data\t1_Data.txt'; > > e:/pg_git_code/Data/t1_Data.txt';: No such file or directory > > e:/pg_git_code/Data/t1_Data.txt';: No such file or directory > > Sorry, I've fixed the bug. > > > 4. Please check if OpenPipeStream() it needs to call > > if (ReleaseLruFile()), > > OpenPipeStream() calls ReleaseLruFile() by itself if necessary. I have asked this thinking that ReleaseLruFile() may not be useful for OpenPipeStream, As I was not sure how the new file descriptors get allocated for popen. But now again reading popen specs, I got the point that it can be useful. > > 5. Following in copy.sgml can be changed to make more meaningful as > > the first line looks little adhoc. > > + <para> > > + The command that input comes from or that output goes to. > > + The command for COPY FROM, which input comes from, must write > > + its > > output > > + to standard output. The command for COPY TO, which output > goes > > + to, > > must > > + read its input from standard input. > > + </para> > > I've struggled to make the document more meaningful. To be honest, I am not sure whether introducing pre, post processor terminology is right or not, But again I shall let committer decide about this point. > > 6. Can we have one example of this new syntax, it can make it more > > meaningful. > > Done. > > Sorry for the long delay. All the reported issues are handled in the new patch. I have one small another doubt that in function parse_slash_copy, you avoided expand tilde for program case, which I am not sure is the right thing or not. I am marking this patch as Ready For Committer. Notes For Committer ----------------------- 1. "Broken pipe" is not handled in case of psql "\copy" command; This is currently documented 2. Documentation needs to be checked, especially with focus whether introducing pre, post processor terminology is Okay. 3. In function parse_slash_copy, expand tilde is avaoided, is it okay? With Regards, Amit Kapila.
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
Heikki Linnakangas
Date:
On 22.02.2013 12:43, Etsuro Fujita wrote: >>>>>> 1. "Broken pipe" is not handled in case of psql "\copy" command; >>>>>> Issue are as follows: >>>>>> Following are verified on SuSE-Linux 10.2. >>>>>> 1) psql is exiting when "\COPY xxx TO" command is issued >>>>>> and >>>> command/script is not found >>>>>> When popen is called in write mode it is creating >>>>>> valid >>>> file descriptor and when it tries to write to file "Broken pipe" >>> error >>>> is> coming which is not handled. >>>>>> psql# \copy pgbench_accounts TO PROGRAM >>>> '../compress.sh pgbench_accounts4.txt' >>>>>> 2) When "\copy" command is in progress then >>> program/command >>>>>> is >>>> killed/"crashed due to any problem" >>>>>> psql is exiting. >>>> >>>>> This is a headache. I have no idea how to solve this. >>>> >>>> I think we can keep it for committer to take a call on this issue. >>> >>> Agreed. Fortunately this one is easy. We just need to ignore SIGPIPE, by calling pqsignal(SIGPIPE, SIG_IGN) before popen(). We already do the same in psql when the query output is piped to a pager, in PageOutput. >>>> 5. Following in copy.sgml can be changed to make more meaningful as >>>> the first line looks little adhoc. >>>> +<para> >>>> + The command that input comes from or that output goes to. >>>> + The command for COPY FROM, which input comes from, must write >>>> + its >>>> output >>>> + to standard output. The command for COPY TO, which output >>> goes >>>> + to, >>>> must >>>> + read its input from standard input. >>>> +</para> >>> >>> I've struggled to make the document more meaningful. >> >> To be honest, I am not sure whether introducing pre, post processor >> terminology is right or not, >> But again I shall let committer decide about this point. > > Agreed. I changed the terminology to use terms like "the command specified with PROGRAM", instead of talking about pre- and post-processsors. >> I have one small another doubt that in function parse_slash_copy, you >> avoided expand tilde >> for program case, which I am not sure is the right thing or not. > > Sorry, I'm not sure that, too. I'd like to leave this for committers. That seems correct. The shell will do tilde expansion with popen(), we don't need to do it ourselves. There's one oddity in the psql \copy syntax. This is actually present in earlier versions too, but I think we should fix it now that we extend the syntax: -- Writes to standard output. There's no way to write to a file called "stdout". \copy foo to 'stdout' I think we should change the interpretation of the above so that it writes to a file called "stdout". It's possible that there's an application out there that relies on that to write to stdout, but it's not hard to fix the application. A small note in the release notes might be in order. Also, I think we should require the command to be quoted in \copy. Ie. let's forbid this: \copy pgbench_accounts to program /bin/cat>/dev/null and require that it's written as: \copy pgbench_accounts to program '/bin/cat>/dev/null' We don't require quoting for filenames, which I find a bit weird too, but it seems particularly confusing for a shell command. Attached is a new version of this patch that I almost committed, but one thing caught my eye at the last minute: The error reporting is not very user-friendly. If the program fails after reading/writing some rows, the reason is printed to the log, but not the user: postgres=# copy foo to program '/tmp/midway-crash'; ERROR: could not execute command "/tmp/midway-crash" the log has more details: LOG: child process exited with exit code 10 STATEMENT: copy foo to program '/tmp/midway-crash'; ERROR: could not execute command "/tmp/midway-crash" STATEMENT: copy foo to program '/tmp/midway-crash'; I think we should arrange for the "child process exited with exit code 10" to be printed as errdetail to the client. Similarly, with psql \copy command, the "child process exited with exit code 10" command shouldn't be printed straight to stderr, but should go through psql_error. I'll try to refactor pclose_check tomorrow to do that. Meanwhile, please take a look at the attached if you have the time. I rewrote much of the docs changes, and some comments. - Heikki
Attachment
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
Amit Kapila
Date:
On Wednesday, February 27, 2013 12:41 AM Heikki Linnakangas wrote: > On 22.02.2013 12:43, Etsuro Fujita wrote: > >>>>>> 1. "Broken pipe" is not handled in case of psql "\copy" command; > >>>>>> Issue are as follows: > >>>>>> Following are verified on SuSE-Linux 10.2. > >>>>>> 1) psql is exiting when "\COPY xxx TO" command is > issued > >>>>>> and > >>>> command/script is not found > >>>>>> When popen is called in write mode it is > >>>>>> creating valid > >>>> file descriptor and when it tries to write to file "Broken pipe" > >>> error > >>>> is> coming which is not handled. > >>>>>> psql# \copy pgbench_accounts TO PROGRAM > >>>> '../compress.sh pgbench_accounts4.txt' > >>>>>> 2) When "\copy" command is in progress then > >>> program/command > >>>>>> is > >>>> killed/"crashed due to any problem" > >>>>>> psql is exiting. > >>>> > >>>>> This is a headache. I have no idea how to solve this. > >>>> > >>>> I think we can keep it for committer to take a call on this issue. > >>> > >>> Agreed. > > Fortunately this one is easy. We just need to ignore SIGPIPE, by > calling pqsignal(SIGPIPE, SIG_IGN) before popen(). We already do the > same in psql when the query output is piped to a pager, in PageOutput. So are you planning to call the same in do_copy as well; in current patch it is not there. > >>>> 5. Following in copy.sgml can be changed to make more meaningful > as > >>>> the first line looks little adhoc. > >>>> +<para> > >>>> + The command that input comes from or that output goes to. > >>>> + The command for COPY FROM, which input comes from, must > >>>> +write its > >>>> output > >>>> + to standard output. The command for COPY TO, which output > >>> goes > >>>> + to, > >>>> must > >>>> + read its input from standard input. > >>>> +</para> > >>> > >>> I've struggled to make the document more meaningful. > >> > >> To be honest, I am not sure whether introducing pre, post processor > >> terminology is right or not, But again I shall let committer decide > >> about this point. > > > > Agreed. > > I changed the terminology to use terms like "the command specified with > PROGRAM", instead of talking about pre- and post-processsors. > > >> I have one small another doubt that in function parse_slash_copy, > you > >> avoided expand tilde for program case, which I am not sure is the > >> right thing or not. > > > > Sorry, I'm not sure that, too. I'd like to leave this for > committers. > > That seems correct. The shell will do tilde expansion with popen(), we > don't need to do it ourselves. > > There's one oddity in the psql \copy syntax. This is actually present > in earlier versions too, but I think we should fix it now that we > extend the syntax: > > -- Writes to standard output. There's no way to write to a file > called "stdout". > \copy foo to 'stdout' > > I think we should change the interpretation of the above so that it > writes to a file called "stdout". It's possible that there's an > application out there that relies on that to write to stdout, but it's > not hard to fix the application. A small note in the release notes > might be in order. > > Also, I think we should require the command to be quoted in \copy. Ie. > let's forbid this: > > \copy pgbench_accounts to program /bin/cat>/dev/null > > and require that it's written as: > > \copy pgbench_accounts to program '/bin/cat>/dev/null' > > We don't require quoting for filenames, which I find a bit weird too, > but it seems particularly confusing for a shell command. > > Attached is a new version of this patch that I almost committed, but > one thing caught my eye at the last minute: The error reporting is not > very user-friendly. If the program fails after reading/writing some > rows, the reason is printed to the log, but not the user: > > postgres=# copy foo to program '/tmp/midway-crash'; > ERROR: could not execute command "/tmp/midway-crash" > > the log has more details: > > LOG: child process exited with exit code 10 > STATEMENT: copy foo to program '/tmp/midway-crash'; > ERROR: could not execute command "/tmp/midway-crash" > STATEMENT: copy foo to program '/tmp/midway-crash'; > > I think we should arrange for the "child process exited with exit code > 10" to be printed as errdetail to the client. Similarly, with psql > \copy command, the "child process exited with exit code 10" command > shouldn't be printed straight to stderr, but should go through > psql_error. > > I'll try to refactor pclose_check tomorrow to do that. Meanwhile, > please take a look at the attached if you have the time. I rewrote much > of the docs changes, and some comments. If there is semicolon at end, it takes it into account for creating filename. \copy foo to c:\data\foo.out; This problem was fixed in previous patch, but I think handling of quotes has changed this behavior. With Regards, Amit Kapila.
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
"Etsuro Fujita"
Date:
Hi, Thank you for the work! > From: Amit Kapila [mailto:amit.kapila@huawei.com] > On Wednesday, February 27, 2013 12:41 AM Heikki Linnakangas wrote: > > There's one oddity in the psql \copy syntax. This is actually present > > in earlier versions too, but I think we should fix it now that we > > extend the syntax: > > > > -- Writes to standard output. There's no way to write to a file > > called "stdout". > > \copy foo to 'stdout' > > > > I think we should change the interpretation of the above so that it > > writes to a file called "stdout". It's possible that there's an > > application out there that relies on that to write to stdout, but it's > > not hard to fix the application. A small note in the release notes > > might be in order. > > > > Also, I think we should require the command to be quoted in \copy. Ie. > > let's forbid this: > > > > \copy pgbench_accounts to program /bin/cat>/dev/null > > > > and require that it's written as: > > > > \copy pgbench_accounts to program '/bin/cat>/dev/null' > > > > We don't require quoting for filenames, which I find a bit weird too, > > but it seems particularly confusing for a shell command. Agreed. ISTM that the comment on parse_slash_copy() needs to be changed: * table name can be double-quoted and can have a schema part. * column names can be double-quoted. * filename can be single-quotedlike SQL literals. * command can be single-quoted like SQL literals. The last line must be: * command must be single-quoted like SQL literals. > > Attached is a new version of this patch that I almost committed, but > > one thing caught my eye at the last minute: The error reporting is not > > very user-friendly. If the program fails after reading/writing some > > rows, the reason is printed to the log, but not the user: > > > > postgres=# copy foo to program '/tmp/midway-crash'; > > ERROR: could not execute command "/tmp/midway-crash" > > > > the log has more details: > > > > LOG: child process exited with exit code 10 > > STATEMENT: copy foo to program '/tmp/midway-crash'; > > ERROR: could not execute command "/tmp/midway-crash" > > STATEMENT: copy foo to program '/tmp/midway-crash'; > > > > I think we should arrange for the "child process exited with exit code > > 10" to be printed as errdetail to the client. Similarly, with psql > > \copy command, the "child process exited with exit code 10" command > > shouldn't be printed straight to stderr, but should go through > > psql_error. > > > > I'll try to refactor pclose_check tomorrow to do that. Meanwhile, > > please take a look at the attached if you have the time. I rewrote much > > of the docs changes, and some comments. Looks fine to me. > If there is semicolon at end, it takes it into account for creating > filename. > \copy foo to c:\data\foo.out; > This problem was fixed in previous patch, but I think handling of quotes has > changed this behavior. ISTM that the following part at parse_slash_copy() needs to be changed: /* { 'filename' | PROGRAM 'command' | STDIN | STDOUT | PSTDIN | PSTDOUT } */ token = strtokx(NULL, whitespace, NULL,"'", 0, false, false, pset.encoding); if (!token) goto error; strtokx() in the above should be called in the following way: token = strtokx(NULL, whitespace, ";", "'", 0, false, false, pset.encoding); Thanks, Best regards, Etsuro Fujita
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
Heikki Linnakangas
Date:
On 27.02.2013 08:32, Amit Kapila wrote: > If there is semicolon at end, it takes it into account for creating > filename. > \copy foo to c:\data\foo.out; > This problem was fixed in previous patch, but I think handling of quotes has > changed this behavior. This is existing behavior, that creates a file called "foo.out;" on previous psql versions as well. I agree it's almost certainly not what the user intended, but that's a separate patch. - Heikki
Re: Review : Add hooks for pre- and post-processor executables for COPY and \copy
From
Heikki Linnakangas
Date:
On 27.02.2013 11:38, Etsuro Fujita wrote: > Agreed. ISTM that the comment on parse_slash_copy() needs to be changed: > > * table name can be double-quoted and can have a schema part. > * column names can be double-quoted. > * filename can be single-quoted like SQL literals. > * command can be single-quoted like SQL literals. > > The last line must be: > > * command must be single-quoted like SQL literals. Fixed that. >>> Attached is a new version of this patch that I almost committed, but >>> one thing caught my eye at the last minute: The error reporting is not >>> very user-friendly. If the program fails after reading/writing some >>> rows, the reason is printed to the log, but not the user: >>> >>> postgres=# copy foo to program '/tmp/midway-crash'; >>> ERROR: could not execute command "/tmp/midway-crash" >>> >>> the log has more details: >>> >>> LOG: child process exited with exit code 10 >>> STATEMENT: copy foo to program '/tmp/midway-crash'; >>> ERROR: could not execute command "/tmp/midway-crash" >>> STATEMENT: copy foo to program '/tmp/midway-crash'; >>> >>> I think we should arrange for the "child process exited with exit code >>> 10" to be printed as errdetail to the client. Similarly, with psql >>> \copy command, the "child process exited with exit code 10" command >>> shouldn't be printed straight to stderr, but should go through >>> psql_error. >>> >>> I'll try to refactor pclose_check tomorrow to do that. Meanwhile, >>> please take a look at the attached if you have the time. I rewrote much >>> of the docs changes, and some comments. > > Looks fine to me. Ok, committed with the changes to the error handling. I refactored the logic to construct a human-readable string from pclose_check() to a new function, and used that. Thanks for the patch, and thanks Amit for the review! - Heikki