Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY
Date
Msg-id CAHGQGwHer4czY8Jt0STCGJ5GBNGTTSNBwu-7+K=+U6uK8P8GPw@mail.gmail.com
Whole thread Raw
In response to Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY  ("Etsuro Fujita" <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY  (Simon Riggs <simon@2ndQuadrant.com>)
Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Wed, Nov 14, 2012 at 8:30 PM, Etsuro Fujita
<fujita.etsuro@lab.ntt.co.jp> wrote:
> I wrote:
>> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>>
>> > I wrote:
>> > > "Etsuro Fujita" <fujita.etsuro@lab.ntt.co.jp> writes:
>> > >> I have a question.  I think it would be also better to extend the syntax
>> > >> for the SQL COPY command in the same way, ie,
>> > >> COPY foo from '/home/pgsql/decompress.sh /home/pgsql/foo.csv.gz |' with
>> > >> format 'csv'
>> >
>> > > Yeah, sure --- that case is already superuser-only, so why not give it
>> > > the option of being a popen instead of just fopen.
>> >
>> > BTW, one thought that comes to mind is that such an operation is
>> > extremely likely to fail under environments such as SELinux.  That's
>> > not necessarily a reason not to do it, but we should be wary of
>> > promising that it will work everywhere.  Probably a documentation note
>> > about this would be enough.
>>
>> OK  I'll revise the patch.
>
> I've revised the patch.  In this version a user can specify hooks for pre- and
> post-processor executables for COPY and \copy in the follwoing way:
>
>     $ echo '/bin/gunzip -c $1' > decompress.sh
>     $ chmod +x decompress.sh
>
> In the case of the COPY command,
>
>     postgres=# COPY foo FROM '/home/pgsql/decompress.sh /home/pgsql/foo.csv.gz'
> WITH (format 'csv');
>
> Also, in the case of the \copy instruction,
>
>     postgres=# \copy foo from '/home/pgsql/decompress.sh /home/pgsql/foo.csv.gz'
> with (format 'csv')
>
> As shown in the example above, I've assumed that the syntax for this option for
> e.g., the COPY command is:
>
>     COPY table_name FROM 'progname filename' WITH ...
>     COPY table_name TO 'progname filename' WITH ...
>
> Here, progname for COPY IN is the user-supplied program that takes filename as
> its argument and that writes on standard output.

What about further extending the COPY IN syntax to the following?
   COPY table_name FROM 'progname [ option, ... ]' WITH ...

I'd just like to execute
   COPY vmstat_table FROM 'vmstat' WITH ...

> Also, prgoname for COPY OUT is
> the user-supplied program that reads standard input and writes to filename taken
> as its argument.  This makes simple the identification and verification of
> progname and filename.
>
> Todo:
>     * Documentation including documentation note about the limitation for
> environments such as SELinux mentioned by Tom.
>     * More test
>
> Any comments and suggestions are welcomed.

Isn't it dangerous to allow a user to execute external program in
server side via SQL?

Regards,

-- 
Fujii Masao



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Further pg_upgrade analysis for many tables
Next
From: Amit kapila
Date:
Subject: Re: Proposal for Allow postgresql.conf values to be changed via SQL