Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY
Date
Msg-id 50A45E05.3030201@2ndQuadrant.com
In response to Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: WIP patch: add (PRE|POST)PROCESSOR options to COPY  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 11/15/2012 10:19 AM, Tom Lane wrote:
>
> I disagree very very strongly with that.  If we prevent use of shell
> syntax, we will lose a lot of functionality, for instance
>
>     copy ... from program 'foo <somefile'
>     copy ... from program 'foo | bar'
>
> unless you're imagining that we will reimplement a whole lot of that
> same shell syntax for ourselves.  (And no, I don't care whether the
> Windows syntax is exactly the same or not.  The program name/path is
> already likely to vary across systems, so it's pointless to suppose that
> use of the feature would be 100% portable if only we lobotomized it.)

That's reasonable - and it isn't worth making people jump through hoops
with ('bash','-c','/some/command < infile').

> So?  You're already handing the keys to the kingdom to anybody who can
> control the contents of that command line, even if it's only to point at
> the wrong program.  And one man's "unexpected side-effect" is another
> man's "essential feature", as in my examples above.

That's true if they're controlling the whole command, not so much if
they just provide a file name. I'm just worried that people will use it
without thinking deeply about the consequences, just like they do with
untrusted client input in SQL injection attacks.
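
To make the file-name concern concrete, here is a minimal sketch in Python
rather than PostgreSQL code. It assumes a POSIX shell and `cat` are
available; the helper names and the hostile file name are hypothetical and
only illustrate the general pattern of pasting an untrusted name into a
shell command line.

```python
import shlex
import subprocess


def build_shell_command(file_name: str) -> str:
    """Naive (hypothetical) helper: paste the file name straight into a shell command."""
    return "cat " + file_name


def build_quoted_shell_command(file_name: str) -> str:
    """Same command, but with the file name quoted so the shell sees one word."""
    return "cat " + shlex.quote(file_name)


# An attacker-supplied "file name" that smuggles in a second command.
hostile = "/dev/null; echo INJECTED"

# The shell splits on ';' and runs the extra command.
naive = subprocess.run(
    build_shell_command(hostile), shell=True, capture_output=True, text=True
)

# The quoted form is treated as a single (nonexistent) file name, so cat fails.
quoted = subprocess.run(
    build_quoted_shell_command(hostile), shell=True, capture_output=True, text=True
)

print("naive stdout: ", naive.stdout.strip())
print("quoted stdout:", quoted.stdout.strip())
```

Quoting helps only when every caller remembers to do it, which is the same
failure mode as forgetting to escape SQL literals.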

I take your point about wanting more than just the execve()-style
invocation. I'd still like to see a way to invoke the command without
having the shell involved, though; APIs to invoke external programs seem
to start out with a version that launches via the shell then quickly
grow more controlled argument-vector versions.
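
As an illustration of that pattern in another API (Python's subprocess
module, not anything proposed for COPY), the sketch below passes an argument
vector directly, so no shell ever parses the arguments. The `run_argv` helper
and the sample arguments are hypothetical. The last call assumes a POSIX `sh`
and `tr`, and shows that shell features remain available by naming the shell
explicitly, much like the ('bash','-c',...) form above.

```python
import subprocess
import sys


def run_argv(argv):
    """Run a program from an argument vector; no shell is involved."""
    return subprocess.run(argv, capture_output=True, text=True, check=False)


# Use the Python interpreter as a stand-in child that prints the arguments it received.
child = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]

# Shell metacharacters arrive in the child as literal text.
literal = run_argv(child + ["infile; echo INJECTED", "a | b"])
print(literal.stdout.strip())

# Pipelines are still possible when the shell is requested explicitly.
piped = run_argv(["sh", "-c", "echo one | tr a-z A-Z"])
print(piped.stdout.strip())
```

The design point is that the shell becomes an explicit choice by the caller
instead of an implicit layer that every argument passes through.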

There's certainly room for a quick'n'easy COPY ... FROM PROGRAM ('cmd1 |
cmd2 | tee /tmp/log'). At this point all I think is really vital is to
make copy-with-exec *syntactically different* from plain COPY, and to
leave room for extending the syntax with an environment, a separate
argument vector, etc. when they're called for. It's like VACUUM, where
VACUUM VERBOSE ANALYZE became VACUUM (VERBOSE, ANALYZE) to make room for
(BUFFERS), etc.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



