Re: raw output from copy - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: raw output from copy
Date
Msg-id 55B5EEC3.9050506@iki.fi
Whole thread Raw
In response to Re: raw output from copy  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: raw output from copy
List pgsql-hackers
On 07/27/2015 06:55 AM, Craig Ringer wrote:
> On 7 July 2015 at 14:32, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> Hi
>>
>> previous patch was broken, and buggy
>>
>> Here is new version with fixed upload and more tests
>
> I routinely see people trying to use COPY ... FORMAT binary to export
> a single binary field (like an image, for example) and getting
> confused by the header PostgreSQL adds. Or using text-format COPY and
> struggling with the hex escaping. It's clearly something people have
> trouble with.
>
> It doesn't help that while lo_import and lo_export can read paths
> outside the datadir (and refuse to read from within it),
> pg_read_binary_file is superuser only and disallows absolute paths.
> There's no corresponding pg_write_binary_file. So users who want to
> import and export a single binary field tend to try to use COPY. We
> have functionality for large objects that has no equivalent for
> 'bytea'.
>
> I don't love the use of COPY for this, but it gets us support for
> arbitrary clients pretty easily. Otherwise it'd be server-side only
> via local filesystem access, or require special psql-specific
> functionality like we have for lo_import etc.

COPY seems like a strange interface for this. I can see the point that 
the syntax is almost there already, for both input and output. But even 
that's not quite there yet, we'd need the new RAW format. And as an 
input method, COPY is a bit awkward, because you cannot easily pass the 
file to a function, for example. I think this should be implemented in 
psql, along the lines of Andrew's original \bcopy patch.

There are a couple of related psql-features here actually, that would be 
useful on their own. The first is being able to send the query result to 
a file, for a single query only. You can currently do:

\o /tmp/foo
SELECT ...;
\o

But more often than not, when I try to do that, I forget to do the last 
\o, and run another query, and the output still goes to the file. So 
it'd be nice to have a \o option that only affects the next query. 
Something like:

\O /tmp/foo
SELECT ...;

The second feature needed is to write the output without any headers, 
row delimiters and such. Just the datum. And the third feature is to 
write it in binary. Perhaps something like:

\O /tmp/foo binary
SELECT blob FROM foo WHERE id = 10;

What about input? This is a whole new feature, but it would be nice to 
be able to pass the file contents as a query parameter. Something like:

\P /tmp/foo binary
INSERT INTO foo VALUES (?);


- Heikki




pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Sharing aggregate states between different aggregate functions
Next
From: Heikki Linnakangas
Date:
Subject: Re: False comment about speculative insertion