On 04/01/2016 11:42 AM, Daniel Verite wrote:
> Andrew Dunstan wrote:
>
>> If someone can make a good case that this is going to be of
>> general use I'll happily go along, but I haven't seen one so far.
> About COPY FROM with a raw format, for instance just yesterday
> there was this user question on stackoverflow:
> http://stackoverflow.com/questions/36317237
>
> which essentially is: how to import contents from a file without any
> particular interpretation of any character?\
There is so much wrong with this it's hard to know where to start.
Inserting the whole contents of a text file unchanged is insanely easy
in psql.
\set file `cat /path/to/file` insert into mytable(contents) values(:'file');
What is more everyone on SO missed the fact that CSV mode gives you very
considerable control over the quote, delimiter and null settings.
See for example
<http://adpgtech.blogspot.com/2014/09/importing-json-data.html> which
has this example for handling files consisting of 1 json document per line:
copy the_table(jsonfield)from '/path/to/jsondata'csv quote e'\x01' delimiter e'\x02';
psql's \copy will work just the same way
(I noticed with amusement this week that CitusData is using pretty much
exactly this in one of their examples.)
>
> With the patch discussed in this thread, a user can do
> \copy table(textcol) from /path/to/file (format raw)
> or the equivalent COPY.
> If it's a binary column, that works just the same.
It would be fairly simple to invent a binary mechanism that did the
equivalent of the above insert. All without any change to SQL or the
backend at all.
>
> Without this, it's not obvious at all how this result can be
> achieved without resorting to external preprocessing,
> and assuming the availability of such preprocessing tools
> in the environment. Notwithstanding the fact that the
> solution proposed on SO (doubling backslashes with sed)
> doesn't even work if the file contains tabs, as they would be
> interpreted as field separators, even if the copy target has only
> one column. You can change the delimiter with COPY but AFAIK
> you can't tell that there is none.
There is arguably a good case for allowing a null delimiter. But that SO
page is just a terrible piece of misinformation, as far too often
happens in my experience.
And I am still waiting for a non-psql use case. But I don't expect to
see one, precisely because most clients have no difficulty at all in
handling binary data.
cheers
andrew