On Thu, May 18, 2023, at 18:48, Daniel Verite wrote:
> Joel Jacobson wrote:
>> OTOH, one would then need to inspect the TSV file doesn't contain \. on an
>> empty line...
>
> Note that this is the case for valid CSV contents, since backslash-dot
> on a line by itself is both an end-of-data marker for COPY FROM and a
> valid CSV line.
> Having this line in the data results in either an error or having the
> rest of the data silently discarded, depending on the context. There
> is some previous discussion about this in [1].
> Since the TEXT format doesn't have this kind of problem, one solution
> is to filter the data through PROGRAM with an [untrusted CSV]->TEXT
> filter. This is to be preferred over direct CSV loading when
> strictness or robustness are more important than convenience.
>
>
> [1]
> https://www.postgresql.org/message-id/10e3eff6-eb04-4b3f-aeb4-b920192b977a@manitou-mail.org
Thanks for sharing the old thread, very useful.
I see I've failed miserably to understand all the details of the COPY command.
Upon reading the thread, I'm still puzzled about one thing:
Why does \. need to have a special meaning when using COPY FROM with files?
I understand its necessity for STDIN, given that the end of input needs to be
explicitly defined.
However, for files, we have a known file size and the end-of-file can be
detected without the need for special markers.
Also, is the difference in how server-side COPY CSV is capable of dealing
with \. but apparently not the client-side \COPY CSV documented somewhere?
CREATE TABLE t (c text);
INSERT INTO t (c) VALUES ('foo'), (E'\n\\.\n'), ('bar');
-- Works OK:
COPY t TO '/tmp/t.csv' WITH CSV;
TRUNCATE t;
COPY t FROM '/tmp/t.csv' WITH CSV;
-- Doesn't work:
\COPY t TO '/tmp/t.csv' WITH CSV;
TRUNCATE t;
\COPY t FROM '/tmp/t.csv' WITH CSV;
ERROR: unterminated CSV quoted field
CONTEXT: COPY t, line 4: ""
\.
"
/Joel