Re: Fixing backslash dot for COPY FROM...CSV - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Fixing backslash dot for COPY FROM...CSV
Msg-id 1480171.1712349246@sss.pgh.pa.us
In response to Re: Fixing backslash dot for COPY FROM...CSV  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fixing backslash dot for COPY FROM...CSV
Re: Fixing backslash dot for COPY FROM...CSV
List pgsql-hackers
After some more poking at this topic, I realize that there is already
very strange and undocumented behavior for backslash-dot even in
non-CSV mode.  Create a file like this:

$ cat eofdata
foobar
foobaz\.
more
\.
yet more

and try importing it with COPY:

regression=# create table eofdata(f1 text);
CREATE TABLE
regression=# copy eofdata from '/home/tgl/pgsql/eofdata';
COPY 2
regression=# table eofdata;
   f1   
--------
 foobar
 foobaz
(2 rows)

That's what you get in 9.0 and earlier versions, and it's already
not-as-documented, because we claim that only \. alone on a line is an
EOF marker; we certainly don't suggest that what's in front of it will
be taken as valid data.  However, somebody broke it some more in 9.1,
because every version from 9.1 up to HEAD produces this result:

regression=# create table eofdata(f1 text);
CREATE TABLE
regression=# copy eofdata from '/home/tgl/pgsql/eofdata';
COPY 3
regression=# table eofdata;
   f1   
--------
 foobar
 foobaz
 more
(3 rows)

So the current behavior is that a \. at the end of a line, but not
making up the whole line, is silently discarded and we keep going.

All versions throw "end-of-copy marker corrupt" if there is
something after \. on the same line.
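
For anyone who wants to reproduce these cases, here is a minimal sketch
that recreates the sample file shown earlier, plus a second file with
text after \. on the same line.  The scratch directory, the name
eofdata_trailing and the "junk" suffix are assumptions made up for
illustration; only the contents of eofdata come from this message.

```shell
# Scratch directory for the test files (hypothetical location).
dir=$(mktemp -d)

# printf '%s\n' writes each argument on its own line; the single quotes
# keep the backslashes literal, so line 2 is foobaz\. and line 4 is \.
printf '%s\n' 'foobar' 'foobaz\.' 'more' '\.' 'yet more' > "$dir/eofdata"

# Assumed variant: something follows \. on the same line, which is the
# case described above as drawing "end-of-copy marker corrupt".
printf '%s\n' 'foobar' '\.junk' 'more' > "$dir/eofdata_trailing"

# Show the first file so it can be compared with the listing above.
cat "$dir/eofdata"
```

Each file can then be loaded with "copy eofdata from '<path>'" as in the
sessions above.  Since COPY with a file name is read by the server, the
path has to be readable by the server's OS user, and mktemp -d creates a
directory only its owner can enter, so this works as-is only when the
server runs under the same account.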

This is sufficiently weird that I'm starting to come around to
Daniel's original proposal that we just drop the server's recognition
of \. altogether (which would allow removal of some dozens of lines of
complicated and now known-buggy code).  Alternatively, we could fix it
so that \. at the end of a line draws "end-of-copy marker corrupt",
which would at least make things consistent, but I'm not sure that has
any great advantage.  I surely don't want to document the current
behavioral details as being the right thing that we're going to keep
doing.

Thoughts?

            regards, tom lane


