Thread: COPY incorrectly uses null instead of an empty string in last field
Release 7.2: Where the last field of a line contains an empty string, COPY incorrectly inserts a NULL. This will cause data to be loaded incorrectly from pg_dump. This happens because, when a newline is seen, COPY needs to know whether there was a preceding delimiter and use an empty string in that case; there is a difference between a last field that is an empty string and one or more last fields that are completely unspecified. Changing this should not affect COPY TO or pg_dump, which already put out \N in the last field if it really is NULL. For example: -- -- Selected TOC Entries: -- \connect - "olly" -- -- TOC Entry ID 2 (OID 1522144) -- -- Name: schau Type: TABLE Owner: olly -- CREATE TABLE "schau" ("feld1" text NOT NULL,"feld2" character varying(10) NOT NULL,Constraint "pk_schau" Primary Key ("feld1","feld2") ); -- -- Data for TOC Entry ID 3 (OID 1522144) -- -- Name: schau Type: TABLE DATA Owner: olly -- COPY "schau" FROM stdin; Helge Arne Alle Keiner \. I tried to make a fix, but managed to break it completely :-( -- Oliver Elphick Oliver.Elphick@lfix.co.uk Isle of Wight http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C "But God commendeth his love toward us, in that, while we were yet sinners, Christ died for us." Romans 5:8
Oliver Elphick <olly@lfix.co.uk> writes: > Where the last field of a line contains an empty string, COPY > incorrectly inserts a NULL. Certainly not. Using your example table definition, I did: regression=# insert into schau values ('test','foo'); INSERT 146293 1 regression=# insert into schau values ('test2',''); INSERT 146294 1 regression=# insert into schau values ('test2',null); ERROR: ExecAppend: Fail to add null value in not null attribute feld2 regression=# copy schau to '/tmp/schau.out'; COPY regression=# delete from schau; DELETE 2 regression=# copy schau from '/tmp/schau.out'; COPY regression=# select * from schau ;feld1 | feld2 -------+-------test | footest2 | (2 rows) regression=# select * from schau where feld2 is null;feld1 | feld2 -------+------- (0 rows) The contents of /tmp/schau.out are: test foo test2 (there's a tab after test2, in case you can't see it ;-)) I don't see any problem here. regards, tom lane
On Tue, 2002-02-26 at 19:02, Tom Lane wrote: > > I did it this morning on getting the Debian bug report (against 7.1.3) > > and it failed. I repeated it just now to check after reading your reply > > and it succeeded, using exactly the same input file (in which there is > > indeed a tab after the first field before the empty string.) And I was > > certainly running 7.2 both times. > > What the heck? Something is weird there. Maybe there is some > additional condition needed to cause a problem. Can you send me the bug > report? I will if I can make it happen again. The only thing I can think of at the moment is that this morning I may have used cut-and-paste to write the input file or to read in the commands, either of which would have lost the tabs. The original reporter's mail had no tabs at all, but it was not an attachment and it had had lost all its tabs along the way. His input script came from 7.0 pg_dump. I am following up with him whether there are actually tabs before the empty fields in that pg_dump output. -- Oliver Elphick Oliver.Elphick@lfix.co.uk Isle of Wight http://www.lfix.co.uk/oliver GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839 932A 614D 4C34 3E1D 0C1C "But God commendeth his love toward us, in that, while we were yet sinners, Christ died for us." Romans 5:8
> The original reporter's mail had no tabs at all, but it was not an > attachment and it had had lost all its tabs along the way. His input > script came from 7.0 pg_dump. I am following up with him whether there > are actually tabs before the empty fields in that pg_dump output. If a trailing tab got lost in a dump file, then COPY IN would silently assume that fields after the tab position should be NULL. I suspect that that is what happened here. Probably some "helpful" program stripped trailing whitespace from the file. One of the things we've agreed to do in 7.3 is change COPY IN to remove that assumption --- a line with too few fields (too few tabs) will draw an error report instead of silently doing what's likely the wrong thing. regards, tom lane
Re: COPY incorrectly uses null instead of an empty string in last field
From
"Zeugswetter Andreas SB SD"
Date:
> One of the things we've agreed to do in 7.3 is change COPY IN to remove > that assumption --- a line with too few fields (too few tabs) will draw > an error report instead of silently doing what's likely the wrong thing. But there will be new syntax for COPY, that allows missing trailing columns. I hope. Andreas
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: >> One of the things we've agreed to do in 7.3 is change COPY IN to remove >> that assumption --- a line with too few fields (too few tabs) will draw >> an error report instead of silently doing what's likely the wrong thing. > But there will be new syntax for COPY, that allows missing trailing columns. > I hope. Why? regards, tom lane
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: > But there will be new syntax for COPY, that allows missing trailing columns. > I hope. >> >> Why? > Well, good question. For one for backwards compatibility. It's an undocumented feature. How many people are likely to be using it? > I guess I would prefer COPY syntax that allows you to specify columns > as has been previously discussed. Yes, that's on the to-do list as well. But no matter what the expected set of columns is, COPY ought to complain if a line is missing some fields. regards, tom lane
Re: COPY incorrectly uses null instead of an empty string in last field
From
"Zeugswetter Andreas SB SD"
Date:
> >> One of the things we've agreed to do in 7.3 is change COPY IN to remove > >> that assumption --- a line with too few fields (too few tabs) will draw > >> an error report instead of silently doing what's likely the wrong thing. > > > But there will be new syntax for COPY, that allows missing trailing columns. > > I hope. > > Why? Well, good question. For one for backwards compatibility. I guess I would prefer COPY syntax that allows you to specify columns as has been previously discussed. Having that, that would be sufficient and safer than only a switch. COPY 'afile' to atab (a1, a3, a5, a2) Andreas