Thread: COPY problems with psql / libpq

COPY problems with psql / libpq

From
"Oliver Elphick"
Date:
[Using current cvs] I have a problem with COPY when called like this:
 psql -e bray </tmp/ol

with one particular file, containing commands and data like this:

========================================
copy address from stdin;
1  Some place, Regal Way     Somewhere Oxon AB1 3CF  [ Tel: 01367 888888 ]   NoDeliveries Pm Fridays        GB \N \N \N
\N\N \N \N \N \N \N \N \N
 
...
1000     73 Some Road      London      SW1 1ZZ  GB 44 81 999   9999     \N  \N \N \N \N \N \N \N
\.
-- 1000 records written

select count(*) from address;

copy address from stdin;
1001...
... and so on up to 3916 records in total, divided into 1000 record chunks
========================================

psql or libpq seems to choke on the data, so that some spurious error
arises, such as null input into a non-null field.  Thereafter, libpq
seems to get stuck in a COPY state:

copy address from stdin;
-- 1000 records written
select count(*) from address;
PQexec: you gotta get out of a COPY state yourself.

(The comment in PQexec says it does this to preserve backwards
compatibility, but getting stuck in COPY state is not backwards
compatible!)

If I remove the SQL commands from the input file, go into psql and do:
 copy address from '/tmp/ol';

all 3916 records are added correctly.  This seems to indicate that
the problem is not in the backend.

I found that if I broke the first 1000 records into 2 equal parts, all
of them were added correctly without error; so I conclude that data
is being buffered and lost somewhere in psql or libpq, and the problem is
dependent on the amount of data being copied.

This began to happen within the last week, but I don't know which
recent change is responsible.

-- 
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver              PGP key from public servers; key
ID32B8FAA1                ========================================    "Neither is there salvation in any other; for
thereis      none other name under heaven given among men, whereby      we must be saved."               Acts 4:12 
 




Re: [HACKERS] COPY problems with psql / libpq

From
Tom Lane
Date:
"Oliver Elphick" <olly@lfix.co.uk> writes:
> I found that if I broke the first 1000 records into 2 equal parts, all
> of them were added correctly without error; so I conclude that data
> is being buffered and lost somewhere in psql or libpq, and the problem is
> dependent on the amount of data being copied.

I have the following note in my (much too long) to-do list:

: psql.c doesn't appear to cope correctly with quoted newlines in COPY data;
: if one falls just after a buffer boundary, trouble!
: Does fe-exec.c work either??

(This note is some months old, and may or may not still apply since
Peter's rework of psql.)  It could be that your dataset is hitting this
problem or a similar one.  A buffer-boundary problem would explain why
the error seems to be so dataset-specific.

> copy address from stdin;
> -- 1000 records written
> select count(*) from address;
> PQexec: you gotta get out of a COPY state yourself.

It sure sounds like psql is failing to recognize the trailing \.
of the COPY data.
        regards, tom lane


Re: [HACKERS] COPY problems with psql / libpq

From
Patrick Welche
Date:
On Thu, Jan 20, 2000 at 11:02:31AM -0500, Tom Lane wrote:
> 
> It sure sounds like psql is failing to recognize the trailing \.
> of the COPY data.

Precisely what I saw yesterday (cf Subject: pg_dump disaster) - but what
does one do about it?

Cheers,

Patrick


Re: [HACKERS] COPY problems with psql / libpq

From
Alfred Perlstein
Date:
* Patrick Welche <prlw1@newn.cam.ac.uk> [000120 09:10] wrote:
> On Thu, Jan 20, 2000 at 11:02:31AM -0500, Tom Lane wrote:
> > 
> > It sure sounds like psql is failing to recognize the trailing \.
> > of the COPY data.
> 
> Precisely what I saw yesterday (cf Subject: pg_dump disaster) - but what
> does one do about it?

Is this with a recent snapshot or 6.5.3 using libpq?  
Either way, you should check the contents of the send buffer, please let
me know if there is data queued in it.  You can include the 'internal'
header for libpq (libpq-int.h?) to get at the send buffer.

-Alfred


Re: [HACKERS] COPY problems with psql / libpq

From
Patrick Welche
Date:
On Thu, Jan 20, 2000 at 09:22:16AM -0800, Alfred Perlstein wrote:
> * Patrick Welche <prlw1@newn.cam.ac.uk> [000120 09:10] wrote:
> > On Thu, Jan 20, 2000 at 11:02:31AM -0500, Tom Lane wrote:
> > > 
> > > It sure sounds like psql is failing to recognize the trailing \.
> > > of the COPY data.
> > 
> > Precisely what I saw yesterday (cf Subject: pg_dump disaster) - but what
> > does one do about it?
> 
> Is this with a recent snapshot or 6.5.3 using libpq?  

For me, it's using yesterday's cvs'd source - but I obviously can't speak
for Oliver.

> Either way, you should check the contents of the send buffer, please let
> me know if there is data queued in it.  You can include the 'internal'
> header for libpq (libpq-int.h?) to get at the send buffer.

That will take a while. In the meantime, just pg_dumpall something and try
to read the output back in. I do have ^M's in some of the text columns if
that matters.

Cheers,

Patrick


Re: [HACKERS] COPY problems with psql / libpq

From
"Oliver Elphick"
Date:
Patrick Welche wrote: >On Thu, Jan 20, 2000 at 09:22:16AM -0800, Alfred Perlstein wrote: >> * Patrick Welche
<prlw1@newn.cam.ac.uk>[000120 09:10] wrote: >> > On Thu, Jan 20, 2000 at 11:02:31AM -0500, Tom Lane wrote: >> > >  >> >
>It sure sounds like psql is failing to recognize the trailing \. >> > > of the COPY data. >> >  >> > Precisely what I
sawyesterday (cf Subject: pg_dump disaster) - but what >> > does one do about it? >>  >> Is this with a recent snapshot
or6.5.3 using libpq?   > >For me, it's using yesterday's cvs'd source - but I obviously can't speak >for Oliver. 
 
This morning's.
 >> Either way, you should check the contents of the send buffer, please let >> me know if there is data queued in it.
Youcan include the 'internal' >> header for libpq (libpq-int.h?) to get at the send buffer. > >That will take a while.
Inthe meantime, just pg_dumpall something and try >to read the output back in. I do have ^M's in some of the text
columnsif >that matters.
 

I can't do that, because pg_dump seems to be broken if there are tables
with foreign key constraints.   (See a separate message.)

-- 
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver              PGP key from public servers; key
ID32B8FAA1                ========================================    "Neither is there salvation in any other; for
thereis      none other name under heaven given among men, whereby      we must be saved."               Acts 4:12 
 




Re: [HACKERS] COPY problems with psql / libpq

From
"Oliver Elphick"
Date:
Tom Lane wrote: >"Oliver Elphick" <olly@lfix.co.uk> writes: >> I found that if I broke the first 1000 records into 2
equalparts, all >> of them were added correctly without error; so I conclude that data >> is being buffered and lost
somewherein psql or libpq, and the problem is >> dependent on the amount of data being copied. > >I have the following
notein my (much too long) to-do list: > >: psql.c doesn't appear to cope correctly with quoted newlines in COPY data;
>:if one falls just after a buffer boundary, trouble! >: Does fe-exec.c work either?? 
 
New-lines are not the problem in this particular case, since the data
does not contain any.

-- 
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver              PGP key from public servers; key
ID32B8FAA1                ========================================    "Neither is there salvation in any other; for
thereis      none other name under heaven given among men, whereby      we must be saved."               Acts 4:12 
 




Re: [HACKERS] COPY problems with psql / libpq

From
Peter Eisentraut
Date:
On 2000-01-20, Tom Lane mentioned:

> "Oliver Elphick" <olly@lfix.co.uk> writes:
> > I found that if I broke the first 1000 records into 2 equal parts, all
> > of them were added correctly without error; so I conclude that data
> > is being buffered and lost somewhere in psql or libpq, and the problem is
> > dependent on the amount of data being copied.

The buffering is line-based though and the default buffer is 8192. If
your line is longer than that you're in all sorts of other troubles.

> 
> I have the following note in my (much too long) to-do list:
> 
> : psql.c doesn't appear to cope correctly with quoted newlines in COPY data;

What's a quoted newline?
a) "<newline>"
b) "\n"
c) \<newline>

Earlier you also mentioned to me something in general about control
characters messing up COPY. Could you give me some details on that so I
can look into it?

> : if one falls just after a buffer boundary, trouble!
> : Does fe-exec.c work either??
> 
> (This note is some months old, and may or may not still apply since
> Peter's rework of psql.)  It could be that your dataset is hitting this

I haven't touched that code.

> problem or a similar one.  A buffer-boundary problem would explain why
> the error seems to be so dataset-specific.

At first I thunk a PQExpBuffer based solution would be the answer, but
as I said above, if you overflow the buffer, you're in trouble anyway.

> 
> > copy address from stdin;
> > -- 1000 records written
> > select count(*) from address;
> > PQexec: you gotta get out of a COPY state yourself.
> 
> It sure sounds like psql is failing to recognize the trailing \.
> of the COPY data.

The last call in the function psql/copy.c:handleCopyIn is
return !PQendcopy(conn);

and there is no way it can exit earlier. Also the connection seems to be
good, since that's checked right after it returns. The calls to PQputvalue
are not checked for return values, so problems might get missed there, but
that would in any case still point to a problem elsewhere. Gotta pass the
buck to libpq ...

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden





Re: [HACKERS] COPY problems with psql / libpq

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
>> : psql.c doesn't appear to cope correctly with quoted newlines in COPY data;

> What's a quoted newline?
> a) "<newline>"
> b) "\n"
> c) \<newline>

(c).  That's how a newline appearing in the data is supposed to be
represented.  IIRC, I was worried that if the \ falls at the end of
a bufferload and the newline at the start of the next, psql and/or
libpq would fail to recognize the pattern; if so, they'd probably
think the newline is a record boundary.

Patrick could be falling victim to this, but Oliver sez he has no
newlines in his data, so there's at least one other problem.

> that would in any case still point to a problem elsewhere. Gotta pass the
> buck to libpq ...

Could be.  I think Alfred is on the hook here...
        regards, tom lane