Thread: COPY Error Message is Confusing

COPY Error Message is Confusing

From
"Bryan White"
Date:
I just spent the morning chasing down a small data corruption.  It showed up
when loading a database from the output of a dump.  The error message was:

    copy: line 8590351, Bad date external representation '04-0| '

I suggest this be changed to:

    copy: input tuple 8590351, Bad date external representation '04-0| '

After investigating this it turns out the number reported is a 1-based input
record number.  Referring to it as a line number is very confusing because
records may span line boundaries.  The following other interpretations are
credible:
    A line number in the dump file
    A line number relative to the start of the COPY.

It would also be useful to report the name of the table being copied to.  It
would be really useful if it would output the offending input line(s)
content though that might have security related issues.


---------
Bryan White, ArcaMax.com, VP of Technology
This email represents the consensus opinion
of the many voices in my head.



Re: COPY Error Message is Confusing

From
Jeff Eckermann
Date:
--- Bryan White <bryan@arcamax.com> wrote:


> I suggest this be changed to:
>
>     copy: input tuple 8590351, Bad date external representation '04-0| '

It's not strictly a "tuple" until it's been loaded.

>
> After investigating this it turns out the number reported is a 1-based input
> record number.  Referring to it as a line number is very confusing because
> records may span line boundaries.

Not so with COPY.  The record separator is hard-coded
to be a newline: the field separator can be set at
runtime, but the record separator cannot.  That would
be a nice feature to have, though.
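
To illustrate the runtime-settable field separator, here is a minimal sketch
in Python. It assumes COPY text-format input, a single-character delimiter,
and that a delimiter inside the data is escaped with a backslash. The function
name split_fields and the sample record are hypothetical, made up for this
example; this is not COPY's actual parser.

```python
def split_fields(record, delimiter="\t"):
    """Split one COPY text record into raw fields on unescaped delimiters.

    Assumption: a backslash escapes the character that follows it, so an
    escaped delimiter does not end the field. Escapes are left undecoded.
    """
    fields, current = [], []
    i = 0
    while i < len(record):
        ch = record[i]
        if ch == "\\" and i + 1 < len(record):
            current.append(record[i:i + 2])  # keep the escape pair intact
            i += 2
        elif ch == delimiter:
            fields.append("".join(current))  # unescaped delimiter ends a field
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


# Hypothetical record using '|' as the field separator; the second field
# contains an escaped '|', so the record still splits into three fields.
print(split_fields("1|a\\|b|c", delimiter="|"))
```

The record itself is assumed to have been cut out of the input already; only
the field separator is a parameter here, which mirrors the point above.
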
>
> It would also be useful to report the name of the table being copied to.  It
> would be really useful if it would output the offending input line(s)
> content though that might have security related issues.
>

Various people have wished for an import application
with more intelligence than COPY now has.  No doubt
much of this could be achieved simply by building
extra features into COPY.

With about three more years of study, I might have the
competency to attempt that myself.  In the meantime,
is anyone else volunteering? :-)


Re: COPY Error Message is Confusing

From
"Bryan White"
Date:
> It's not strictly a "tuple" until it's been loaded.

I guess that depends on your definition of 'tuple'.  Are the rows returned
by a select statement tuples if the select is a join of multiple tables?  I
tend to think of a tuple as an ordered set of values but maybe I have it
wrong.  In any event, any one of 'tuple', 'record', or 'row' would be less
confusing than 'line'.

> Not so with COPY.  The record separator is hard-coded
> to be a newline: the field separator can be set at
> runtime, but the record separator cannot.  That would
> be a nice feature to have, though.

The record separator is hard-coded, but it may also occur in the data.  When
it does, it is escaped, but my text editor does not know that and still
counts it as a line break.  That is why it is confusing for the current error
message to refer to a line number.  I can find the offending record by line
or by tuple/record/row number; it would just help if the error message were
clear about which one it meant.
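
To make the difference concrete, here is a minimal sketch in Python that maps
a 1-based record number to the physical lines it occupies. It assumes COPY
text-format data in which a newline inside a field is written as a backslash
followed by a real newline, so a physical line ending in an odd number of
backslashes continues on the next line. The function name record_line_span
and the sample data are hypothetical, made up for this example.

```python
def record_line_span(lines, record_no):
    """Return (first_line, last_line), both 1-based, for a 1-based record.

    Assumption: a physical line that ends in an odd number of backslashes
    has an escaped newline, so the record continues on the next line.
    """
    current = 0   # number of the record being read
    start = None  # physical line where the current record began
    for lineno, line in enumerate(lines, start=1):
        if start is None:
            current += 1
            start = lineno
        text = line.rstrip("\n")
        trailing = len(text) - len(text.rstrip("\\"))
        if trailing % 2 == 0:  # newline is not escaped: the record ends here
            if current == record_no:
                return start, lineno
            start = None
    raise ValueError("record %d not found" % record_no)


# Hypothetical sample: record 2 contains an escaped newline, so it spans
# physical lines 2 and 3, and record 3 sits on physical line 4.
sample = [
    "1\tfirst\n",
    "2\tsecond line one\\\n",
    "still record two\n",
    "3\tthird\n",
]
print(record_line_span(sample, 3))
```

A text editor would call the last row "line 4", while the error message would
call it 3, which is exactly the mismatch described above.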

> Various people have wished for an import application
> with more intelligence than COPY now has.  No doubt
> much of this could be achieved simply by building
> extra features into COPY.

The existing functionality serves my needs.  I just find the message
confusing and think a minor change in wording would make it less so.



Re: COPY Error Message is Confusing

From
Tom Lane
Date:
"Bryan White" <bryan@arcamax.com> writes:
>> It's not strictly a "tuple" until it's been loaded.

> I guess that depends on your definition of 'tuple'.  Are the rows returned
> by a select statement tuples if the select is a join of multiple tables?  I
> tend to think of a tuple as an ordered set of values but maybe I have it
> wrong.  In any event, any one of 'tuple', 'record', or 'row' would be less
> confusing than 'line'.

I agree that 'line' seems confusing in the presence of escaped newlines.

I prefer 'row' or possibly 'record' to 'tuple', however.  'tuple'
strikes me as unnecessarily jargon-ish in this context.

            regards, tom lane