Thread: COPY Error Message is Confusing
I just spent the morning chasing down a small data corruption. It showed up when loading a database from the output of a dump. The error message was:

    copy: line 8590351, Bad date external representation '04-0| '

I suggest this be changed to:

    copy: input tuple 8590351, Bad date external representation '04-0| '

After investigating this it turns out the number reported is a 1-based input record number. Referring to it as a line number is very confusing because records may span line boundaries. The following other interpretations are credible:

    A line number in the dump file
    A line number relative to the start of the COPY.

It would also be useful to report the name of the table being copied to. It would be really useful if it would output the offending input line(s) content though that might have security related issues.

---------
Bryan White, ArcaMax.com, VP of Technology
This email represents the consensus opinion of the many voices in my head.

--- Bryan White <bryan@arcamax.com> wrote:
> I suggest this be changed to:
>
> copy: input tuple 8590351, Bad date external representation '04-0| '

It's not strictly a "tuple" until it's been loaded.

> After investigating this it turns out the number reported is a 1-based input
> record number. Referring to it as a line number is very confusing because
> records may span line boundaries.

Not so with COPY. The record separator is hard-coded to be a newline: the field separator can be set at runtime, but the record separator cannot. That would be a nice feature to have, though.

> It would also be useful to report the name of the table being copied to. It
> would be really useful if it would output the offending input line(s)
> content though that might have security related issues.

Various people have wished for an import application with more intelligence than COPY now has. No doubt much of this could be achieved simply by building extra features into COPY. With about three more years of study, I might have the competency to attempt that myself. In the meantime, is anyone else volunteering? :-)

> It's not strictly a "tuple" until it's been loaded.

I guess that depends on your definition of 'tuple'. Are the rows returned by a select statement tuples if the select is a join of multiple tables? I tend to think of a tuple as an ordered set of values but maybe I have it wrong. In any event, any one of 'tuple', 'record', or 'row' would be less confusing than 'line'.

> Not so with COPY. The record separator is hard-coded
> to be a newline: the field separator can be set at
> runtime, but the record separator cannot. That would
> be a nice feature to have, though.

The record separator is hard-coded, but it may occur in the data. If it occurs in the data it will be escaped, but this fact eludes my text editor, which still counts the escaped newline as the end of a line. That is why the current error message's reference to a line number is confusing. I can find the offending record by line or by tuple/record/row number; it just would help if the error message was clear about which one it meant.

> Various people have wished for an import application
> with more intelligence than COPY now has. No doubt
> much of this could be achieved simply by building
> extra features into COPY.

The existing functionality serves my needs. I just find the message confusing and think a minor change in wording would make it less so.

---------
Bryan White, ArcaMax.com, VP of Technology
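
To make the line-versus-record distinction above concrete, here is a minimal Python sketch that walks COPY text input and reports, for each logical record, the physical line on which it starts. It is not how COPY itself counts. It assumes, as described above, that a newline inside the data shows up as a backslash immediately followed by a real newline, so a physical line ending in an odd number of backslashes continues on the next line. The function name `iter_copy_records` is hypothetical, and the sketch does not treat the `\.` end-of-data marker specially.

```python
def iter_copy_records(lines):
    """Yield (record_no, first_line_no, text) for COPY text-format input.

    Assumption: a physical line ending in an odd number of backslashes
    carries an escaped newline, so the record continues on the next line.
    Both counters are 1-based.
    """
    record_no = 0
    buffer = []
    first_line_no = None
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if first_line_no is None:
            first_line_no = line_no
        buffer.append(line)
        # Count the backslashes at the very end of this physical line.
        trailing = len(line) - len(line.rstrip("\\"))
        if trailing % 2 == 1:
            continue  # escaped newline: keep reading the same record
        record_no += 1
        yield record_no, first_line_no, "\n".join(buffer)
        buffer, first_line_no = [], None
    if buffer:
        # Input ended in the middle of a record; report what was read.
        yield record_no + 1, first_line_no, "\n".join(buffer)


if __name__ == "__main__":
    sample = ["1\tfoo\n", "2\tbar\\\n", "baz\n", "3\tqux\n"]
    for rec_no, line_no, _text in iter_copy_records(sample):
        print(f"record {rec_no} starts at physical line {line_no}")
```

In the sample input the second record spans two physical lines, so the third record starts on physical line 4. A text editor's "go to line" and a 1-based record count therefore point at different places once any record contains an escaped newline.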

"Bryan White" <bryan@arcamax.com> writes:
>> It's not strictly a "tuple" until it's been loaded.

> I guess that depends on your definition of 'tuple'. Are the rows returned
> by a select statement tuples if the select is a join of multiple tables? I
> tend to think of a tuple as an ordered set of values but maybe I have it
> wrong. In any event, any one of 'tuple', 'record', or 'row' would be less
> confusing than 'line'.

I agree that 'line' seems confusing in the presence of escaped newlines. I prefer 'row' or possibly 'record' to 'tuple', however. 'tuple' strikes me as unnecessarily jargon-ish in this context.

regards, tom lane