Re: [GENERAL] copy command -- foiled by pg_atoi - Mailing list pgsql-general

From Jose Soares
Subject Re: [GENERAL] copy command -- foiled by pg_atoi
Date
Msg-id 385E345B.EB292FAD@sferacarta.com
Whole thread Raw
In response to copy command -- foiled by pg_atoi  (Mike Beller <beller@tradeworx.com>)
List pgsql-general
This is also my problem. I'm getting '@!?àù§èé+*_|!&/%§¸' to load a
table with more than 23,000 rows
because I don't know in which line I have to look for the the error.
It's obvious that psql reads the input file line by line, and of course
psql knows in wich line the error
occurred. This information would be very important to correct any syntax
error in the file.

Comments?

Jose'


Mike Beller wrote:
>
> Hi
>
> I'm using postgresql on RedHat/i386 from the RH6.1 RPMs
> (Linux 2.2.5-15 PostgreSQL-6.5.2).  I've been evaluating it,
> and am quite excited about the possibilty of using this outstanding
> open-source code in my company's systems.
>
> I noticed the following and want to submit it for your consideration:
>
> When doing a 'copy' from a big text file, if there is a record, say
> half way down, which has an unparseable integer field (say 'X'
> in a column that shoudl be an integer) one gets the following error:
>
> ERROR:  pg_atoi: error in "X": can't parse "X"
>
> However, I don't get a line number reference where the error
> occurred.  This is a problem when you have half a million lines
> in your file, some of which  may contain legitimate X's as
> well as non-legit ones.  I don't even know which attribute failed.
>
> Simple example:  (imagine a million rows with 12 columns and
> you're trying to find the problem:)
>
> -------
> create table foo (x int);
> copy foo from stdin;
> 1
> 2
> 3
> X
> 4
> \.
> -----------
>
> I noticed that in pg_atoi there is this code:
>
>      if (errno)                         /* strtol must set ERANGE */
>             elog(ERROR, "pg_atoi: error reading \"%s\": %m", s);
>      if (badp && *badp && (*badp != c))
>             elog(ERROR, "pg_atoi: error in \"%s\": can\'t parse
> \"%s\"",s, badp);
>
> This is called (via int4in or some such) from copy.c:
>
>     values[i] = (Datum) (*fmgr_faddr(&in_functions[i]))(string,
> elements[i],typmod[i]);
>            /*
>           * Sanity check - by reference attributes cannot
>            * return NULL
>           */
>           if (!PointerIsValid(values[i]) &&
> !(rel->rd_att->attrs[i]->attbyval))
>                   elog(ERROR, "copy from line %d: Bad file format",
> lineno);
>
> So the lineno information is there, it's just not available when pg_atoi
> logs its error and bails out.
>
> Is it possible for pg_atoi to return an invalid pointer instead
> of completely bailing out, thus allowing the 'sanity check' code
> to print the line number?  I have no idea how many things this
> might break...
>
> Regards
>
> Mike Beller
> CTO
> Tradeworx.com
>
> ************

pgsql-general by date:

Previous
From: Jeff Hoffmann
Date:
Subject: recovering a "lost" database
Next
From: "Gurunandan R. Bhat"
Date:
Subject: Trying to understand SPI