copy command -- foiled by pg_atoi - Mailing list pgsql-general

From Mike Beller
Subject copy command -- foiled by pg_atoi
Date
Msg-id 385C01E1.6876466D@tradeworx.com
Whole thread Raw
List pgsql-general
Hi

I'm using postgresql on RedHat/i386 from the RH6.1 RPMs
(Linux 2.2.5-15 PostgreSQL-6.5.2).  I've been evaluating it,
and am quite excited about the possibilty of using this outstanding
open-source code in my company's systems.

I noticed the following and want to submit it for your consideration:

When doing a 'copy' from a big text file, if there is a record, say
half way down, which has an unparseable integer field (say 'X'
in a column that shoudl be an integer) one gets the following error:

ERROR:  pg_atoi: error in "X": can't parse "X"

However, I don't get a line number reference where the error
occurred.  This is a problem when you have half a million lines
in your file, some of which  may contain legitimate X's as
well as non-legit ones.  I don't even know which attribute failed.

Simple example:  (imagine a million rows with 12 columns and
you're trying to find the problem:)

-------
create table foo (x int);
copy foo from stdin;
1
2
3
X
4
\.
-----------

I noticed that in pg_atoi there is this code:

     if (errno)                         /* strtol must set ERANGE */
            elog(ERROR, "pg_atoi: error reading \"%s\": %m", s);
     if (badp && *badp && (*badp != c))
            elog(ERROR, "pg_atoi: error in \"%s\": can\'t parse
\"%s\"",s, badp);


This is called (via int4in or some such) from copy.c:

    values[i] = (Datum) (*fmgr_faddr(&in_functions[i]))(string,
elements[i],typmod[i]);
           /*
          * Sanity check - by reference attributes cannot
           * return NULL
          */
          if (!PointerIsValid(values[i]) &&
!(rel->rd_att->attrs[i]->attbyval))
                  elog(ERROR, "copy from line %d: Bad file format",
lineno);

So the lineno information is there, it's just not available when pg_atoi
logs its error and bails out.

Is it possible for pg_atoi to return an invalid pointer instead
of completely bailing out, thus allowing the 'sanity check' code
to print the line number?  I have no idea how many things this
might break...

Regards


Mike Beller
CTO
Tradeworx.com

pgsql-general by date:

Previous
From: Ed Loehr
Date:
Subject: Re: [GENERAL] how do you unsubscribe? thanks.
Next
From: Alex Guryanow
Date:
Subject: NOTICE: (transaction aborted): queries ignored until END