Thread: How to determine offending column for insert exceptions

How to determine offending column for insert exceptions

From
Shawn Gennaria
Date:
Hi all,

I'm attempting to parse a data set of very many columns from numerous CSVs into postgres so I can work with them more easily.  To this end, I've created some dynamic queries for table creation, copying from CSVs into a temp table, and then inserting the data to a final table with appropriate data types assigned to each field.  The majority of the data can fit into integer fields, but occasionally I hit some entries that need to be text or bigint or floats.  Therefore my dynamic queries fail with 'integer out of range' errors and such.  Unfortunately, sometimes this happens on a file with thousands of columns, and I'd like to easily figure out which column the erroneous input belongs to without having to manually scour through it.  At this point, the data has already been copied into a temp table, so the query producing these errors looks like:

INSERT INTO final_table
SELECT a::int, b::int FROM temp_table

temp_table contains all text fields (since COPY points there and I'd rather not debug at that stage), so I'm trying to coerce them to more appropriate data types with this insert statement.

From this, I'd get an error with SQLSTATE like 22003 and SQLERRM like 'value "2156947514" is out of range for type integer'.  I'd like to be able to handle the exception gracefully and modify the data type of the appropriate column, but I don't know how to determine which column contains this data.

I hope this is possible.

Thanks!
sg

Re: How to determine offending column for insert exceptions

From
Adrian Klaver
Date:
On 04/21/2015 07:39 AM, Shawn Gennaria wrote:
> Hi all,
>
> I'm attempting to parse a data set of very many columns from numerous
> CSVs into postgres so I can work with them more easily.  To this end,
> I've created some dynamic queries for table creation, copying from CSVs
> into a temp table, and then inserting the data to a final table with
> appropriate data types assigned to each field.  The majority of the data
> can fit into integer fields, but occasionally I hit some entries that
> need to be text or bigint or floats.  Therefore my dynamic queries fail
> with 'integer out of range' errors and such.  Unfortunately, sometimes
> this happens on a file with thousands of columns, and I'd like to easily
> figure out which column the erroneous input belongs to without having to
> manually scour through it.  At this point, the data has already been
> copied into a temp table, so the query producing these errors looks like:
>
> INSERT INTO final_table
> SELECT a::int, b::int FROM temp_table
>
> temp_table contains all text fields (since COPY points there and I'd
> rather not debug at that stage), so I'm trying to coerce them to more
> appropriate data types with this insert statement.
>
>  From this, I'd get an error with SQLSTATE like 22003 and SQLERRM like
> 'value "2156947514" is out of range for type integer'.  I'd like to be
> able to handle the exception gracefully and modify the data type of the
> appropriate column, but I don't know how to determine which column
> contains this data.

Not sure, but some more information might help:

1) What Postgres version?

2) You mention you are doing this dynamically.
Where is that happening?

In a stored function?
If so what language?

In an external program?

3) How are you handling the exception now?

>
> I hope this is possible.
>
> Thanks!
> sg


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: How to determine offending column for insert exceptions

From
Shawn Gennaria
Date:
1) 9.4

2) Everything is contained in a single stored plpgsql function with multiple transaction blocks to allow me to debug each stage of the process.

3) I'm currently handling exceptions with generic 'WHEN OTHERS THEN' statements to spit out the SQLSTATE and SQLERRM values to help me figure out what's going on.  I intend to focus this with statements that catch the particular errors that would arise from trying to incorrectly coerce my text data into other data types.


I'm kind of surprised I haven't been able to find answers to this in google, though I did see someone else asked a similar question on stackoverflow 6 months ago but never got an answer.  The best thing I can think of right now is to query pg_attributes to find the column names for the temp_table I'm dealing with and then loop through each one attempting to find a hit on the value that I can see embedded in SQLERRM. 

On Tue, Apr 21, 2015 at 10:59 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 04/21/2015 07:39 AM, Shawn Gennaria wrote:
Hi all,

I'm attempting to parse a data set of very many columns from numerous
CSVs into postgres so I can work with them more easily.  To this end,
I've created some dynamic queries for table creation, copying from CSVs
into a temp table, and then inserting the data to a final table with
appropriate data types assigned to each field.  The majority of the data
can fit into integer fields, but occasionally I hit some entries that
need to be text or bigint or floats.  Therefore my dynamic queries fail
with 'integer out of range' errors and such.  Unfortunately, sometimes
this happens on a file with thousands of columns, and I'd like to easily
figure out which column the erroneous input belongs to without having to
manually scour through it.  At this point, the data has already been
copied into a temp table, so the query producing these errors looks like:

INSERT INTO final_table
SELECT a::int, b::int FROM temp_table

temp_table contains all text fields (since COPY points there and I'd
rather not debug at that stage), so I'm trying to coerce them to more
appropriate data types with this insert statement.

 From this, I'd get an error with SQLSTATE like 22003 and SQLERRM like
'value "2156947514" is out of range for type integer'.  I'd like to be
able to handle the exception gracefully and modify the data type of the
appropriate column, but I don't know how to determine which column
contains this data.

Not sure, but some more information might help:

1) What Postgres version?

2) You mention you are doing this dynamically.
Where is that happening?

In a stored function?
If so what language?

In an external program?

3) How are you handling the exception now?



I hope this is possible.

Thanks!
sg


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: How to determine offending column for insert exceptions

From
Adrian Klaver
Date:
On 04/21/2015 08:07 AM, Shawn Gennaria wrote:
> 1) 9.4
>
> 2) Everything is contained in a single stored plpgsql function with
> multiple transaction blocks to allow me to debug each stage of the process.
>
> 3) I'm currently handling exceptions with generic 'WHEN OTHERS THEN'
> statements to spit out the SQLSTATE and SQLERRM values to help me figure
> out what's going on.  I intend to focus this with statements that catch
> the particular errors that would arise from trying to incorrectly coerce
> my text data into other data types.
From psql.

test=# \d int_test         Table "public.int_test"  Column  |       Type        | Modifiers
----------+-------------------+----------- int_fld  | integer           | var_fld  | character varying | test_col |
integer          |
 



test=# insert into int_test values (1, 'test', '2015-04-21'::date);
ERROR:  column "test_col" is of type integer but expression is of type date
LINE 1: insert into int_test values (1, 'test', '2015-04-21'::date);                                                ^
HINT:  You will need to rewrite or cast the expression.

So the information is there.

The choices would seem to be:

1) Add a bare RAISE to your EXCEPTION block to get the original error to 
appear.


http://www.postgresql.org/docs/9.4/interactive/plpgsql-errors-and-messages.html


See the thread below for a similar example;

http://www.postgresql.org/message-id/CAKFQuwbeQBOFPOn1bk9P3uGujMPW13f+hsjjR3D8mJ=jtVAD+A@mail.gmail.com


2) Or from here:

http://www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING

see 40.6.6.1. Obtaining Information About an Error.

>
> I'm kind of surprised I haven't been able to find answers to this in
> google, though I did see someone else asked a similar question on
> stackoverflow 6 months ago but never got an answer.  The best thing I
> can think of right now is to query pg_attributes to find the column
> names for the temp_table I'm dealing with and then loop through each one
> attempting to find a hit on the value that I can see embedded in SQLERRM.
>
> On Tue, Apr 21, 2015 at 10:59 AM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 04/21/2015 07:39 AM, Shawn Gennaria wrote:
>
>         Hi all,
>
>         I'm attempting to parse a data set of very many columns from
>         numerous
>         CSVs into postgres so I can work with them more easily.  To this
>         end,
>         I've created some dynamic queries for table creation, copying
>         from CSVs
>         into a temp table, and then inserting the data to a final table with
>         appropriate data types assigned to each field.  The majority of
>         the data
>         can fit into integer fields, but occasionally I hit some entries
>         that
>         need to be text or bigint or floats.  Therefore my dynamic
>         queries fail
>         with 'integer out of range' errors and such.  Unfortunately,
>         sometimes
>         this happens on a file with thousands of columns, and I'd like
>         to easily
>         figure out which column the erroneous input belongs to without
>         having to
>         manually scour through it.  At this point, the data has already been
>         copied into a temp table, so the query producing these errors
>         looks like:
>
>         INSERT INTO final_table
>         SELECT a::int, b::int FROM temp_table
>
>         temp_table contains all text fields (since COPY points there and I'd
>         rather not debug at that stage), so I'm trying to coerce them to
>         more
>         appropriate data types with this insert statement.
>
>           From this, I'd get an error with SQLSTATE like 22003 and
>         SQLERRM like
>         'value "2156947514 <tel:2156947514>" is out of range for type
>         integer'.  I'd like to be
>         able to handle the exception gracefully and modify the data type
>         of the
>         appropriate column, but I don't know how to determine which column
>         contains this data.
>
>
>     Not sure, but some more information might help:
>
>     1) What Postgres version?
>
>     2) You mention you are doing this dynamically.
>     Where is that happening?
>
>     In a stored function?
>     If so what language?
>
>     In an external program?
>
>     3) How are you handling the exception now?
>
>
>
>         I hope this is possible.
>
>         Thanks!
>         sg
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: How to determine offending column for insert exceptions

From
Shawn Gennaria
Date:
OK, I'm looking at
www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES
which I completely missed before.  This sounds like my answer, but it's not returning anything when I try to extract the COLUMN_NAME.  In desperation, I tried grabbing every value in that table, and it just repeats the same info I already had via SQLSTATE and SQLERRM.

Here's what my overall implementation looks like:

DECLARE
  col_name text;
  sql_state text;
  ...
BEGIN
  FOR rec IN (
    SELECT 1 file per row: info about each of my csv files to dynamically build the tables, copy and insert data
  ) LOOP
    ...
    QRY_INSERT := 'INSERT INTO rec.final_table SELECT rec.inserts FROM rec.temp_table'; -- rec.inserts is text formed like 'a::int, b::int,...'
    BEGIN
      EXECUTE QRY_INSERT;
    EXCEPTION
      WHEN OTHERS THEN
        GET STACKED DIAGNOSTICS col_name = COLUMN_NAME, sql_state = RETURNED_SQLSTATE ...etc...
        RAISE INFO '%, %, ......', col_name, sql_state, ......;
    END;
  END LOOP;
END;

The only values I get back are:
RETURNED_SQLSTATE = 22003
MESSAGE_TEXT = 'value "2156947514" is out of range for type integer
PG_EXCEPTION_CONTEXT = SQL statement

The rest are null.  I'm confused why your error message was more informative.

I tried leaving everything else out of the exception and just using a bare RAISE like you said, but that just put the same exact error message out to my Messages tab in pgAdmin-- no mention of any columns.



On Tue, Apr 21, 2015 at 11:38 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 04/21/2015 08:07 AM, Shawn Gennaria wrote:
1) 9.4

2) Everything is contained in a single stored plpgsql function with
multiple transaction blocks to allow me to debug each stage of the process.

3) I'm currently handling exceptions with generic 'WHEN OTHERS THEN'
statements to spit out the SQLSTATE and SQLERRM values to help me figure
out what's going on.  I intend to focus this with statements that catch
the particular errors that would arise from trying to incorrectly coerce
my text data into other data types.

>From psql.

test=# \d int_test
         Table "public.int_test"
  Column  |       Type        | Modifiers
----------+-------------------+-----------
 int_fld  | integer           |
 var_fld  | character varying |
 test_col | integer           |



test=# insert into int_test values (1, 'test', '2015-04-21'::date);
ERROR:  column "test_col" is of type integer but expression is of type date
LINE 1: insert into int_test values (1, 'test', '2015-04-21'::date);
                                                ^
HINT:  You will need to rewrite or cast the expression.

So the information is there.

The choices would seem to be:

1) Add a bare RAISE to your EXCEPTION block to get the original error to appear.


http://www.postgresql.org/docs/9.4/interactive/plpgsql-errors-and-messages.html


See the thread below for a similar example;

http://www.postgresql.org/message-id/CAKFQuwbeQBOFPOn1bk9P3uGujMPW13f+hsjjR3D8mJ=jtVAD+A@mail.gmail.com


2) Or from here:

http://www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING

see 40.6.6.1. Obtaining Information About an Error.


I'm kind of surprised I haven't been able to find answers to this in
google, though I did see someone else asked a similar question on
stackoverflow 6 months ago but never got an answer.  The best thing I
can think of right now is to query pg_attributes to find the column
names for the temp_table I'm dealing with and then loop through each one
attempting to find a hit on the value that I can see embedded in SQLERRM.

On Tue, Apr 21, 2015 at 10:59 AM, Adrian Klaver
<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

    On 04/21/2015 07:39 AM, Shawn Gennaria wrote:

        Hi all,

        I'm attempting to parse a data set of very many columns from
        numerous
        CSVs into postgres so I can work with them more easily.  To this
        end,
        I've created some dynamic queries for table creation, copying
        from CSVs
        into a temp table, and then inserting the data to a final table with
        appropriate data types assigned to each field.  The majority of
        the data
        can fit into integer fields, but occasionally I hit some entries
        that
        need to be text or bigint or floats.  Therefore my dynamic
        queries fail
        with 'integer out of range' errors and such.  Unfortunately,
        sometimes
        this happens on a file with thousands of columns, and I'd like
        to easily
        figure out which column the erroneous input belongs to without
        having to
        manually scour through it.  At this point, the data has already been
        copied into a temp table, so the query producing these errors
        looks like:

        INSERT INTO final_table
        SELECT a::int, b::int FROM temp_table

        temp_table contains all text fields (since COPY points there and I'd
        rather not debug at that stage), so I'm trying to coerce them to
        more
        appropriate data types with this insert statement.

          From this, I'd get an error with SQLSTATE like 22003 and
        SQLERRM like
        'value "2156947514 <tel:2156947514>" is out of range for type
        integer'.  I'd like to be
        able to handle the exception gracefully and modify the data type
        of the
        appropriate column, but I don't know how to determine which column
        contains this data.


    Not sure, but some more information might help:

    1) What Postgres version?

    2) You mention you are doing this dynamically.
    Where is that happening?

    In a stored function?
    If so what language?

    In an external program?

    3) How are you handling the exception now?



        I hope this is possible.

        Thanks!
        sg



    --
    Adrian Klaver
    adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>




--
Adrian Klaver
adrian.klaver@aklaver.com

Re: How to determine offending column for insert exceptions

From
Alvaro Herrera
Date:
Shawn Gennaria wrote:
> OK, I'm looking at
> www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES
> which I completely missed before.  This sounds like my answer, but it's not
> returning anything when I try to extract the COLUMN_NAME.

As far as I recall, COLUMN_NAME is new in 9.4.  If you're trying with an
earlier version, you can't get that info other than by parsing the error
message string.


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: How to determine offending column for insert exceptions

From
Shawn Gennaria
Date:
<div dir="ltr">Glad to hear it's here just in time.  I am using 9.4, though, so I wish I could figure out why it's
returningNULL when I use it.  And the error message string doesn't contain any column name to parse in my output.<br
/></div><divclass="gmail_extra"><br /><div class="gmail_quote">On Tue, Apr 21, 2015 at 12:46 PM, Alvaro Herrera <span
dir="ltr"><<ahref="mailto:alvherre@2ndquadrant.com" target="_blank">alvherre@2ndquadrant.com</a>></span>
wrote:<br/><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">ShawnGennaria wrote:<br /> > OK, I'm looking at<br /> > <a
href="http://www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES"
target="_blank">www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES</a><br
/>> which I completely missed before.  This sounds like my answer, but it's not<br /> > returning anything when I
tryto extract the COLUMN_NAME.<br /><br /></span>As far as I recall, COLUMN_NAME is new in 9.4.  If you're trying with
an<br/> earlier version, you can't get that info other than by parsing the error<br /> message string.<br /><span
class="HOEnZb"><fontcolor="#888888"><br /><br /> --<br /> Álvaro Herrera                <a
href="http://www.2ndQuadrant.com/"target="_blank">http://www.2ndQuadrant.com/</a><br /> PostgreSQL Development, 24x7
Support,Remote DBA, Training & Services<br /></font></span></blockquote></div><br /></div> 

Re: How to determine offending column for insert exceptions

From
"David G. Johnston"
Date:
On Tue, Apr 21, 2015 at 9:48 AM, Shawn Gennaria <sgennaria2@gmail.com> wrote:
Glad to hear it's here just in time.  I am using 9.4, though, so I wish I could figure out why it's returning NULL when I use it.  And the error message string doesn't contain any column name to parse in my output.

On Tue, Apr 21, 2015 at 12:46 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Shawn Gennaria wrote:
> OK, I'm looking at
> www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES
> which I completely missed before.  This sounds like my answer, but it's not
> returning anything when I try to extract the COLUMN_NAME.

As far as I recall, COLUMN_NAME is new in 9.4.  If you're trying with an
earlier version, you can't get that info other than by parsing the error
message string.


​Adrian provided an example of trying to place a valid date into a non-date column.  The data itself was correct but the place it is being stored to is invalid and can be reported explicitly.

Shawn provided an example of trying to create an integer using an invalid value.  Type input errors are not column specific and so the error - which is basically a parse error - does not provide column information.

There is likely some more experimenting here, and maybe room to attached optional contextual markers to make better error messages, but fundamentally these are two different kinds of errors which cannot be generalized over to the extent of saying "errors should provide column name information"...because what would you do for { SELECT 'a'::int }?

David J.

Re: How to determine offending column for insert exceptions

From
Seth Gordon
Date:
value "2156947514" is out of range for type integer

I suspect that you are using a field of type INTEGER to store a North American telephone number with its area code. Unfortunately, the maximum value of an INTEGER field is 2147483647.

If you must use a numeric type for this purpose, use NUMERIC(10), which will store any ten-digit number.

On Tue, Apr 21, 2015 at 1:04 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Tue, Apr 21, 2015 at 9:48 AM, Shawn Gennaria <sgennaria2@gmail.com> wrote:
Glad to hear it's here just in time.  I am using 9.4, though, so I wish I could figure out why it's returning NULL when I use it.  And the error message string doesn't contain any column name to parse in my output.

On Tue, Apr 21, 2015 at 12:46 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Shawn Gennaria wrote:
> OK, I'm looking at
> www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES
> which I completely missed before.  This sounds like my answer, but it's not
> returning anything when I try to extract the COLUMN_NAME.

As far as I recall, COLUMN_NAME is new in 9.4.  If you're trying with an
earlier version, you can't get that info other than by parsing the error
message string.


​Adrian provided an example of trying to place a valid date into a non-date column.  The data itself was correct but the place it is being stored to is invalid and can be reported explicitly.

Shawn provided an example of trying to create an integer using an invalid value.  Type input errors are not column specific and so the error - which is basically a parse error - does not provide column information.

There is likely some more experimenting here, and maybe room to attached optional contextual markers to make better error messages, but fundamentally these are two different kinds of errors which cannot be generalized over to the extent of saying "errors should provide column name information"...because what would you do for { SELECT 'a'::int }?

David J.

Re: How to determine offending column for insert exceptions

From
s d
Date:
I'm not sure, but I have a guess, why the column_name is null.
You wrote your queries look like this:

INSERT INTO final_table
SELECT a::int, b::int FROM temp_table

The error comes from the result expression of the select statement where the source column can't be determined exactly.

Regards,
Sándor Daku


On 21 April 2015 at 18:48, Shawn Gennaria <sgennaria2@gmail.com> wrote:
Glad to hear it's here just in time.  I am using 9.4, though, so I wish I could figure out why it's returning NULL when I use it.  And the error message string doesn't contain any column name to parse in my output.

On Tue, Apr 21, 2015 at 12:46 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Shawn Gennaria wrote:
> OK, I'm looking at
> www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES
> which I completely missed before.  This sounds like my answer, but it's not
> returning anything when I try to extract the COLUMN_NAME.

As far as I recall, COLUMN_NAME is new in 9.4.  If you're trying with an
earlier version, you can't get that info other than by parsing the error
message string.


--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: How to determine offending column for insert exceptions

From
Shawn Gennaria
Date:
David,

Thanks for the insight.  Indeed, I could not replicate Adrian's error message by substituting his date example in my code.  It just gives me a generic 'cannot cast type date to integer' with no mention of a column name.  I think I better understand how the context affects the ability to provide certain information in error messages.

I'll attempt to solve my problem by querying pg_attribute for the columns of the table I'm dealing with and then I'll just loop over them all until I get a hit on the value returned in my error message.  It won't be pretty, but it's better than browsing thousands of columns in CSVs trying to find these pitfalls.

Thank you all for the assist!


On Tue, Apr 21, 2015 at 1:04 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Tue, Apr 21, 2015 at 9:48 AM, Shawn Gennaria <sgennaria2@gmail.com> wrote:
Glad to hear it's here just in time.  I am using 9.4, though, so I wish I could figure out why it's returning NULL when I use it.  And the error message string doesn't contain any column name to parse in my output.

On Tue, Apr 21, 2015 at 12:46 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Shawn Gennaria wrote:
> OK, I'm looking at
> www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES
> which I completely missed before.  This sounds like my answer, but it's not
> returning anything when I try to extract the COLUMN_NAME.

As far as I recall, COLUMN_NAME is new in 9.4.  If you're trying with an
earlier version, you can't get that info other than by parsing the error
message string.


​Adrian provided an example of trying to place a valid date into a non-date column.  The data itself was correct but the place it is being stored to is invalid and can be reported explicitly.

Shawn provided an example of trying to create an integer using an invalid value.  Type input errors are not column specific and so the error - which is basically a parse error - does not provide column information.

There is likely some more experimenting here, and maybe room to attached optional contextual markers to make better error messages, but fundamentally these are two different kinds of errors which cannot be generalized over to the extent of saying "errors should provide column name information"...because what would you do for { SELECT 'a'::int }?

David J.


Re: How to determine offending column for insert exceptions

From
Adrian Klaver
Date:
On 04/21/2015 09:37 AM, Shawn Gennaria wrote:
> OK, I'm looking at
> www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES
>
<http://www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES>
>
> which I completely missed before.  This sounds like my answer, but it's
> not returning anything when I try to extract the COLUMN_NAME.  In
> desperation, I tried grabbing every value in that table, and it just
> repeats the same info I already had via SQLSTATE and SQLERRM.
>
> Here's what my overall implementation looks like:
>
> DECLARE
>    col_name text;
>    sql_state text;
>    ...
> BEGIN
>    FOR rec IN (
>      SELECT 1 file per row: info about each of my csv files to
> dynamically build the tables, copy and insert data
>    ) LOOP
>      ...
>      QRY_INSERT := 'INSERT INTO rec.final_table SELECT rec.inserts FROM
> rec.temp_table'; -- rec.inserts is text formed like 'a::int, b::int,...'
>      BEGIN
>        EXECUTE QRY_INSERT;
>      EXCEPTION
>        WHEN OTHERS THEN
>          GET STACKED DIAGNOSTICS col_name = COLUMN_NAME, sql_state =
> RETURNED_SQLSTATE ...etc...
>          RAISE INFO '%, %, ......', col_name, sql_state, ......;
>      END;
>    END LOOP;
> END;
>
> The only values I get back are:
> RETURNED_SQLSTATE = 22003
> MESSAGE_TEXT = 'value "2156947514" is out of range for type integer
> PG_EXCEPTION_CONTEXT = SQL statement
>
> The rest are null.  I'm confused why your error message was more
> informative.
>
> I tried leaving everything else out of the exception and just using a
> bare RAISE like you said, but that just put the same exact error message
> out to my Messages tab in pgAdmin-- no mention of any columns.
>
>

Yeah I saw the below and did not pay enough attention to the actual 
error you posted.

"The majority of the data can fit into integer fields, but occasionally 
I hit some entries that need to be text or bigint or floats."

So as David said this is a parser error, though:

postgres@test=# \d int_test    Table "public.int_test" Column  |  Type   | Modifiers
---------+---------+----------- int_fld | integer |
       Table "public.source_tbl" Column |       Type        | Modifiers
--------+-------------------+----------- v_fld  | character varying |

postgres@test=# insert into int_test values ('2156947514'::int);
ERROR:  value "2156947514" is out of range for type integer
LINE 1: insert into int_test values ('2156947514'::int);


postgres@test=# insert into int_test select v_fld::int from source_tbl;
ERROR:  value "2156947514" is out of range for type integer

postgres@test=# select '2156947514'::int;
ERROR:  value "2156947514" is out of range for type integer
LINE 1: select '2156947514'::int;

Only part of the error in the SELECT is pushed up to the INSERT error.


The only thing I can think to do is pre-test the data in the CSV or the 
temp table for type, instead of letting the parser do it. One program I 
can point you at, assuming you are comfortable with Python,  is 
ddl-generator:

https://github.com/catherinedevlin/ddl-generator

I have used it on smaller datasets(number of columns) then you are 
working on, so I can't say how it will scale to your case.

-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: How to determine offending column for insert exceptions

From
Adrian Klaver
Date:
On 04/21/2015 10:54 AM, Shawn Gennaria wrote:
> David,
>
> Thanks for the insight.  Indeed, I could not replicate Adrian's error
> message by substituting his date example in my code.  It just gives me a
> generic 'cannot cast type date to integer' with no mention of a column
> name.  I think I better understand how the context affects the ability
> to provide certain information in error messages.

DO
$$
DECLARE    text_var1 text;    text_var2 text;    text_var3 text;
BEGIN  insert into int_test values (1, 'test', '2015-04-21'::date);  EXCEPTION WHEN OTHERS THEN    GET STACKED
DIAGNOSTICStext_var1 = MESSAGE_TEXT,                          text_var2 = PG_EXCEPTION_DETAIL,
text_var3= PG_EXCEPTION_HINT;    RAISE NOTICE '%, %, %', text_var1, text_var2, text_var3;
 
END$$;


postgres@test=# \e
NOTICE:  column "test_col" is of type integer but expression is of type 
date, , You will need to rewrite or cast the expression.
DO

>
> I'll attempt to solve my problem by querying pg_attribute for the
> columns of the table I'm dealing with and then I'll just loop over them
> all until I get a hit on the value returned in my error message.  It
> won't be pretty, but it's better than browsing thousands of columns in
> CSVs trying to find these pitfalls.
>
> Thank you all for the assist!
>
>
> On Tue, Apr 21, 2015 at 1:04 PM, David G. Johnston
> <david.g.johnston@gmail.com <mailto:david.g.johnston@gmail.com>> wrote:
>
>     On Tue, Apr 21, 2015 at 9:48 AM, Shawn Gennaria
>     <sgennaria2@gmail.com <mailto:sgennaria2@gmail.com>>wrote:
>
>         Glad to hear it's here just in time.  I am using 9.4, though, so
>         I wish I could figure out why it's returning NULL when I use
>         it.  And the error message string doesn't contain any column
>         name to parse in my output.
>
>         On Tue, Apr 21, 2015 at 12:46 PM, Alvaro Herrera
>         <alvherre@2ndquadrant.com <mailto:alvherre@2ndquadrant.com>> wrote:
>
>             Shawn Gennaria wrote:
>             > OK, I'm looking at
>
>www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES
>
<http://www.postgresql.org/docs/9.4/interactive/plpgsql-control-structures.html#PLPGSQL-EXCEPTION-DIAGNOSTICS-VALUES>
>             > which I completely missed before.  This sounds like my answer, but it's not
>             > returning anything when I try to extract the COLUMN_NAME.
>
>             As far as I recall, COLUMN_NAME is new in 9.4.  If you're
>             trying with an
>             earlier version, you can't get that info other than by
>             parsing the error
>             message string.
>
>
>     ​Adrian provided an example of trying to place a valid date into a
>     non-date column.  The data itself was correct but the place it is
>     being stored to is invalid and can be reported explicitly.
>
>     Shawn provided an example of trying to create an integer using an
>     invalid value.  Type input errors are not column specific and so the
>     error - which is basically a parse error - does not provide column
>     information.
>
>     There is likely some more experimenting here, and maybe room to
>     attached optional contextual markers to make better error messages,
>     but fundamentally these are two different kinds of errors which
>     cannot be generalized over to the extent of saying "errors should
>     provide column name information"...because what would you do for {
>     SELECT 'a'::int }?
>
>     David J.
>
>


-- 
Adrian Klaver
adrian.klaver@aklaver.com