Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row - Mailing list pgsql-hackers

From Jim Jones
Subject Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
Date
Msg-id 90dc6e9d-9348-485a-b27c-7b1637f06c2e@uni-muenster.de
Whole thread Raw
In response to Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row  (jian he <jian.universality@gmail.com>)
Responses Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
List pgsql-hackers
Hi Jian

On 07.03.25 04:48, jian he wrote:
> hi.
> rebase only.


I revisited this patch today. It applies and builds cleanly, and it
works as expected.

Some tests and minor comments:

====

1) WARNING might be a better fit than NOTICE here.

postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error set_to_null, format csv);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE:  erroneous values in 3 rows were replaced with null
COPY 5
postgres=# SELECT * FROM t;
  x   |  y   |  z   
------+------+------
    1 | NULL | NULL
    2 |    1 | NULL
    3 |    2 | NULL
    4 | NULL | NULL
 NULL | NULL | NULL
(5 rows)


postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error set_to_null, format
csv, log_verbosity verbose);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE:  column "y" was set to null due to data type incompatibility at
line 1
NOTICE:  column "y" was set to null due to data type incompatibility at
line 4
NOTICE:  column "x" was set to null due to data type incompatibility at
line 5
NOTICE:  column "y" was set to null due to data type incompatibility at
line 5
NOTICE:  erroneous values in 3 rows were replaced with null
COPY 5
postgres=# SELECT * FROM t;
  x   |  y   |  z   
------+------+------
    1 | NULL | NULL
    2 |    1 | NULL
    3 |    2 | NULL
    4 | NULL | NULL
 NULL | NULL | NULL
(5 rows)


I would still leave the extra messages from "log_verbosity verbose" as
NOTICE though. What do you think?

====

2) Inconsistent terminology. Invalid values in "on_error set_to_null"
mode are names as "erroneous", but as "invalid" in "on_error stop" mode.
I don't want to get into the semantics of erroneous or invalid, but
sticking to one terminology would IMHO look better.

postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error stop, format csv,
log_verbosity verbose);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
ERROR:  invalid input syntax for type integer: "a"
CONTEXT:  COPY t, line 1, column y: "a"
postgres=# SELECT * FROM t;
 x | y | z
---+---+---
(0 rows)

====

3) same as in 1)

postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error ignore, format csv,
log_verbosity verbose);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE:  skipping row due to data type incompatibility at line 1 for
column "y": "a"
NOTICE:  skipping row due to data type incompatibility at line 4 for
column "y": "b"
NOTICE:  skipping row due to data type incompatibility at line 5 for
column "x": "a"
NOTICE:  3 rows were skipped due to data type incompatibility
COPY 2
postgres=# SELECT * FROM t;
 x | y |  z   
---+---+------
 2 | 1 | NULL
 3 | 2 | NULL
(2 rows)====

====

"on_error ignore" works well with "reject_limit #"

postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error ignore, format csv,
log_verbosity verbose, reject_limit 1);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE:  skipping row due to data type incompatibility at line 1 for
column "y": "a"
NOTICE:  skipping row due to data type incompatibility at line 4 for
column "y": "b"
ERROR:  skipped more than REJECT_LIMIT (1) rows due to data type
incompatibility
CONTEXT:  COPY t, line 4, column y: "b"
postgres=# SELECT * FROM t;
 x | y | z
---+---+---
(0 rows)

best regards, Jim



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Add an option to skip loading missing publication to avoid logical replication failure
Next
From: Daniil Davydov
Date:
Subject: Re: Accessing an invalid pointer in BufferManagerRelation structure