Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row - Mailing list pgsql-hackers
From | Kirill Reshke |
---|---|
Subject | Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row |
Date | |
Msg-id | CALdSSPgLVBHNr97pYgmVfi+YdH4bqwk9YVvbuvhJ4sCr7eEVOA@mail.gmail.com Whole thread Raw |
In response to | Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row (jian he <jian.universality@gmail.com>) |
List | pgsql-hackers |
On Tue, 19 Nov 2024, 13:52 jian he, <jian.universality@gmail.com> wrote: > > On Sat, Nov 16, 2024 at 5:55 PM Kirill Reshke <reshkekirill@gmail.com> wrote: > > > > I am attaching my v8 for reference. > > > > in your v8. > > <varlistentry> > <term><literal>REJECT_LIMIT</literal></term> > <listitem> > <para> > Specifies the maximum number of errors tolerated while converting a > column's input value to its data type, when <literal>ON_ERROR</literal> is > set to <literal>ignore</literal>. > If the input contains more erroneous rows than the specified > value, the <command>COPY</command> > command fails, even with <literal>ON_ERROR</literal> set to > <literal>ignore</literal>. > </para> > </listitem> > </varlistentry> > > then above description not meet with following example, (i think) > > create table t(a int not null); > COPY t FROM STDIN WITH (on_error set_to_null, reject_limit 2); > Enter data to be copied followed by a newline. > End with a backslash and a period on a line by itself, or an EOF signal. > >> a > >> \. > ERROR: null value in column "a" of relation "t" violates not-null constraint > DETAIL: Failing row contains (null). > CONTEXT: COPY t, line 1, column a: "a" Sure, my v8 does not helps with column level NOT NULL constraint (or other constraint) > Overall, I think > making the domain not-null align with column level not-null would be a > good thing. While this looks sane, it's actually a separate topic. Even on current HEAD we have domain not-null vs column level not-null unalignment. consider this example ``` reshke=# create table ftt2 (i int not null); CREATE TABLE reshke=# copy ftt2 from stdin with (reject_limit 1000, on_error ignore); Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself, or an EOF signal. >> \N >> \. ERROR: null value in column "i" of relation "ftt2" violates not-null constraint DETAIL: Failing row contains (null). CONTEXT: COPY ftt2, line 1: "\N" reshke=# create domain dd as int not null ; CREATE DOMAIN reshke=# create table ftt3(i dd); CREATE TABLE reshke=# copy ftt3 from stdin with (reject_limit 1000, on_error ignore); Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself, or an EOF signal. >> \N >> \. NOTICE: 1 row was skipped due to data type incompatibility COPY 0 reshke=# ``` So, if we want this, we need to start another thread and deal with REJECT_LIMIT + on_error ignore first. The ExecConstraints function is the source of the error in scenario 1. Therefore, we require something like "ExecConstraintsSafe" to accommodate the aforementioned. That is a significant change. Not sure it will be accepted by the community. > > <para> > Specifies how to behave when encountering an error converting a column's > input value into its data type. > An <replaceable class="parameter">error_action</replaceable> value of > <literal>stop</literal> means fail the command, > <literal>ignore</literal> means discard the input row and > continue with the next one, and > <literal>set_to_null</literal> means replace columns containing > erroneous input values with <literal>null</literal> and move to the > next row. > > "and move to the next row" is wrong? > I think it should be " and move to the next field". Yes, "and move to the next field" is correct.
pgsql-hackers by date: