Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row - Mailing list pgsql-hackers
From | jian he |
---|---|
Subject | Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row |
Date | |
Msg-id | CACJufxF6_YwAboiCaVYLRtNpO4kGbXqkXzH_7W=pUvrNXK8WuQ@mail.gmail.com Whole thread Raw |
In response to | Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row (Masahiko Sawada <sawada.mshk@gmail.com>) |
List | pgsql-hackers |
On Sat, Apr 5, 2025 at 5:33 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Apr 4, 2025 at 4:55 AM jian he <jian.universality@gmail.com> wrote: > > > > On Tue, Mar 25, 2025 at 2:31 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > 2) Here in error we say column c1 violates not-null constraint and in > > > the context we show column c2, should the context also display c2 > > > column: > > > postgres=# create table t3(c1 int not null, c2 int, check (c1 > 10)); > > > CREATE TABLE > > > postgres=# COPY t3 FROM STDIN WITH (on_error set_to_null); > > > Enter data to be copied followed by a newline. > > > End with a backslash and a period on a line by itself, or an EOF signal. > > > >> a b > > > >> \. > > > ERROR: null value in column "c1" of relation "t3" violates not-null constraint > > > DETAIL: Failing row contains (null, null). > > > CONTEXT: COPY t3, line 1, column c2: "b" > > > > > > > It took me a while to figure out why. > > with the attached, now the error message becomes: > > > > ERROR: null value in column "c1" of relation "t3" violates not-null constraint > > DETAIL: Failing row contains (null, null). > > CONTEXT: COPY t3, line 1: "a,b" > > > > while at it, > > (on_error set_to_null, log_verbosity verbose) > > error message CONTEXT will only emit out relation name, > > this aligns with (on_error ignore, log_verbosity verbose). > > > > one of the message out example: > > +NOTICE: column "b" was set to null due to data type incompatibility at line 2 > > +CONTEXT: COPY t_on_error_null > > > > > > > > > 3) typo becomen should be become: > > > null will becomen reserved to non-reserved > > fixed. > > > > > 4) There is a whitespace error while applying patch > > > Applying: COPY (on_error set_to_null) > > > .git/rebase-apply/patch:39: trailing whitespace. > > > a <literal>NOTICE</literal> message indicating the number of rows > > > warning: 1 line adds whitespace errors. > > fixed. > > I've reviewed the v15 patch and here are some comments: > > How about renaming the new option value to 'set_null"? The 'to' in the > value name seems redundant to me. > > --- > + COPY_ON_ERROR_NULL, /* set error field to null */ > > I think it's better to rename COPY_ON_ERROR_SET_TO_NULL (or > COPY_ON_ERROR_SET_NULL if we change the option value name) for > consistency with the value name. > > --- > + else if (cstate->opts.on_error == COPY_ON_ERROR_NULL) > + ereport(NOTICE, > + errmsg_plural("invalid values > in %" PRIu64 " row was replaced with null", > + > "invalid values in %" PRIu64 " rows were replaced with null", > + > cstate->num_errors, > + > cstate->num_errors)); > > How about adding "due to data type incompatibility" at the end of the message? > > --- > + ereport(NOTICE, > + errmsg("column > \"%s\" was set to null due to data type incompatibility at line %" > PRIu64 "", > + > cstate->cur_attname, > + > cstate->cur_lineno)); > > Similar to the IGNORE case, we can show the data in question in the message. > > --- > + else > + ereport(ERROR, > + > errcode(ERRCODE_NOT_NULL_VIOLATION), > + errmsg("domain %s does > not allow null values", format_type_be(typioparams[m])), > + errdatatype(typioparams[m])); > > If domain data type is the sole case where not to accept NULL, can we > check it beforehand to avoid calling the second > InputFunctionCallSafe() for non-domain data types? Also, if we want to > end up with an error when setting NULL to a domain type with NOT NULL, > I think we don't need to try to handle a soft error by passing > econtext to InputFunctionCallSafe(). > please check attached, hope i have addressed all the points you've mentioned. > If domain data type is the sole case where not to accept NULL, can we > check it beforehand to avoid calling the second > InputFunctionCallSafe() for non-domain data types? I doubt it. we have InputFunctionCallSafe(FmgrInfo *flinfo, char *str, Oid typioparam, int32 typmod, fmNodePtr escontext, Datum *result) { LOCAL_FCINFO(fcinfo, 3); if (str == NULL && flinfo->fn_strict) { *result = (Datum) 0; /* just return null result */ return true; } } Most of the non-domain type input functions are strict. see query result: select proname, pt.typname, proisstrict,pt.typtype from pg_type pt join pg_proc pp on pp.oid = pt.typinput where pt.typtype <> 'd' and pt.typtype <> 'p' and proisstrict is false; so the second InputFunctionCallSafe will be faster for non-domain types. before CopyFromTextLikeOneRow we don't know if this type is domain_with_constraint or not. Beforehand, we can conditionally call DomainHasConstraints to find out. but DomainHasConstraints is expensive, which may carry extra performance issues for non-domain types. but the second InputFunctionCallSafe call will not be a big issue for domain_with_constraint, because the first time domain_in call already cached related structs.
Attachment
pgsql-hackers by date: