Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date
Msg-id CAD21AoCO_gkhJ4nr6v1_tMDAqGOrRk1tPJgdowvGDNveS11eEA@mail.gmail.com
Whole thread Raw
In response to Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On Mon, Jan 15, 2024 at 8:21 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
>
> On Sun, Jan 14, 2024 at 10:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Thank you for updating the patch. Here are two comments:
> >
> > ---
> > +   if (cstate->opts.save_error_to != COPY_SAVE_ERROR_TO_UNSPECIFIED &&
> > +       cstate->num_errors > 0)
> > +       ereport(WARNING,
> > +               errmsg("%zd rows were skipped due to data type incompatibility",
> > +                      cstate->num_errors));
> > +
> >     /* Done, clean up */
> >     error_context_stack = errcallback.previous;
> >
> > If a malformed input is not the last data, the context message seems odd:
> >
> > postgres(1:1769258)=# create table test (a int);
> > CREATE TABLE
> > postgres(1:1769258)=# copy test from stdin (save_error_to none);
> > Enter data to be copied followed by a newline.
> > End with a backslash and a period on a line by itself, or an EOF signal.
> > >> a
> > >> 1
> > >>
> > 2024-01-15 05:05:53.980 JST [1769258] WARNING:  1 rows were skipped
> > due to data type incompatibility
> > 2024-01-15 05:05:53.980 JST [1769258] CONTEXT:  COPY test, line 3: ""
> > COPY 1
> >
> > I think it's better to report the WARNING after resetting the
> > error_context_stack. Or is a WARNING really appropriate here? The
> > v15-0001-Make-COPY-FROM-more-error-tolerant.patch[1] uses NOTICE but
> > the v1-0001-Add-new-COPY-option-SAVE_ERROR_TO.patch[2] changes it to
> > WARNING without explanation.
>
> Thank you for noticing this.  I think NOTICE is more appropriate here.
> There is nothing to "worry" about: the user asked to ignore the errors
> and we did.  And yes, it doesn't make sense to use the last line as
> the context.  Fixed.
>
> > ---
> > +-- test missing data: should fail
> > +COPY check_ign_err FROM STDIN WITH (save_error_to none);
> > +1  {1}
> > +\.
> >
> > We might want to cover the extra data cases too.
>
> Agreed, the relevant test is added.

Thank you for updating the patch. I have one minor point:

+       if (cstate->opts.save_error_to != COPY_SAVE_ERROR_TO_UNSPECIFIED &&
+               cstate->num_errors > 0)
+               ereport(NOTICE,
+                               errmsg("%zd rows were skipped due to
data type incompatibility",
+                                          cstate->num_errors));
+

We can use errmsg_plural() instead.

I have a question about the option values; do you think we need to
have another value of SAVE_ERROR_TO option to explicitly specify the
current default behavior, i.e. not accept any error? With the v4
patch, the user needs to omit SAVE_ERROR_TO option to accept errors
during COPY FROM. If we change the default behavior in the future,
many users will be affected and probably end up changing their
applications to keep the current default behavior.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Richard Guo
Date:
Subject: Re: POC: GROUP BY optimization
Next
From: Pavel Stehule
Date:
Subject: Re: Oom on temp (un-analyzed table caused by JIT) V16.1