Small fix on COPY ON_ERROR document - Mailing list pgsql-hackers

From David G. Johnston
Subject Small fix on COPY ON_ERROR document
Date
Msg-id CAKFQuwaOJXF__H_QpSieOF7SWhCDHbsw5GuoaffgZC663f6kgQ@mail.gmail.com
In response to Re: Small fix on COPY ON_ERROR document  (Yugo NAGATA <nagata@sraoss.co.jp>)
Responses Re: Small fix on COPY ON_ERROR document
List pgsql-hackers


On Sunday, January 28, 2024, Yugo NAGATA <nagata@sraoss.co.jp> wrote:
On Fri, 26 Jan 2024 08:04:45 -0700
"David G. Johnston" <david.g.johnston@gmail.com> wrote:

> On Fri, Jan 26, 2024 at 2:30 AM Yugo NAGATA <nagata@sraoss.co.jp> wrote:
>
> > On Fri, 26 Jan 2024 00:00:57 -0700
> > "David G. Johnston" <david.g.johnston@gmail.com> wrote:
> >
> > > I will need to make this tweak and probably a couple others to my own
> > > suggestions in 12 hours or so.
> > >
> >
> >
> And here is my v2.
>
> Notably, I chose to introduce the verbiage "soft error" and then define in
> the ON_ERROR clause the specific soft error that matters here - "invalid
> input syntax".

I am not sure we should use "soft error" without any explanation
because it seems to me that the meaning of words is unclear for users. 

Agreed. It needs to be added to the glossary.

 

Also, I think "invalid input syntax" is a bit ambiguous. For example,
COPY FROM raises an error when the number of input columns does not match
the table schema, but this error is not ignored by ON_ERROR even though
it seems to fall into the category of "invalid input syntax".


It is literally the error text that appears if one were not to ignore it. It isn’t a category of errors. I’m open to ideas here, but being explicit about what one actually sees in the system seemed preferable to inventing new classification terms not otherwise used.
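
To make the distinction above concrete, here is a toy Python model of the behavior under discussion. It is only a sketch, not PostgreSQL code: copy_from, INPUT_FUNCTIONS, and the on_error argument are hypothetical names, and Python's int/str stand in for per-type input functions. It assumes, as described in this thread, that a row whose value is rejected by a data type's input function can be skipped, while a wrong column count still aborts the load.

```python
# Toy model only: this is NOT PostgreSQL's implementation.
# All names here (copy_from, INPUT_FUNCTIONS, on_error) are hypothetical.

# Stand-ins for each data type's input function.
INPUT_FUNCTIONS = {"int": int, "text": str}


def copy_from(lines, column_types, on_error="stop"):
    """Parse tab-separated lines; return (loaded_rows, skipped_count)."""
    loaded, skipped = [], 0
    for lineno, line in enumerate(lines, start=1):
        fields = line.split("\t")
        # A wrong field count is raised even when on_error == "ignore".
        if len(fields) != len(column_types):
            raise RuntimeError(f"line {lineno}: wrong number of columns")
        try:
            row = [INPUT_FUNCTIONS[t](f) for t, f in zip(column_types, fields)]
        except ValueError:
            # The input function rejected a value: the only case "ignore" covers.
            if on_error == "ignore":
                skipped += 1
                continue
            raise
        loaded.append(row)
    return loaded, skipped


# The second line has a non-integer in the int column and is skipped.
rows, skipped = copy_from(["1\ta", "x\tb", "3\tc"], ["int", "text"], on_error="ignore")
```

In this model the skipped count is what a summary message at the end of the load would report, and it is only worth reporting when it is greater than zero.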
 

So, keeping consistency with the existing description, we can say:

"Specifies how to behave when encountering an error due to
 column values unacceptable to the input function of each attribute's
 data type."

Yeah, I was considering something along those lines as an option as well.  But I’d rather add that wording to the glossary.



Currently, ON_ERROR doesn't support other soft errors, so we can explain
it more simply without introducing the new concept, "soft error", to users.


Good point. It seems we should define which user-facing errors can be ignored anywhere in the system and, if we aren’t consistently leveraging these in all areas/commands, make the necessary qualifications in those specific places.

 
> I also note the log message behavior when ignore mode is chosen.  I haven't
> confirmed that it is accurate but that is readily tweaked if approved of.
>

+      An <literal>INFO</literal> level context message containing the ignored row count is
+      emitted at the end of the <command>COPY FROM</command> if at least one row was discarded.


The log level is NOTICE not INFO.

Makes sense, I hadn’t experimented. 


I think "left in a deleted state" is also unclear for users because this
explains the internal state but not how it looks from the user's view. How about
leaving the explanation "These rows will not be visible or accessible" in
the existing statement?

Just “visible” then. I don’t like an “or” there, and as tuples they are at least accessible to the system, in vacuum especially. But I expected the user to understand “as if you deleted it” as their operational concept more readily than “visible”. I think this will be read by people who haven’t read enough about MVCC to fully understand what visible means, but who know enough to run vacuum to clean up updated and deleted data as a rule.

David J.
 
