Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Msg-id 0DE0602F-CC12-40ED-B259-3AB91FB02C3B@yesql.se
In response to Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (Damir Belyalov <dam.bel07@gmail.com>)
Responses Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (torikoshia <torikoshia@oss.nttdata.com>)
List pgsql-hackers
> On 28 Feb 2023, at 15:28, Damir Belyalov <dam.bel07@gmail.com> wrote:

> Tested patch on all cases: CIM_SINGLE, CIM_MULTI, CIM_MULTI_CONDITION. As expected it works.
> Also added a description to copy.sgml and made a review on patch.
>
> I added 'ignored_errors' integer parameter that should be output after the option is finished.
> All errors were added to the system logfile with full detailed context. Maybe it's better to log only error message.

FWIW, Greenplum has a similar construct (which also logs the errors in the
db) where data type errors are skipped as long as the number of errors doesn't
exceed a reject limit.  If the reject limit is reached then the COPY fails:

    LOG ERRORS [ SEGMENT REJECT LIMIT <count> [ ROWS | PERCENT ]]

IIRC the gist of this was to catch when the user copies the wrong input data or
simply has a broken file.  Rather than finding out after copying n rows, which
are likely to be garbage, the process can fail early and be restarted.
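
To make the reject-limit idea concrete, here is a minimal sketch of the check in plain C.  It is not Greenplum's code and not part of the patch; the names RejectLimitKind and reject_limit_reached are hypothetical, and it assumes "reached" means the count of rejected rows is equal to or above the limit, and that PERCENT is measured against the rows processed so far:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: how the limit is expressed (ROWS or PERCENT). */
typedef enum
{
    REJECT_LIMIT_ROWS,
    REJECT_LIMIT_PERCENT
} RejectLimitKind;

/*
 * Hypothetical helper: returns true once the reject limit is reached,
 * i.e. when the load should fail instead of skipping further rows.
 * Assumes values small enough that the multiplications do not overflow.
 */
bool
reject_limit_reached(uint64_t rejected, uint64_t processed,
                     uint64_t limit, RejectLimitKind kind)
{
    if (kind == REJECT_LIMIT_ROWS)
        return rejected >= limit;

    /* PERCENT: nothing processed yet, so no percentage to compare. */
    if (processed == 0)
        return false;

    /* rejected / processed >= limit / 100, without division. */
    return rejected * 100 >= limit * processed;
}
```

A caller would run this check after each skipped row and raise an error as soon as it returns true, so a wrong or broken input file is detected early.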

This version of the patch has a compiler error in the error message:

copyfrom.c: In function ‘CopyFrom’:
copyfrom.c:1008:29: error: format ‘%ld’ expects argument of type ‘long int’, but argument 2 has type ‘uint64’ {aka ‘long long unsigned int’} [-Werror=format=]
1008 | ereport(WARNING, errmsg("Errors: %ld", cstate->ignored_errors));
     |                          ^~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~
     |                                              |
     |                                              uint64 {aka long long unsigned int}
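
The usual way around this class of warning is to cast the 64-bit value to unsigned long long and print it with %llu, which is portable regardless of how uint64 is defined on the platform.  The sketch below shows this with standard C only; format_ignored_errors is a hypothetical stand-in for the errmsg() call, and in the patch the equivalent change would presumably be to pass "Errors: %llu" with (unsigned long long) cstate->ignored_errors:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the errmsg() call in the patch: formats the
 * counter into buf and returns the number of characters written
 * (excluding the terminating NUL), as snprintf does.
 */
int
format_ignored_errors(char *buf, size_t buflen, uint64_t ignored_errors)
{
    /* The cast makes the argument type match %llu on every platform. */
    return snprintf(buf, buflen, "Errors: %llu",
                    (unsigned long long) ignored_errors);
}
```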


On that note though, it seems to me that this error message leaves a bit to be
desired with regard to the level of detail.

--
Daniel Gustafsson



