
Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) - Mailing list pgsql-hackers

From	Damir Belyalov
Subject	Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date	March 7, 2023 11:35:32
Msg-id	CALH1LguAEsoTYJTCsXNB-7z2Hu9UGEpsXA4kj0FOTmoP=6Wp3Q@mail.gmail.com
In response to	Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) (torikoshia <torikoshia@oss.nttdata.com>)
Responses	Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
List	pgsql-hackers


> FWIW, Greenplum has a similar construct (but which also logs the errors in the
> db) where data type errors are skipped as long as the number of errors doesn't
> exceed a reject limit. If the reject limit is reached then the COPY fails:
>
> LOG ERRORS [ SEGMENT REJECT LIMIT <count> [ ROWS | PERCENT ]]
>
> IIRC the gist of this was to catch when the user copies the wrong input data or
> plain has a broken file. Rather than finding out after copying n rows which
> are likely to be garbage the process can be restarted.

I think this is a matter for discussion. It is the same question as: "Where should errors be logged: to separate files or to the system log file?"

IMO it's better for users to log a short, detailed error message to the system log file and not output errors to the terminal.
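For reference, the reject-limit behavior described in the quoted text can be sketched in plain C roughly as follows. This is only an illustration under my own assumptions, not code from the patch or from Greenplum: the helper names `parse_int_field` and `copy_with_reject_limit` are hypothetical, rows are modeled as single integer fields, and the load fails as soon as the number of skipped rows reaches the limit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: try to parse one input field as an integer value. */
static bool
parse_int_field(const char *field, long *out)
{
    char *end;

    errno = 0;
    *out = strtol(field, &end, 10);
    /* Reject empty input, trailing garbage, and out-of-range values. */
    return end != field && *end == '\0' && errno != ERANGE;
}

/*
 * Hypothetical loader: count rows that parse, skip rows that do not.
 * Returns the number of rows loaded, or -1 as soon as the number of
 * skipped rows reaches reject_limit (the whole load then fails).
 */
static int64_t
copy_with_reject_limit(const char *const *rows, size_t nrows,
                       int64_t reject_limit, int64_t *ignored_errors)
{
    int64_t processed = 0;

    assert(reject_limit > 0);
    *ignored_errors = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        long value;

        if (parse_int_field(rows[i], &value))
        {
            processed++;        /* a real loader would store the row here */
            continue;
        }

        /* Data type error: skip the row, but give up at the limit. */
        if (++(*ignored_errors) >= reject_limit)
            return -1;
    }

    return processed;
}
```

With the rows "1", "x", "3", "4y", "5" and a limit of 3, the two bad rows are skipped and three rows are loaded; with a limit of 2 the load fails at the second bad row.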

> This version of the patch has a compiler error in the error message:

Yes, corrected it. Changed "ignored_errors" to int64 because "processed" (used for counting copy rows) is int64.

> I felt just logging "Error: %ld" would make people wonder about the meaning of
> the %ld. Logging something like "Error: %ld data type errors were found"
> might be clearer.

Thanks. For clarity, I changed the message to: "Errors were found: %".
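As a side note on the int64 counter: `%ld` is only correct where `long` is 64 bits wide, so a 64-bit counter needs a matching format specifier. A minimal standalone sketch in standard C, using `PRId64` from `<inttypes.h>`, is below. The helper name `format_error_summary` is hypothetical and this is not the code in the attached patch.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the summary message for a 64-bit error
 * counter. PRId64 expands to the right conversion for int64_t on the
 * current platform, which a hard-coded "%ld" does not guarantee.
 * Returns the value of snprintf().
 */
static int
format_error_summary(char *buf, size_t buflen, int64_t ignored_errors)
{
    return snprintf(buf, buflen,
                    "Errors were found: %" PRId64, ignored_errors);
}
```

Calling it with a value above the 32-bit range, such as 5000000000, is a quick way to check that the counter is not truncated on platforms where `long` is 32 bits.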

Regards, Damir Belyalov

Postgres Professional

Attachment

v3-0001-Add-COPY-option-IGNORE_DATATYPE_ERRORS.patch
