FWIW, Greenplum has a similar construct (which also logs the errors in the db) where data type errors are skipped as long as the number of errors doesn't exceed a reject limit. If the reject limit is reached, the COPY fails:

LOG ERRORS [ SEGMENT REJECT LIMIT <count> [ ROWS | PERCENT ]]

IIRC the gist of this was to catch cases where the user copies the wrong input data or simply has a broken file. Rather than finding out only after copying n rows, which are likely to be garbage, the process can be restarted.
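To make the reject-limit idea concrete, here is a minimal standalone sketch in C. It is not Greenplum's or the patch's actual code: the `RejectState` struct and both function names are hypothetical, and the limit is treated as a plain row count (the ROWS case only).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for a load that tolerates a bounded number of bad rows. */
typedef struct RejectState
{
    int64_t processed;     /* rows loaded successfully */
    int64_t rejected;      /* rows skipped because of data type errors */
    int64_t reject_limit;  /* the load fails once this many rows are rejected */
} RejectState;

void
reject_state_init(RejectState *st, int64_t reject_limit)
{
    assert(reject_limit > 0);
    st->processed = 0;
    st->rejected = 0;
    st->reject_limit = reject_limit;
}

/*
 * Record the outcome of one input row.
 * Returns true if the load may continue, false once the reject limit
 * has been reached and the whole load should fail.
 */
bool
reject_state_record(RejectState *st, bool row_ok)
{
    if (row_ok)
    {
        st->processed++;
        return true;
    }

    st->rejected++;
    return st->rejected < st->reject_limit;
}
```

With a limit of 2, the first bad row is skipped and the second one aborts the load, so a file that is mostly garbage fails early instead of being loaded in full.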
I think this is a matter for discussion. A related question is: "Should errors be logged to separate files or to the system logfile?".
IMO it's better for users to log short but detailed error messages to the system logfile and not output errors to the terminal.
This version of the patch has a compiler error in the error message:
Yes, corrected it. Changed "ignored_errors" to int64 because "processed" (used for counting copy rows) is int64.
I felt just logging "Error: %ld" would make people wonder about the meaning of the %ld. Logging something like "Error: %ld data type errors were found" might be clearer.
Thanks. For more clarity, I changed the message to: "Errors were found: %".
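As a side note on the int64 change: "%ld" matches an int64 only on platforms where long is 64 bits wide. The sketch below shows one portable way to format such a counter in standard C using PRId64 from <inttypes.h>. The function name is hypothetical, the message wording follows the suggestion above, and none of this is taken from the patch itself.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Write a summary of skipped rows into buf.
 * Returns the number of characters snprintf would have written,
 * not counting the terminating NUL.
 */
int
format_error_summary(char *buf, size_t buflen, int64_t ignored_errors)
{
    assert(buf != NULL && buflen > 0);

    /* PRId64 expands to the correct conversion specifier for int64_t. */
    return snprintf(buf, buflen,
                    "%" PRId64 " data type errors were found",
                    ignored_errors);
}
```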