On Fri, Mar 1, 2024 at 10:22 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> > Nice catch. When COPY_ON_ERROR_STOP is specified, we use ereport's
> > soft error mechanism. An assertion seems a good choice to validate the
> > state is what we expect. Done that way.
>
> Hmm. I am not really on board with this patch, that would generate
> one NOTICE message each time a row is incompatible in the soft error
> mode. If you have a couple of billion rows to bulk-load into the
> backend and even 0.01% of them are corrupted, you could end up with
> more than 100k log entries, and all systems should be careful about
> the log quantity generated, especially if we use the syslogger which
> could easily become a bottleneck.

Hm. I had some concerns about this too, as mentioned upthread. Thanks
a lot for illustrating it.

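
As a rough back-of-the-envelope check of the numbers above (reading "a
couple of billion" as 2 billion rows, which is only my assumption), one
NOTICE per bad row works out as follows:

```python
# Rough estimate only; 2 billion is an assumed reading of
# "a couple of billion rows".
total_rows = 2_000_000_000

# 0.01% corrupted means 1 bad row in every 10,000.
# Integer division avoids floating-point rounding.
rows_per_bad_row = 10_000

# One NOTICE would be emitted per malformed row.
notices = total_rows // rows_per_bad_row
print(notices)
```

That is comfortably above the 100k log entries mentioned.
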
> The existing ON_ERROR controls what to do on error. I think that we'd
> better control the amount of information reported with a completely
> separate option, an option even different than where to redirect
> errors (if required, which would be either the logs, the client, a
> pipe, a combination of these or even all of them).

How about an extra error_action value, ignore-with-verbose, which is
similar to ignore but also emits one NOTICE per malformed row? With
this, one can say COPY x FROM stdin (ON_ERROR ignore-with-verbose);.

Alternatively, we can think of adding a separate new option, VERBOSE,
which could be used not only for this but also for reporting other
COPY-related info/errors. With this, one can say COPY x FROM stdin
(VERBOSE on, ON_ERROR ignore);.

Another way would be a separate GUC, but -100 from me for that: it not
only increases the total number of GUCs by 1, but also might set a
wrong precedent of adding a new GUC to control command-level output.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com