> On 8 Nov 2023, at 19:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think an actually usable feature of this sort would involve
> copying all the failed lines to some alternate output medium,
> perhaps a second table with a TEXT column to receive the original
> data line. (Or maybe an array of text that could receive the
> broken-down field values?) Maybe we could dump the message info,
> line number, field name etc into additional columns.
I agree that the errors should be easily visible to the user in some way. The
feature is certainly interesting, especially for data warehouse type jobs
where dirty data is often ingested.
As a data point, Greenplum has this feature with additional SQL syntax to
control it:

    COPY .. LOG ERRORS SEGMENT REJECT LIMIT xyz ROWS;

LOG ERRORS instructs the database to log the faulty rows and SEGMENT REJECT
LIMIT xyz ROWS sets the limit of how many rows can be faulty before the
operation errors out. I'm not at all advocating that we should mimic this, I
just wanted to add a reference to a Postgres derivative where this has been
implemented.
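
To make the semantics concrete, here is a rough Python sketch of the
"log the faulty rows, error out past a limit" behaviour described above. It
is not Greenplum's or PostgreSQL's implementation: the names
copy_with_reject_limit, parse_int_pair and RejectLimitExceeded are invented
for illustration, it ignores the per-segment aspect entirely, and whether the
operation fails on reaching the limit or on exceeding it is an assumption
here (this sketch fails on exceeding it).

```python
from dataclasses import dataclass, field


class RejectLimitExceeded(Exception):
    """Raised when more rows were rejected than the limit allows."""


@dataclass
class CopyResult:
    loaded: list = field(default_factory=list)
    # Each entry is (line_number, raw_line, error_message), so the
    # original data line stays available next to the error details.
    rejected: list = field(default_factory=list)


def copy_with_reject_limit(lines, parse_row, reject_limit):
    """Load rows, logging malformed ones until the limit is exceeded.

    Assumption: the operation fails once the number of rejected rows
    exceeds reject_limit; the real boundary condition may differ.
    """
    result = CopyResult()
    for line_number, raw_line in enumerate(lines, start=1):
        try:
            result.loaded.append(parse_row(raw_line))
        except ValueError as exc:
            result.rejected.append((line_number, raw_line, str(exc)))
            if len(result.rejected) > reject_limit:
                raise RejectLimitExceeded(
                    f"more than {reject_limit} rows rejected "
                    f"(last failure at line {line_number})"
                ) from exc
    return result


def parse_int_pair(raw_line):
    """Hypothetical row parser: expects exactly two comma-separated ints."""
    a, b = raw_line.split(",")
    return int(a), int(b)
```

For example, copy_with_reject_limit(["1,2", "x,3", "4,5"], parse_int_pair, 1)
loads the first and third rows and records line 2 together with its raw text
and error message, while the same input with a limit of 0 raises
RejectLimitExceeded.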
--
Daniel Gustafsson