Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) - Mailing list pgsql-hackers

From zhihuifan1213@163.com
Subject Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Date
Msg-id 878r77mtlt.fsf@163.com
In response to Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Daniel Gustafsson <daniel@yesql.se> writes:
>>> On 8 Nov 2023, at 19:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I think an actually usable feature of this sort would involve
>>> copying all the failed lines to some alternate output medium,
>>> perhaps a second table with a TEXT column to receive the original
>>> data line.  (Or maybe an array of text that could receive the
>>> broken-down field values?)  Maybe we could dump the message info,
>>> line number, field name etc into additional columns.
>
>> I agree that the errors should be easily visible to the user in some way.  The
>> feature is for sure interesting, especially in data warehouse type jobs where
>> dirty data is often ingested.
>
> I agree it's interesting, but we need to get it right the first time.
>
> Here is a very straw-man-level sketch of what I think might work.
> The option to COPY FROM looks something like
>
>     ERRORS TO other_table_name (item [, item [, ...]])
>
> where the "items" are keywords identifying the information item
> we will insert into each successive column of the target table.
> This design allows the user to decide which items are of use
> to them.  I envision items like

I'm pretty happy with the overall design, especially the 'ERRORS TO
other_table_name' part. I'm a bit confused, though, about why we need
the (item [, item [, ...]]) list: it requires more code, and it also
requires the user to make more decisions. Would there be anything
wrong with including all of the items by default?

> LINENO    bigint    COPY line number, counting from 1
> LINE      text      raw text of line (after encoding conversion)
> FIELDS    text[]    separated, de-escaped string fields (the data
>                     that was or would be fed to input functions)
> FIELD     text      name of troublesome field, if field-specific
> MESSAGE   text      error message text
> DETAIL    text      error message detail, if any
> SQLSTATE  text      error SQLSTATE code
>
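
To make the question concrete, here is a toy Python model of the
behaviour being discussed. It is only a sketch and not PostgreSQL code:
the function name copy_from, the comma splitting, and the Python
converters standing in for input functions are all assumptions made for
illustration, and SQLSTATE 22P02 (invalid_text_representation) is simply
used for every failure. With items left at its default, every item is
recorded, which is the "all of them by default" behaviour asked about
above.

```python
# Item keywords taken from the quoted sketch. Using all of them is the
# default in this model.
ALL_ITEMS = ("LINENO", "LINE", "FIELDS", "FIELD",
             "MESSAGE", "DETAIL", "SQLSTATE")


def copy_from(lines, columns, items=ALL_ITEMS):
    """Toy model of COPY FROM ... ERRORS TO (hypothetical, illustration only).

    lines:   iterable of raw input lines
    columns: list of (name, converter) pairs; the converters stand in
             for datatype input functions and raise ValueError on bad input
    items:   which information items to store for each failed line

    Returns (good_rows, error_rows). Each error row holds one value per
    requested item, in the order given. Field-count mismatches are not
    handled here; extra or missing fields are silently ignored by zip().
    """
    good_rows, error_rows = [], []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split(",")  # assumption: plain comma-separated input
        info = {"LINENO": lineno, "LINE": line, "FIELDS": fields,
                "FIELD": None, "MESSAGE": None, "DETAIL": None,
                "SQLSTATE": None}
        try:
            row = []
            for (name, convert), value in zip(columns, fields):
                info["FIELD"] = name  # remember which field is being converted
                row.append(convert(value))
        except ValueError as exc:
            info["MESSAGE"] = str(exc)
            info["SQLSTATE"] = "22P02"  # illustrative code for every failure
            error_rows.append(tuple(info[item] for item in items))
        else:
            good_rows.append(tuple(row))
    return good_rows, error_rows
```

Calling copy_from(["1,a", "x,b"], [("id", int), ("name", str)]) loads the
first line and records the second one as an error row with all seven
items. Passing items=("LINENO", "MESSAGE") models the explicit item list
from the original proposal.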


-- 
Best Regards
Andy Fan



