Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) - Mailing list pgsql-hackers

From torikoshia
Subject Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Msg-id 84996d38ce68a8b9c0aa751f6fdcfa9a@oss.nttdata.com
In response to Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)  (jian he <jian.universality@gmail.com>)
Responses Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
List pgsql-hackers
On 2024-01-18 23:59, jian he wrote:
> Hi.
> patch refactored based on "on_error {stop|ignore}"
> doc changes:
> 
> --- a/doc/src/sgml/ref/copy.sgml
> +++ b/doc/src/sgml/ref/copy.sgml
> @@ -43,7 +43,7 @@ COPY { <replaceable
> class="parameter">table_name</replaceable> [ ( <replaceable
>      FORCE_QUOTE { ( <replaceable
> class="parameter">column_name</replaceable> [, ...] ) | * }
>      FORCE_NOT_NULL { ( <replaceable
> class="parameter">column_name</replaceable> [, ...] ) | * }
>      FORCE_NULL { ( <replaceable
> class="parameter">column_name</replaceable> [, ...] ) | * }
> -    SAVE_ERROR_TO '<replaceable 
> class="parameter">location</replaceable>'
> +    ON_ERROR '<replaceable 
> class="parameter">error_action</replaceable>'
>      ENCODING '<replaceable 
> class="parameter">encoding_name</replaceable>'
>  </synopsis>
>   </refsynopsisdiv>
> @@ -375,20 +375,20 @@ COPY { <replaceable
> class="parameter">table_name</replaceable> [ ( <replaceable
>     </varlistentry>
> 
>     <varlistentry>
> -    <term><literal>SAVE_ERROR_TO</literal></term>
> +    <term><literal>ON_ERROR</literal></term>
>      <listitem>
>       <para>
> -      Specifies to save error information to <replaceable 
> class="parameter">
> -      location</replaceable> when there is malformed data in the 
> input.
> -      Currently, only <literal>error</literal> (default) and
> <literal>none</literal>
> +      Specifies which <replaceable class="parameter">
> +      error_action</replaceable> to perform when there is malformed
> data in the input.
> +      Currently, only <literal>stop</literal> (default) and
> <literal>ignore</literal>
>        values are supported.
> -      If the <literal>error</literal> value is specified,
> +      If the <literal>stop</literal> value is specified,
>        <command>COPY</command> stops operation at the first error.
> -      If the <literal>none</literal> value is specified,
> +      If the <literal>ignore</literal> value is specified,
>        <command>COPY</command> skips malformed data and continues 
> copying data.
>        The option is allowed only in <command>COPY FROM</command>.
> -      The <literal>none</literal> value is allowed only when
> -      not using <literal>binary</literal> format.
> +      Only <literal>stop</literal> value is allowed only when
> +      using <literal>binary</literal> format.
>       </para>

Thanks for making the patch!

Here are some comments:

> -      The <literal>none</literal> value is allowed only when
> -      not using <literal>binary</literal> format.
> +      Only <literal>stop</literal> value is allowed only when
> +      using <literal>binary</literal> format.

The second 'only' may be unnecessary.

> -                       /* If SAVE_ERROR_TO is specified, skip rows 
> with soft errors */
> +                       /* If ON_ERROR is specified with IGNORE, skip 
> rows with soft errors */

This is correct for now, but considering future work that may add other
options such as "file 'copy.log'" and "table 'copy_log'", it may be better
not to limit the comment to 'IGNORE'.
How about something like this?

   If ON_ERROR is specified and the value is not STOP, skip rows with soft errors
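
To illustrate why this wording would hold up better, here is a minimal, self-contained sketch of the check the comment describes. It is not the patch's actual code: the enum, its member names, and the helper function are all hypothetical stand-ins chosen for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the ON_ERROR option values (names assumed,
 * not taken from the patch).  Future values such as "file" or "table"
 * could be appended without touching the check below.
 */
typedef enum OnErrorChoice
{
    ON_ERROR_CHOICE_STOP = 0,   /* default: stop at the first error */
    ON_ERROR_CHOICE_IGNORE      /* skip malformed rows and continue */
} OnErrorChoice;

/*
 * If ON_ERROR is specified and the value is not STOP, skip rows with
 * soft errors.  Testing "!= STOP" rather than "== IGNORE" keeps both
 * the comment and the condition valid if more actions are added later.
 */
static bool
should_skip_row(OnErrorChoice on_error, bool soft_error_occurred)
{
    return on_error != ON_ERROR_CHOICE_STOP && soft_error_occurred;
}
```

With the condition written this way, a row is skipped only when a soft error occurred and the action is anything other than STOP, so the comment would not need to change when new actions arrive.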

> -COPY x from stdin (format BINARY, save_error_to none);
> -COPY x to stdin (save_error_to none);
> +COPY x from stdin (format BINARY, ON_ERROR ignore);
> +COPY x from stdin (ON_ERROR unsupported);
>  COPY x to stdin (format TEXT, force_quote(a));
>  COPY x from stdin (format CSV, force_quote(a));

In the existing tests in copy2.sql, COPY option names are written in
lower case (e.g. 'format') and option values in upper case (e.g. 'BINARY').
It would be more consistent to follow the same convention here.


-- 
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation


