Re: Add reject_limit option to file_fdw - Mailing list pgsql-hackers

From torikoshia
Subject Re: Add reject_limit option to file_fdw
Msg-id 153fd507364f195f036695c3bb48cf44@oss.nttdata.com
In response to Re: Add reject_limit option to file_fdw  (Kirill Reshke <reshkekirill@gmail.com>)
List pgsql-hackers
On 2024-11-12 15:23, Kirill Reshke wrote:

Thanks for your review!

> On Tue, 12 Nov 2024 at 06:17, torikoshia <torikoshia@oss.nttdata.com> wrote:
>> 
>> On 2024-11-12 01:49, Fujii Masao wrote:
>> > On 2024/11/11 21:45, torikoshia wrote:
>> >>> Thanks for adding the comment. It clearly states that REJECT_LIMIT can be
>> >>> a single-quoted string. However, it might also be helpful to mention that
>> >>> it can be provided as an int64 in the COPY command option. How about
>> >>> updating it like this?
>> >>>
>> >>> ------------------------------------
>> >>> REJECT_LIMIT can be specified in two ways: as an int64 for the COPY command
>> >>> option or as a single-quoted string for the foreign table option using
>> >>> file_fdw. Therefore this function needs to handle both formats.
>> >>> ------------------------------------
>> >>
>> >> Thanks! it seems better.
>> >>
>> >>
>> >> Attached v3 patch.
>> >
>> > Thanks for updating the patch! It looks like you forgot to attach it,
>> > though.
>> 
>> Oops, thanks for pointing it out.
>> Here it is.
>> 
>> 
>> --
>> Regards,
>> 
>> --
>> Atsushi Torikoshi
>> Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.
> 
> Hi!
> 
> A little question from me.
> 
> This is your doc for reject_limit:
> 
> +  <varlistentry>
> +   <term><literal>reject_limit</literal></term>
> +
> +   <listitem>
> +    <para>
> +     Specifies the maximum number of errors tolerated while converting a column's
> +     input value to its data type, the same as <command>COPY</command>'s
> +    <literal>REJECT_LIMIT</literal> option.
> +    </para>
> +   </listitem>
> +  </varlistentry>
> +
> 
> This is how it looks on the current HEAD for copy.
> 
> <varlistentry>
>     <term><literal>REJECT_LIMIT</literal></term>
>     <listitem>
>      <para>
>       Specifies the maximum number of errors tolerated while converting a
>       column's input value to its data type, when <literal>ON_ERROR</literal> is
>       set to <literal>ignore</literal>.
>       If the input causes more errors than the specified value, the <command>COPY</command>
>       command fails, even with <literal>ON_ERROR</literal> set to <literal>ignore</literal>.
>       This clause must be used with <literal>ON_ERROR</literal>=<literal>ignore</literal>
>       and <replaceable class="parameter">maxerror</replaceable> must be positive <type>bigint</type>.
>       If not specified, <literal>ON_ERROR</literal>=<literal>ignore</literal>
>       allows an unlimited number of errors, meaning <command>COPY</command> will
>       skip all erroneous data.
>      </para>
>     </listitem>
>    </varlistentry>
> 
> There is a difference. Should we add REJECT_LIMIT vs ON_ERROR
> clarification for file_fdw too? or maybe we put a reference for COPY
> doc.

As you may know, some options for file_fdw are the same as those for 
COPY. While the manual provides detailed descriptions of these options 
in the COPY section, the explanations in file_fdw are shorter and less 
detailed.

I intended to follow the same convention for reject_limit, but do you 
think it should be changed?
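
For readers who want the semantics under discussion in concrete form, 
here is a minimal Python sketch of the behavior described in the COPY 
documentation quoted above: malformed values are skipped (as with 
ON_ERROR = ignore), and the load fails once the number of errors exceeds 
the limit. This is only an illustration, not PostgreSQL code. The names 
load_rows and RejectLimitExceeded are hypothetical, and integer 
conversion stands in for "converting a column's input value to its data 
type".

```python
from typing import Iterable, List, Optional


class RejectLimitExceeded(Exception):
    """Raised when more values fail conversion than reject_limit allows."""


def load_rows(values: Iterable[str], reject_limit: Optional[int] = None) -> List[int]:
    """Convert each value to int, skipping malformed ones (models ON_ERROR = ignore).

    reject_limit=None models the unspecified case: any number of errors is skipped.
    """
    # The quoted doc says the limit must be a positive bigint.
    if reject_limit is not None and reject_limit <= 0:
        raise ValueError("reject_limit must be a positive integer")

    converted: List[int] = []
    errors = 0
    for value in values:
        try:
            converted.append(int(value))
        except ValueError:
            errors += 1
            # Fail only when the error count goes past the limit,
            # so exactly reject_limit errors are still tolerated.
            if reject_limit is not None and errors > reject_limit:
                raise RejectLimitExceeded(
                    f"more than {reject_limit} values could not be converted"
                )
    return converted
```

With reject_limit=1, a single malformed value is skipped and the load 
succeeds, while a second malformed value aborts it. Leaving reject_limit 
unset skips every malformed value.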

-- 
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.


