Re: Extend COPY FROM with HEADER to skip multiple lines - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Extend COPY FROM with HEADER to skip multiple lines
Date
Msg-id 1a6f670f-7f04-4f4b-a166-cc3c952d056f@oss.nttdata.com
Whole thread Raw
In response to Extend COPY FROM with HEADER to skip multiple lines  (Shinya Kato <shinya11.kato@gmail.com>)
List pgsql-hackers

On 2025/07/02 14:39, Shinya Kato wrote:
> On Mon, Jun 30, 2025 at 4:24 PM Yugo Nagata <nagata@sraoss.co.jp> wrote:
>>
>>
>>>> I have a few minor comment on the patch.
>>>>
>>>> +                           if (ival < 0)
>>>> +                                   ereport(ERROR,
>>>> +                                                   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>>>> +                                                    errmsg("%s requires a Boolean value, a non-negative "
>>>> +                                                                   "integer, or the string \"match\"",
>>>> +                                                                   def->defname)));
>>>>
>>>>      ereport(ERROR,
>>>>                      (errcode(ERRCODE_SYNTAX_ERROR),
>>>> -                    errmsg("%s requires a Boolean value or \"match\"",
>>>> +                    errmsg("%s requires a Boolean value, a non-negative integer, "
>>>> +                                   "or the string \"match\"",
>>>>                                      def->defname)));
>>>>
>>>> These two pieces of code raise the same error, but with different error codes:
>>>> one returns ERRCODE_INVALID_PARAMETER_VALUE, while the other returns ERRCODE_SYNTAX_ERROR.
>>>>
>>>> I believe the former is more appropriate, although the existing code uses the
>>>> latter. In any case, the error codes should be consistent.
>>>
>>> I'm not sure there's an actual rule like "the error code must match
>>> if the error message is the same." But if using the same message with
>>> a different error code is confusing, I'm fine with changing
>>> the earlier message. For example:
>>>
>>>                                                        (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>>> -                                                      errmsg("%s requires a Boolean value, a non-negative "
>>> -                                                                     "integer, or the string \"match\"",
>>> +                                                      errmsg("a negative integer value cannot be "
>>> +                                                                     "specified for %s",
>>>                                                                        def->defname)));
>>
>> Fair point. There might not be any explicit rule.
>> However, I feel it somewhat confusing.
>>
>> After looking into the code, it seems that ERRCODE_SYNTAX_ERROR tends to be used
>> for invalid keywords or parameter types, while ERRCODE_INVALID_PARAMETER_VALUE
>> is typically used for values that are out of the acceptable range.
>>
>> So, the proposed change seems reasonable to me.
> 
> Thank you for the review. The change looks good to me, too.  A new
> patch is attached.

Thanks for updating the patch!

> 
>> Regarding the documentation, how about explicitly stating that when MATCH is specified, only
>> the first line is skipped? While this may seem obvious, it’s worth clarifying, as the semantics
>> of the HEADER option have become a bit more complex with this change.
> 
> Agreed.  I have updated the documentation as follows:
> 
> +      lines are discarded.  If the option is set to <literal>MATCH</literal>,
> +      the number and names of the columns in the header line must exactly
> +      match those of the table and, in order, after which the header line is
> +      discarded;  otherwise an error is raised.  The <literal>MATCH</literal>

How about making the wording a bit clearer? For example:

     If set to MATCH, the first line is discarded, and it must contain column names that
     exactly match the table's columns, in both number and order; otherwise, an error is raised.


Also, the phrase "if this option is set to..." is repeated three times in the current text.
For the second and third instances, we could simplify it to just "if set to...".

Regards,

-- 
Fujii Masao
NTT DATA Japan Corporation




pgsql-hackers by date:

Previous
From: Andrei Lepikhov
Date:
Subject: Re: Reduce "Var IS [NOT] NULL" quals during constant folding
Next
From: Daniel Gustafsson
Date:
Subject: Re: Explicitly enable meson features in CI