Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers

From Sutou Kouhei
Subject Re: Make COPY format extendable: Extract COPY TO format implementations
Date
Msg-id 20240202.163319.975394400703526633.kou@clear-code.com
Whole thread Raw
In response to Re: Make COPY format extendable: Extract COPY TO format implementations  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Make COPY format extendable: Extract COPY TO format implementations
List pgsql-hackers
Hi,

In <ZbyJ60Fd7CHt4m0i@paquier.xyz>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 2 Feb 2024 15:21:31 +0900,
  Michael Paquier <michael@paquier.xyz> wrote:

> I have done a review of v10, see v11 attached which is still WIP, with
> the patches for COPY TO and COPY FROM merged together.  Note that I'm
> thinking to merge them into a single commit.

OK. I don't have a strong opinion for commit unit.

> @@ -74,11 +75,11 @@ typedef struct CopyFormatOptions
>      bool        convert_selectively;    /* do selective binary conversion? */
>      CopyOnErrorChoice on_error; /* what to do when error happened */
>      List       *convert_select; /* list of column names (can be NIL) */
> +    const        CopyToRoutine *to_routine;    /* callback routines for COPY TO */
>  } CopyFormatOptions;
> 
> Adding the routines to the structure for the format options is in my
> opinion incorrect.  The elements of this structure are first processed
> in the option deparsing path, and then we need to use the options to
> guess which routines we need.

This was discussed with Sawada-san a bit before. [1][2]

[1]
https://www.postgresql.org/message-id/flat/CAD21AoBmNiWwrspuedgAPgbAqsn7e7NoZYF6gNnYBf%2BgXEk9Mg%40mail.gmail.com#bfd19262d261c67058fdb8d64e6a723c
[2]
https://www.postgresql.org/message-id/flat/20240130.144531.1257430878438173740.kou%40clear-code.com#fc55392d77f400fc74e42686fe7e348a

I kept the routines in CopyFormatOptions for custom option
processing. But I should have not cared about it in this
patch set because this patch set doesn't include custom
option processing.

So I'm OK that we move the routines to
Copy{From,To}StateData.

>         This also led to a strange separation with
> ProcessCopyOptionFormatFrom() and ProcessCopyOptionFormatTo() to fit
> the hole in-between.

They also for custom option processing. We don't need to
care about them in this patch set too.

> copyapi.h needs more documentation, like what is expected for
> extension developers when using these, what are the arguments, etc.  I
> have added what I had in mind for now.

Thanks! I'm not good at writing documentation in English...

> +typedef char *(*PostpareColumnValue) (CopyFromState cstate, char *string, int m);
> 
> CopyReadAttributes and PostpareColumnValue are also callbacks specific
> to text and csv, except that they are used within the per-row
> callbacks.  The same can be said about CopyAttributeOutHeaderFunction.
> It seems to me that it would be less confusing to store pointers to
> them in the routine structures, where the final picture involves not
> having multiple layers of APIs like CopyToCSVStart,
> CopyAttributeOutTextValue, etc.  These *have* to be documented
> properly in copyapi.h, and this is much easier now that cstate stores
> the routine pointers.  That would also make simpler function stacks.
> Note that I have not changed that in the v11 attached.
> 
> This business with the extra callbacks required for csv and text is my
> main point of contention, but I'd be OK once the model of the APIs is
> more linear, with everything in Copy{From,To}State.  The changes would
> be rather simple, and I'd be OK to put my hands on it.  Just,
> Sutou-san, would you agree with my last point about these extra
> callbacks?

I'm OK with the approach. But how about adding the extra
callbacks to Copy{From,To}StateData not
Copy{From,To}Routines like CopyToStateData::data_dest_cb and
CopyFromStateData::data_source_cb? They are only needed for
"text" and "csv". So we don't need to add them to
Copy{From,To}Routines to keep required callback minimum.

What is the better next action for us? Do you want to
complete the WIP v11 patch set by yourself (and commit it)?
Or should I take over it?


Thanks,
-- 
kou



pgsql-hackers by date:

Previous
From: Junwang Zhao
Date:
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Next
From: Richard Guo
Date:
Subject: Re: An improvement on parallel DISTINCT