Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers

From Junwang Zhao
Subject Re: Make COPY format extendable: Extract COPY TO format implementations
Date
Msg-id CAEG8a3JuShA6g19Nt_Ejk15BrNA6PmeCbK7p81izZi71muGq3g@mail.gmail.com
In response to Re: Make COPY format extendable: Extract COPY TO format implementations  (Sutou Kouhei <kou@clear-code.com>)
Responses Re: Make COPY format extendable: Extract COPY TO format implementations
List pgsql-hackers
On Fri, Dec 15, 2023 at 8:53 AM Sutou Kouhei <kou@clear-code.com> wrote:
>
> Hi,
>
> In <CAD21AoCZv3cVU+NxR2s9J_dWvjrS350GFFr2vMgCH8wWxQ5hTQ@mail.gmail.com>
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 15 Dec 2023 05:19:43 +0900,
>   Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > To avoid collisions, extensions can be created in a
> > different schema than public.
>
> Thanks. I didn't notice it.
>
> > And note that built-in format copy handler doesn't need to
> > declare its handler function.
>
> Right. I know it.
>
> > Adding a prefix or suffix would be one option but to give extensions
> > more flexibility, another option would be to support format = 'custom'
> > and add the "handler" option to specify a copy handler function name
> > to call. For example, COPY ... FROM ... WITH (FORMAT = 'custom',
> > HANDLER = 'arrow_copy_handler').
>
I like the prefix/suffix idea; it is easy to implement. *custom* is not a
FORMAT, and users would have to know the names of the specific handlers,
which is not intuitive.
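
To make the suffix idea concrete, here is a toy sketch in Python. It is not
PostgreSQL code: the suffix "_copy_handler", the function name
resolve_copy_handler, and the set standing in for the installed functions are
all assumptions for illustration only. It just shows that the user writes the
format name and the handler name is derived from it.

```python
# Toy model only, not PostgreSQL internals.
# Built-in formats need no declared handler function.
BUILTIN_FORMATS = {"text", "csv", "binary"}

# Hypothetical naming convention: handler name = format name + suffix.
HANDLER_SUFFIX = "_copy_handler"


def resolve_copy_handler(format_name, installed_functions):
    """Map a COPY format name to a handler function name.

    Returns None for built-in formats, the derived handler name when a
    function with that name is installed, and raises LookupError otherwise.
    """
    name = format_name.lower()
    if name in BUILTIN_FORMATS:
        return None
    candidate = name + HANDLER_SUFFIX
    if candidate not in installed_functions:
        raise LookupError(f'COPY format "{format_name}" not recognized')
    return candidate
```

With this convention, FORMAT = 'arrow' would resolve to arrow_copy_handler
without the user ever typing the handler name.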

> Interesting. If we use this option, users can choose a COPY
> FORMAT implementation they like from multiple
> implementations. For example, a developer may implement a
> COPY FROM FORMAT = 'json' handler with PostgreSQL's JSON
> related API and another developer may implement a handler
> with simdjson[1] which is a fast JSON parser. Users can
> choose whichever they like.
I'm not sure about this. Why not move the JSON copy handler to contrib
as an example for others, so that extensions share the same format
function name and users install just one? Nobody would implement
another CSV or TEXT copy handler, IMHO.
>
> But specifying HANDLER = '...' explicitly is a bit
> inconvenient. Because only one handler will be installed in
> most use cases. In the case, users don't need to choose one
> handler.
>
> If we choose this option, it may be better that we also
> provide a mechanism that can work without HANDLER. Searching
> a function by name like tablesample method does is an option.
>
>
> [1]: https://github.com/simdjson/simdjson
>
>
> Thanks,
> --
> kou



--
Regards
Junwang Zhao


