Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers

From Sutou Kouhei
Subject Re: Make COPY format extendable: Extract COPY TO format implementations
Msg-id 20231215.095305.1361997086905276509.kou@clear-code.com
In response to Re: Make COPY format extendable: Extract COPY TO format implementations  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi,

In <CAD21AoCZv3cVU+NxR2s9J_dWvjrS350GFFr2vMgCH8wWxQ5hTQ@mail.gmail.com>
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 15 Dec 2023 05:19:43 +0900,
  Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> To avoid collisions, extensions can be created in a
> different schema than public.

Thanks. I didn't notice it.

> And note that built-in format copy handler doesn't need to
> declare its handler function.

Right. I know it.

> Adding a prefix or suffix would be one option but to give extensions
> more flexibility, another option would be to support format = 'custom'
> and add the "handler" option to specify a copy handler function name
> to call. For example, COPY ... FROM ... WITH (FORMAT = 'custom',
> HANDLER = 'arrow_copy_handler').

Interesting. With this option, users can choose the COPY
FORMAT implementation they like from multiple
implementations. For example, one developer may implement a
COPY FROM FORMAT = 'json' handler with PostgreSQL's
JSON-related API, and another developer may implement a
handler with simdjson[1], which is a fast JSON parser. Users
can choose whichever they like.

But specifying HANDLER = '...' explicitly is a bit
inconvenient, because only one handler will be installed in
most use cases. In that case, users don't need to choose a
handler.

If we choose this option, it may be better to also
provide a mechanism that works without HANDLER. Searching for
a function by name, as the tablesample method lookup does, is
one option.
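
A minimal sketch of that lookup order, written as a standalone C toy
model and not as PostgreSQL catalog code. It assumes that an explicit
HANDLER wins, and that without HANDLER the format name itself is used
as the function name to search for. The table of installed functions
and the names "json" and "json_simdjson" are hypothetical; only
"arrow_copy_handler" comes from the example quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical list of installed COPY handler function names.
 * A real implementation would search the function catalog instead.
 */
static const char *const installed_handlers[] = {
    "arrow_copy_handler",   /* name taken from the quoted example */
    "json",                 /* hypothetical: found by format name */
    "json_simdjson",        /* hypothetical: alternative JSON handler */
};

#define N_HANDLERS (sizeof(installed_handlers) / sizeof(installed_handlers[0]))

/* Return the installed function with exactly this name, or NULL. */
static const char *
lookup_handler_function(const char *name)
{
    size_t i;

    for (i = 0; i < N_HANDLERS; i++)
    {
        if (strcmp(installed_handlers[i], name) == 0)
            return installed_handlers[i];
    }
    return NULL;
}

/*
 * Resolve the handler for COPY ... WITH (FORMAT = format [, HANDLER = handler]).
 *
 * If handler is not NULL, only that function name is searched.
 * Otherwise the format name is used as the function name (assumed rule).
 * Returns NULL when no matching function is installed.
 */
static const char *
resolve_copy_handler(const char *format, const char *handler)
{
    assert(format != NULL);

    if (handler != NULL)
        return lookup_handler_function(handler);

    return lookup_handler_function(format);
}
```

In this model, resolve_copy_handler("json", NULL) finds the function
named "json", resolve_copy_handler("json", "json_simdjson") selects
the alternative implementation, and a format with no function of the
same name resolves to NULL, which a caller would report as an error.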


[1]: https://github.com/simdjson/simdjson


Thanks,
-- 
kou


