Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers

From Junwang Zhao
Subject Re: Make COPY format extendable: Extract COPY TO format implementations
Date
Msg-id CAEG8a3+D3B7zKOyq4LRgV8_Fx6n7-rZ4hN+wzD7UgU22+8BQzA@mail.gmail.com
Whole thread Raw
In response to Re: Make COPY format extendable: Extract COPY TO format implementations  ("Daniel Verite" <daniel@manitou-mail.org>)
List pgsql-hackers
On Wed, Dec 6, 2023 at 8:32 PM Daniel Verite <daniel@manitou-mail.org> wrote:
>
>         Sutou Kouhei wrote:
>
> > * 2022-04: Apache Arrow [2]
> > * 2018-02: Apache Avro, Apache Parquet and Apache ORC [3]
> >
> > (FYI: I want to add support for Apache Arrow.)
> >
> > There were discussions how to add support for more formats. [3][4]
> > In these discussions, we got a consensus about making COPY
> > format extendable.
>
>
> These formats seem all column-oriented whereas COPY is row-oriented
> at the protocol level [1].
> With regard to the procotol, how would it work to support these formats?
>

They have kind of *RowGroup* concepts, a bunch of rows goes to a RowBatch
and the data of the same column goes together.

I think they should fit the COPY semantics and there are some  FDW out there for
these modern formats, like [1]. If we support COPY to deal with the
format, it will
be easier to interact with them(without creating
server/usermapping/foreign table).

[1]: https://github.com/adjust/parquet_fdw

>
> [1] https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-COPY
>
>
> Best regards,
> --
> Daniel Vérité
> https://postgresql.verite.pro/
> Twitter: @DanielVerite
>
>


--
Regards
Junwang Zhao



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: logical decoding and replication of sequences, take 2
Next
From: Tom Lane
Date:
Subject: Re: Is WAL_DEBUG related code still relevant today?