Re: Make COPY extendable in order to support Parquet and other formats - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Make COPY extendable in order to support Parquet and other formats
Date
Msg-id 20220622234908.jkmc6qg352dsh5x5@alap3.anarazel.de
In response to Re: Make COPY extendable in order to support Parquet and other formats  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: Make COPY extendable in order to support Parquet and other formats
List pgsql-hackers
Hi,

On 2022-06-22 16:59:16 +0530, Ashutosh Bapat wrote:
> On Tue, Jun 21, 2022 at 3:26 PM Aleksander Alekseev
> <aleksander@timescale.com> wrote:
> 
> >
> > In other words, personally I'm unaware of use cases when somebody
> > needs a complete read/write FDW or TableAM implementation for formats
> > like Parquet, ORC, etc. Also to my knowledge they are not particularly
> > optimized for this.
> >
> 
> IIUC, you want extensibility in FORMAT argument to COPY command
> https://www.postgresql.org/docs/current/sql-copy.html. Where the
> format is pluggable. That seems useful.

Agreed.

But I think it needs quite a bit of care. Just plugging in a bunch of per-row
(or worse, per-field) switches to COPY's input / output parsing will make the
code even harder to read and even slower.

I suspect that we'd first need a patch to refactor the existing copy code a
good bit to clean things up. After that it hopefully will be possible to plug
in a new format without being too intrusive.
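
To make the concern concrete, here is a minimal sketch of one shape such a
refactoring could take: the format is resolved once per COPY into a table of
callbacks, so the row loop contains no format switch at all. Everything in it
(CopyFormatRoutine, lookup_copy_format, copy_out, the two writers) is a
hypothetical name made up for illustration and is not an existing PostgreSQL
API. The writers also skip escaping and quoting entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-format callback table; NOT an existing PostgreSQL API. */
typedef struct CopyFormatRoutine
{
    const char *name;
    /* Write one row into buf; return bytes written, or -1 if it does not fit. */
    int         (*write_row) (char *buf, size_t buflen,
                              const char *const *fields, int nfields);
} CopyFormatRoutine;

/* Shared helper: join fields with delim and end the row with a newline.
 * Simplification: no escaping or quoting is performed. */
static int
write_delimited(char *buf, size_t buflen,
                const char *const *fields, int nfields, char delim)
{
    size_t      used = 0;

    for (int i = 0; i < nfields; i++)
    {
        int         n = snprintf(buf + used, buflen - used, "%s%c",
                                 fields[i], i + 1 < nfields ? delim : '\n');

        if (n < 0 || (size_t) n >= buflen - used)
            return -1;          /* row does not fit in the buffer */
        used += (size_t) n;
    }
    return (int) used;
}

static int
text_write_row(char *buf, size_t buflen, const char *const *fields, int nfields)
{
    return write_delimited(buf, buflen, fields, nfields, '\t');
}

static int
csv_write_row(char *buf, size_t buflen, const char *const *fields, int nfields)
{
    return write_delimited(buf, buflen, fields, nfields, ',');
}

/* A new format would add an entry here instead of touching the row loop. */
static const CopyFormatRoutine copy_formats[] = {
    {"text", text_write_row},
    {"csv", csv_write_row},
};

/* Resolve the format name once, before any row is processed. */
const CopyFormatRoutine *
lookup_copy_format(const char *name)
{
    for (size_t i = 0; i < sizeof(copy_formats) / sizeof(copy_formats[0]); i++)
        if (strcmp(copy_formats[i].name, name) == 0)
            return &copy_formats[i];
    return NULL;
}

/* Row loop: one indirect call per row, no per-row or per-field switch.
 * values is a flat array of nrows * nfields strings. */
int
copy_out(const char *format_name, const char *const *values,
         int nrows, int nfields, char *out, size_t outlen)
{
    const CopyFormatRoutine *fmt = lookup_copy_format(format_name);
    size_t      used = 0;

    assert(values != NULL && out != NULL);
    if (fmt == NULL)
        return -1;              /* unknown format */

    for (int r = 0; r < nrows; r++)
    {
        int         n = fmt->write_row(out + used, outlen - used,
                                       values + (size_t) r * nfields, nfields);

        if (n < 0)
            return -1;
        used += (size_t) n;
    }
    return (int) used;
}
```

A caller would pass the FORMAT name, a flat array of field strings and an
output buffer to copy_out(). Whether one indirect call per row is actually
cheap enough for the existing text and CSV paths is exactly the kind of thing
that would need measuring.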

I know little about Parquet - can it support FROM STDIN efficiently?

Greetings,

Andres Freund


