Re: Make COPY extendable in order to support Parquet and other formats - Mailing list pgsql-hackers

From Aleksander Alekseev
Subject Re: Make COPY extendable in order to support Parquet and other formats
Msg-id CAJ7c6TNFD84KK62xrGP-PDwPM7OESM8=TTv8TjsZpbOuNMnwGA@mail.gmail.com
In response to Re: Make COPY extendable in order to support Parquet and other formats  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List pgsql-hackers
Hi Ashutosh,

> IIUC, you want extensibility in FORMAT argument to COPY command
> https://www.postgresql.org/docs/current/sql-copy.html. Where the
> format is pluggable. That seems useful.
> Another option is to dump the data in csv format but use external
> utility to convert csv to parquet or whatever other format is. I
> understand that that's not going to be as efficient as dumping
> directly in the desired format.

Exactly. However, to clarify, I suspect this may be a bit more
involved than simply extending the FORMAT arguments.

By itself, this change will not be extremely useful. Currently nothing
prevents an extension author from iterating over a table using the
heap_open(), heap_getnext(), etc. API and dumping its content in any
format. The user will have to write "dump_table(foo, filename)"
instead of "COPY ...", but that's not a big deal.

The problem is that every new extension has to reinvent things like
figuring out the schema, validating the data, etc. If we could do this
in the core, so that an extension author has to implement only a
minimal, format-dependent list of callbacks, that would be really
great. To make the interface practical, though, one will have to
implement a practical extension as well, for instance, a Parquet
one.
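
To illustrate the kind of callback list meant here, below is a rough,
standalone sketch. It is not a proposed patch and not an existing
PostgreSQL API: DemoRow, DemoCopyFormat, demo_copy_to() and the
"csvish" format are hypothetical names made up for this example, and
rows are assumed to arrive as already-validated text values. The idea
is only that the core-side driver owns the iteration and the checks,
while the format supplies three small callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: one row whose values were already validated and
 * converted to text by the core side. */
typedef struct DemoRow
{
    int          ncols;
    const char **values;
} DemoRow;

/* Hypothetical callback table: the only part a format author writes. */
typedef struct DemoCopyFormat
{
    const char *name;
    void (*begin) (FILE *out, int ncols, const char **colnames);
    void (*write_row) (FILE *out, const DemoRow *row);
    void (*end) (FILE *out);
} DemoCopyFormat;

/* A toy comma-separated format (no quoting or escaping). */
static void
csvish_begin(FILE *out, int ncols, const char **colnames)
{
    for (int i = 0; i < ncols; i++)
        fprintf(out, "%s%s", i ? "," : "", colnames[i]);
    fputc('\n', out);
}

static void
csvish_write_row(FILE *out, const DemoRow *row)
{
    for (int i = 0; i < row->ncols; i++)
        fprintf(out, "%s%s", i ? "," : "", row->values[i]);
    fputc('\n', out);
}

static void
csvish_end(FILE *out)
{
    (void) out;                 /* this format has no trailer */
}

const DemoCopyFormat csvish_format = {
    "csvish", csvish_begin, csvish_write_row, csvish_end
};

/* Stand-in for the core side: it drives the iteration and does the
 * format-independent checks, then delegates to the callbacks. */
void
demo_copy_to(const DemoCopyFormat *fmt, FILE *out,
             int ncols, const char **colnames,
             const DemoRow *rows, int nrows)
{
    assert(fmt != NULL && out != NULL);

    fmt->begin(out, ncols, colnames);
    for (int i = 0; i < nrows; i++)
    {
        assert(rows[i].ncols == ncols); /* schema check lives here */
        fmt->write_row(out, &rows[i]);
    }
    fmt->end(out);
}
```

A Parquet extension would, under the same assumptions, provide its own
DemoCopyFormat with different begin/write_row/end functions, and
nothing in demo_copy_to() would have to change.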

That being said, if it turns out that for some reason this is not
realistic to deliver, simply extending this part of the syntax a bit
should be fine too.

-- 
Best regards,
Aleksander Alekseev


