Re: New Copy Formats - avro/orc/parquet - Mailing list pgsql-general

From Adrian Klaver
Subject Re: New Copy Formats - avro/orc/parquet
Msg-id 69a4a33f-c63d-a614-8687-c746ae015fb3@aklaver.com
In response to Re: New Copy Formats - avro/orc/parquet  (Nicolas Paris <niparisco@gmail.com>)
Responses Re: New Copy Formats - avro/orc/parquet
List pgsql-general
On 02/11/2018 12:57 PM, Nicolas Paris wrote:
> On 11 Feb 2018 at 21:53, Andres Freund wrote:
>> On 2018-02-11 21:41:26 +0100, Nicolas Paris wrote:
>>> I also have the storage and network transfer overhead in mind:
>>> all of these new formats are compressed, which is not true of the
>>> current postgres BINARY format or, obviously, the text-based formats.
>>> In my experience, the binary format is 10 to 30% larger than the text
>>> one. By contrast, an ORC file can be up to 10 times smaller than a
>>> text-based format.
>>
>> That seems largely irrelevant when arguing about using PROGRAM though,
>> right?
>>
> 
> Indeed, those storage and network transfer concerns only apply when
> comparing against the CSV/BINARY formats. They have no link with the
> PROGRAM aspect.
> 

Just wondering what your time frame is on this? I ask because this
would be considered a new feature, so it would need to be added to a
major release of Postgres. Work is currently under way on Postgres
version 11, to be released (just a guess) in late Fall 2018/early
Winter 2019. The CommitFest (https://commitfest.postgresql.org/) cycle
for this release is approximately 3/4 of the way through, so I am not
sure that new code could make it in at this point. That means it would
be bumped to version 12, for 2019/2020.


-- 
Adrian Klaver
adrian.klaver@aklaver.com

