Re: New Copy Formats - avro/orc/parquet - Mailing list pgsql-general

From Nicolas Paris
Subject Re: New Copy Formats - avro/orc/parquet
Date
Msg-id 20180211200012.2agrfocyaf42td5v@gmail.com
In response to Re: New Copy Formats - avro/orc/parquet  (Andres Freund <andres@anarazel.de>)
Responses Re: New Copy Formats - avro/orc/parquet  (Andres Freund <andres@anarazel.de>)
List pgsql-general
> > That is true, but the question is how significant the overhead is. If
> > it's 50% then reducing it would make perfect sense. If it's 1% then no
> > one is going to be bothered by it.
> 
> I think it's pretty clear that it's going to be way way much more than
> 1%. 

Good news, but I'm not sure I understand why.

> It's trivial to construct cases where input parsing / output
> formatting takes the majority of the time. 

Binary -> ORC
        ^
        |
   PROGRAM parsing/output formatting on the fly

> And a lot of that you're going to be able to avoid with binary formats.

Still, the diagram above shows both a parsing step and a formatting step, doesn't it?
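
To make the question concrete, here is a minimal sketch of the parsing an
external PROGRAM would still have to do on a binary COPY stream before it
could write ORC. It follows the documented binary COPY layout (11-byte
signature, flags, header extension, then length-prefixed fields per tuple).
The names read_binary_copy and build_sample are hypothetical, the sample
stream is hand-built for illustration, and the ORC writing side is left out
entirely.

```python
import io
import struct

# 11-byte signature that starts every PostgreSQL binary COPY stream.
PGCOPY_SIGNATURE = b"PGCOPY\n\xff\r\n\x00"


def read_binary_copy(stream):
    """Yield each tuple as a list of raw field bytes (None for NULL)."""
    if stream.read(11) != PGCOPY_SIGNATURE:
        raise ValueError("not a PostgreSQL binary COPY stream")
    # 32-bit flags field, then 32-bit length of the header extension area.
    _flags, ext_len = struct.unpack("!II", stream.read(8))
    stream.read(ext_len)  # skip the extension area
    while True:
        (nfields,) = struct.unpack("!h", stream.read(2))
        if nfields == -1:  # file trailer: a 16-bit word containing -1
            return
        row = []
        for _ in range(nfields):
            (length,) = struct.unpack("!i", stream.read(4))
            # A length of -1 marks NULL; no value bytes follow.
            row.append(None if length == -1 else stream.read(length))
        yield row


def build_sample():
    """Hand-built stream with one tuple: (int4 42, NULL). Illustration only."""
    out = PGCOPY_SIGNATURE + struct.pack("!II", 0, 0)
    out += struct.pack("!h", 2)                          # two fields
    out += struct.pack("!i", 4) + struct.pack("!i", 42)  # int4, network byte order
    out += struct.pack("!i", -1)                         # NULL field
    out += struct.pack("!h", -1)                         # trailer
    return out


if __name__ == "__main__":
    for row in read_binary_copy(io.BytesIO(build_sample())):
        print(row)
```

In a real pipeline the stream would come from something like
COPY mytable TO PROGRAM 'my_converter' WITH (FORMAT binary), where mytable
and my_converter are placeholder names. The sketch shows that the binary
path replaces text parsing with length-prefixed reads, but the converter
still has to decode each field and re-encode it for the target format.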
