On 01/03/2014 04:20 PM, Tom Lane wrote:
> I think Florian has a good point there, and the reason is this: what
> you are talking about will be of exactly zero use to applications that
> want to see the results of one query before launching the next. Which
> eliminates a whole lot of apps. I suspect that almost the *only*
> common use case in which a stream of queries can be launched without
> feedback is going to be bulk data loading. It's not clear at all
> that pipelining the PQexec code path is the way to better performance
> for that --- why not use COPY, instead?
The data I encounter has to be distributed across multiple tables.
Switching between the COPY commands would again need client-side
buffering and heuristics for sizing these buffers. Lengths of runs vary
a lot in my case.
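To make the buffering problem concrete (table names are hypothetical): rows arrive interleaved for several target tables, but each COPY statement feeds exactly one table, so the client has to accumulate per-table batches and guess when to flush each one:

```sql
-- Incoming stream, interleaved:  a-row, b-row, a-row, a-row, b-row, ...
-- Each COPY targets one table, so the client must buffer per table
-- and flush each buffer as a separate COPY run:
COPY table_a FROM STDIN;  -- send the buffered table_a rows, then \.
COPY table_b FROM STDIN;  -- send the buffered table_b rows, then \.
```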
I also want to use binary mode as far as possible to avoid the integer
conversion overhead, but some columns use custom enum types and are
better transferred in text mode.
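The difficulty is that COPY's FORMAT option applies to the whole statement, not to individual columns, so a table mixing integers (cheap in binary) with custom enums (easier in text) cannot be split per column. A sketch with made-up table names:

```sql
-- FORMAT is a property of the whole COPY statement; there is no
-- per-column format choice:
COPY samples FROM STDIN WITH (FORMAT binary);  -- all columns binary
COPY labels  FROM STDIN WITH (FORMAT text);    -- all columns text
```

By contrast, parameterized INSERTs through PQexecParams do take a per-parameter format-code array, which is part of why pipelining the ordinary query path is attractive here.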
Some INSERTs happen via stored procedures, to implement de-duplication.
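A minimal sketch of such a de-duplicating insert function, assuming a hypothetical events table with a key column (names are made up, and this variant is not race-free under concurrent writers):

```sql
-- Hypothetical de-duplicating insert wrapper:
CREATE FUNCTION insert_event(p_key text, p_payload text) RETURNS void AS $$
BEGIN
    -- Insert only if no row with this key exists yet.
    INSERT INTO events (key, payload)
    SELECT p_key, p_payload
    WHERE NOT EXISTS (SELECT 1 FROM events WHERE key = p_key);
END;
$$ LANGUAGE plpgsql;
```

Calling such a function once per row is exactly the kind of query stream that cannot go through COPY.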
These issues could be addressed by using temporary staging tables.
However, when I did that in the past, this caused pg_shdepend bloat.
Carefully reusing them when possible might avoid that. Again, due to
the variance in lengths of runs, the staging tables are not always
beneficial.
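Careful reuse might look like the following (schema hypothetical): create the temporary table once per session so its catalog entries, and the pg_shdepend rows behind them, are not churned for every batch.

```sql
-- Once per session; ON COMMIT DELETE ROWS empties the table at commit,
-- so it can be reused without re-creating catalog entries per batch:
CREATE TEMPORARY TABLE events_staging (LIKE events INCLUDING DEFAULTS)
    ON COMMIT DELETE ROWS;

-- Per batch, inside a single transaction:
COPY events_staging FROM STDIN;
INSERT INTO events
    SELECT s.* FROM events_staging s
    WHERE NOT EXISTS (SELECT 1 FROM events e WHERE e.key = s.key);
```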
I understand that pipelining introduces complexity. But solving the
issues described above is no picnic, either.
--
Florian Weimer / Red Hat Product Security Team