Home > mailing lists

Insert values() per-statement overhead - Mailing list pgsql-hackers

From	Vladimir Sitnikov
Subject	Insert values() per-statement overhead
Date	January 15, 2016 13:17:22
Msg-id	CAB=Je-EDZwVH7FgpDOa0ad8a7a6D=1QoYmvW2kQqSAML2gf+Uw@mail.gmail.com Whole thread Raw
Responses	Re: Insert values() per-statement overhead
List	pgsql-hackers

Tree view

Hi,

There is a finding that insert(x) values(y);insert(x) values(z);  is
2-4 times slower than insert(..) values(y),(z);
see [1], [2].

In other words, there is a significant per-statement overhead even
though server-prepared statements are properly used.
The issue is reproducible in 9.5rc1.

Is it something that was discussed previously? (I was not able to find
that in archives)
Is it something that can be improved? (e.g. "insert node state"
caching across execute calls, improve performance of "INSERT %u %u"
generation, etc)

Even though I understand there will always be _some_ per-statement
overhead, such a hight overhead plays against common case of using
ORMs.
End-users are just stuck with insert(...) values(...);

1) Java's standard way of batching statements is
"PreparedStatement#addBatch()". Unfortunately, backend protocol does
not support statement batching.
One does not simply teach Hibernate/EclipseLink,etc etc to use
PostgreSQL's-specific COPY.
Note: I am not talking about network roundtrips here. I'm just
highlighting that there is no way to execute "bind bind bind
executebatch" sequence at the protocol level.

2) One might consider "transparent rewrite of insert()  batches into a
single insert() values(),(),() statement" at JDBC driver level, but it
is hard to get right as there is no easy way to parse a query. It is
really expected that every PostgreSQL connector would implement SQL
parser & insert rewriter?

3) Transparent rewrites (including "rewrite inserts to COPY") would
fail to provide "number of modified rows" for each row. Error
semantics is different as well.

4) COPY does not support UPSERT, does it?

My profiler (Instruments in Mac OS) shows that significant time is
spent in standard_ExecutorStart: see [3]
In fact, the time spent in standard_ExecutorStart even exceeds the
time spent in standard_ExecutorRun.

[1]: http://www.postgresql.org/message-id/55130DC8.2070508@redhat.com
[2]: https://github.com/pgjdbc/pgjdbc/pull/491#issuecomment-171780054
[3]: https://github.com/pgjdbc/pgjdbc/pull/491#issuecomment-171908974

Vladimir Sitnikov

pgsql-hackers by date:

From: Simon Riggs
Date: 15 January 2016, 13:08:36
Subject: Re: Stream consistent snapshot via a logical decoding plugin as a series of INSERTs

From: Andres Freund
Date: 15 January 2016, 13:35:03
Subject: Re: Insert values() per-statement overhead

Insert values() per-statement overhead - Mailing list pgsql-hackers

Previous

Next