Re: Slowness of extended protocol - Mailing list pgsql-hackers

From Vladimir Sitnikov
Subject Re: Slowness of extended protocol
Date
Msg-id CAB=Je-FP1rM14fx20CZ62GP0jS9aGJiuS9GRVLrCvTPuE3UPjQ@mail.gmail.com
Whole thread Raw
In response to Re: Slowness of extended protocol  (Shay Rojansky <roji@roji.org>)
Responses Re: Slowness of extended protocol  (Shay Rojansky <roji@roji.org>)
List pgsql-hackers
Shay>But here's the more important general point. We're driver developers, not application developers. I don't really know what performance is "just fine" for each of my users, and what is not worth optimizing further. Users may follow best practices, or they may not for various reasons.

Of course we cannot benchmark all the existing applications, however we should at lest try to use "close to production" benchmarks.

Let me recap: even "select 1" shows clear advantage of reusing server-prepared statements.
My machine shows the following results for "select 1 pgbench":
simple: 38K ops/sec (~26us/op)
extended: 32K ops/sec (~31us/op)
prepared: 47K ops/sec (~21us/op)

Note: reusing server-prepared statements shaves 10us (out of 31us), while "brand new ParseBindExecDeallocate" message would not able to perform better than 26us/op (that is 5 us worse than the prepared one). So it makes much more sense investing time in "server-prepared statement reuse" at the client side and "improving Bind/Exec performance" at the backend side.

For more complex queries the gap (prepared vs simple) would be much bigger since parse/validate/plan for a complex query is much harder operation than the one for "select 1"

Note: I do not mean "look, prepared always win". I mean: "if your environment does not allow reuse of prepared statements for some reason, you lose huge amount of time on re-parsing the queries, and it is worth fixing that obvious issue first".

Shay>I don't see how reusing SQL text affects security in any way.

Reusing SQL text makes application more secure as "build SQL on the fly" is prone to SQL injection security issues.
So reusing SQL text makes application more secure and it enables server-prepared statements that improve performance considerably. It is a win-win.

Shay>a new feature in the Npgsql dev branch which allows prepared statements to be persisted across open/close on pooled connections

Do you have some benchmark results for "reusing server-prepared statements across open/close on pooled"? I would expect that feature to be a great win.

Once again, I'd like to focus on real-life cases, not artificial ones.

For example, the major part of my performance fixes to pgjdbc were made as a part of improving my java application that was suffering from performance issues when talking to PostgreSQL.
For instance, there were queries that took 20ms to plan and 0.2ms to execute (the query is like where id=? but the SQL text was more involved).
As transparent server-side statement was implemented at pgjdbc side, it shaved those 20ms by eliminating Parse messages on the hot path.

In other words, it was not just "lets optimize pgjdbc". It was driven by the need to optimize the client application, and the profiling results were pointing to pgjdbc issues.

Shay>Again, in a world where prepared statements aren't persisted across connections (e.g. pgbouncer)

pgbouncer does not properly support named statements, and that is pbgouncer's issue.

The response from pgbouncer team is "all the protocol bits are there, it is just implementation from pgbouncer that is missing".

By the way: I do not use pgbouncer, thus there's no much interest for me to invest time in fixing pgbouncer's issues.


Shay>Any scenario where you open a relatively short-lived connection and execute something once is problematic - imagine a simple web service which needs to insert a single record into a table.

I would assume the application does not use random string for a table name (and columns/aliases), thus it would result in typical SQL text reuse, thus it should trigger "server-side statement prepare" logic. In other way, that kind of application does not need the "new ParseBindExecDeallocate message we are talking about".

In other words, if an application is using "select name from objects where id=$1" kind of queries, the driver should be using extended protocol (Bind/Exec) behind the scenes if it does aim to get high performance.

Vladimir

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: No longer possible to query catalogs for index capabilities?
Next
From: Alvaro Herrera
Date:
Subject: Re: No longer possible to query catalogs for index capabilities?