Home > mailing lists

Re: [HACKERS] Cached plans and statement generalization - Mailing list pgsql-hackers

From	Konstantin Knizhnik
Subject	Re: [HACKERS] Cached plans and statement generalization
Date	April 25, 2017 18:11:09
Msg-id	f0bfcdd6-dba3-e9a8-2108-146e1f880839@postgrespro.ru Whole thread Raw
In response to	Re: [HACKERS] Cached plans and statement generalization (Andres Freund <andres@anarazel.de>)
Responses	Re: [HACKERS] Cached plans and statement generalization Re: [HACKERS] Cached plans and statement generalization Re: [HACKERS] Cached plans and statement generalization
List	pgsql-hackers

Tree view

On 24.04.2017 21:43, Andres Freund wrote:

Hi,

On 2017-04-24 11:46:02 +0300, Konstantin Knizhnik wrote:

So what I am thinking now is implicit query caching. If the same query with
different literal values is repeated many times, then we can try to
generalize this query and replace it with prepared query with
parameters.

That's not actuall all that easy:
- You pretty much do parse analysis to be able to do an accurate match. How much overhead is parse analysis vs. planning in your cases?
- The invalidation infrastructure for this, if not tied to to fully parse-analyzed statements, is going to be hell.
- Migrating to parameters can actually cause significant slowdowns, not nice if that happens implicitly.

Well, first of all I want to share results I already get: pgbench with default parameters, scale 10 and one connection:

protocol	TPS
simple	3492
extended	2927
prepared	6865
simple + autoprepare	6844

So autoprepare is as efficient as explicit prepare and can increase performance almost two times.

My current implementation is replacing with parameters only string literals in the query, i.e. select * from T where x='123'; -> select * from T where x=$1;
It greatly simplifies matching of parameters - it is just necessary to locate '\'' character and then correctly handle pairs of quotes.
Handling of integer and real literals is really challenged task.
One source of problems is negation: it is not so easy to correctly understand whether minus should be treated as part of literal or as operator:
(-1), (1-1), (1-1)-1
Another problem is caused by using integer literals in context where parameters can not be used, for example "order by 1".

Fully correct substitution can be done by first performing parsing the query, then transform parse tree, replacing literal nodes with parameter nodes and finally deparse tree into generalized query. postgres_fdw already contains such deparse code. It can be moved to postgres core and reused for autoprepare (and may be somewhere else).
But in this case overhead will be much higher.
I still think that query parsing time is significantly smaller than time needed for building and optimizing query execution plan.
But it should be measured if community will be interested in such approach.

There is obvious question: how I managed to get this pgbench results if currently only substitution of string literals is supported and queries constructed by pgbench don't contain string literals? I just made small patch in pgbench replaceVariable method wrapping value's representation in quotes. It has almost no impact on performance (3482 TPS vs. 3492 TPS),
but allows autoprepare to deal with pgbench queries.

I attached my patch to this mail. It is just first version of the patch (based on REL9_6_STABLE branch) just to illustrate proposed approach.
I will be glad to receive any comments and if such optimization is considered to be useful, I will continue work on this patch.

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

autoprepare.patch

pgsql-hackers by date:

From: Andrew Dunstan
Date: 25 April 2017, 18:09:39
Subject: Re: [HACKERS] TAP tests - installcheck vs check

From: Fujii Masao
Date: 25 April 2017, 18:13:08
Subject: Re: [HACKERS] Separation walsender & normal backends

Re: [HACKERS] Cached plans and statement generalization - Mailing list pgsql-hackers

Attachment

Previous

Next