Re: Cached Query Plans (was: global prepared statements) - Mailing list pgsql-hackers
From: Csaba Nagy
Subject: Re: Cached Query Plans (was: global prepared statements)
Date:
Msg-id: 1208187040.8259.316.camel@PCD12478
In response to: Re: Cached Query Plans (was: global prepared statements) (PFC <lists@peufeu.com>)
List: pgsql-hackers
On Mon, 2008-04-14 at 17:08 +0200, PFC wrote:
> Those "Decision" nodes could potentially lead to lots of decisions
> (ahem). What if you have 10 conditions in the Where, plus some joined
> ones ? That would make lots of possibilities...

Yes, that's true, but most of them are likely not relevant to the end
result. In any real-life query there are a few parameters which are
really important for which plan you should choose... The key here is
that you should spend more time finding the possibilities for a cached
plan than you would for a one-shot query. In principle, one-shot
planning should be the default, and caching should be something the
user has to choose deliberately.

I would really like a special command to plan and cache a query without
actually executing it, possibly with a parameter for how hard to try...
For example, you could expend the extra cycles to eliminate all
redundancies from boolean expressions and IN lists, and to get the
parse tree into a canonical format - all things which can make planning
easier. All of these lose on one-shot queries, but once you cache you
can really apply a lot of smarts which were a no-no before...

> Consider several types of queries :
>
> - The small, quick query which returns one or a few rows : in this
> case, planning overhead is large relative to execution time, but I
> would venture to guess that the plans always end up being the same.

Consider 'select a from t where b like $1': the parameter $1 will
considerably affect the query plan, and a query can't get much simpler
than that... (see the sketch at the end of this mail).

> - The query that takes a while : in this case, planning overhead is
> nil compared to execution time, better replan every time with the
> params.

I guess these queries are not the ones targeted by this feature. In
fact, for these queries it really doesn't matter whether you cache or
not, except that if you know you're going to cache, you'll expend more
effort on planning right, and that can still matter for a query which
runs long. Note that if you don't cache, planning harder will lose in
the long run; only once you cache can you afford to plan harder...

> - The complex query that still executes fast because it doesn't
> process a lot of rows and postgres finds a good plan (for instance, a
> well optimized search query). Those would benefit from reducing the
> planning overhead, but those also typically end up having many
> different plans depending on the search parameters. Besides, those
> queries are likely to be dynamically generated. So, would it be worth
> it to add all those features just to optimize those ? I don't know...

We have dynamically generated queries here which are deliberately
chunked so that they execute in small increments and none of them runs
too long (they would otherwise block vacuuming of vital tables). Those
chunks would greatly benefit from properly planned and cached plans...
(a sketch of such a chunk also follows below).

A really smart system would store canonical plan fragments as responses
to (also canonicalized) parse-tree fragments, and then assemble the
plan out of those fragments - but that would indeed be really complex
(to design, at least; the resulting code might be simpler than one
thinks) ;-)

Cheers,
Csaba.
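
A minimal sketch of the LIKE point above. The table t, its index, and
the indicated plans are illustrative assumptions, not taken from the
thread; PostgreSQL of this era planned a prepared statement once at
PREPARE time, with the parameter value still unknown:

    -- assumed schema: t(a int, b text), with a btree index on b built
    -- with text_pattern_ops so that prefix LIKE patterns can use it
    EXPLAIN SELECT a FROM t WHERE b LIKE 'foo%';
    -- typically an Index Scan: the fixed prefix pins an index range

    EXPLAIN SELECT a FROM t WHERE b LIKE '%foo';
    -- typically a Seq Scan: a leading wildcard defeats the index

    PREPARE q(text) AS SELECT a FROM t WHERE b LIKE $1;
    EXPLAIN EXECUTE q('foo%');
    -- the cached plan is built before $1 is known, so it must be valid
    -- for any pattern: a Seq Scan, even though 'foo%' alone would have
    -- gotten the Index Scan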
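
And a minimal sketch of the chunked queries mentioned above, under
assumed names (big_table, processed, and the chunk size are
hypothetical); each statement touches a bounded number of rows, and one
cached plan serves every chunk:

    -- plan once, execute repeatedly; the application re-issues the
    -- chunk until it reports DELETE 0, so no single query runs too long
    PREPARE purge_chunk(int) AS
      DELETE FROM big_table
      WHERE id IN (SELECT id FROM big_table
                   WHERE processed
                   ORDER BY id
                   LIMIT $1);

    EXECUTE purge_chunk(1000);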