Re: Cached Query Plans (was: global prepared statements) - Mailing list pgsql-hackers
From: Csaba Nagy
Subject: Re: Cached Query Plans (was: global prepared statements)
Date:
Msg-id: 1208187040.8259.316.camel@PCD12478
In response to: Re: Cached Query Plans (was: global prepared statements) (PFC <lists@peufeu.com>)
List: pgsql-hackers
On Mon, 2008-04-14 at 17:08 +0200, PFC wrote:
> Those "Decision" nodes could potentially lead to lots of decisions
> (ahem). What if you have 10 conditions in the Where, plus some joined
> ones ? That would make lots of possibilities...

Yes, that's true, but most of them are likely not relevant to the end
result. In any real-life query there are a few parameters which are
really important for which plan you should choose... The key here is
that you should spend more time finding the possibilities for a cached
plan than you would for a one-shot query. In principle, one-shot
planning should be the default, and caching should be something the
user has to choose deliberately.

I would really like a special command to plan and cache a query without
actually executing it, possibly with a parameter for how hard to try...
For example, you could expend the extra cycles to eliminate all
redundancies from boolean expressions and IN lists, and to get the
parse tree into a canonical format - all things which can make planning
easier. All of these lose on one-shot queries, but once you cache you
can really apply a lot of smarts which were a no-no before...

> Consider several types of queries :
>
> - The small, quick query which returns one or a few rows : in this
> case, planning overhead is large relative to execution time, but I
> would venture to guess that the plans always end up being the same.

Consider 'select a from t where b like $1': the parameter $1 will
considerably affect the query plan, and a query can't get much simpler
than that... (see the sketch at the end of this mail).

> - The query that takes a while : in this case, planning overhead is
> nil compared to execution time, better replan every time with the
> params.

I guess these queries are not the ones targeted by this feature. In
fact, for these queries it really doesn't matter whether you cache or
not, except that if you know you're going to cache, you'll expend more
effort on planning right, and that can still matter for a query which
runs long. Note that if you don't cache, planning harder will lose in
the long run; only once you cache can you afford to plan harder...

> - The complex query that still executes fast because it doesn't
> process a lot of rows and postgres finds a good plan (for instance, a
> well optimized search query). Those would benefit from reducing the
> planning overhead, but those also typically end up having many
> different plans depending on the search parameters. Besides, those
> queries are likely to be dynamically generated. So, would it be worth
> it to add all those features just to optimize those ? I don't know...

We have dynamically generated queries here which are deliberately
chunked so that they execute in small increments and none of them runs
too long (they would otherwise block vacuuming of vital tables). Those
chunks would greatly benefit from properly planned and cached plans...
(a sketch of such a chunk also follows below).

A really smart system would store canonical plan fragments as responses
to (also canonicalized) parse-tree fragments, and then assemble the
plan out of those fragments - but that would indeed be really complex
(to design, at least; the resulting code might be simpler than one
thinks) ;-)

Cheers,
Csaba.
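
A minimal sketch of the LIKE point above. The table t, its index, and
the indicated plans are illustrative assumptions, not taken from the
thread; PostgreSQL of this era planned a prepared statement once at
PREPARE time, with the parameter value still unknown:

    -- assumed schema: t(a int, b text), with a btree index on b built
    -- with text_pattern_ops so that prefix LIKE patterns can use it
    EXPLAIN SELECT a FROM t WHERE b LIKE 'foo%';
    -- typically an Index Scan: the fixed prefix pins an index range

    EXPLAIN SELECT a FROM t WHERE b LIKE '%foo';
    -- typically a Seq Scan: a leading wildcard defeats the index

    PREPARE q(text) AS SELECT a FROM t WHERE b LIKE $1;
    EXPLAIN EXECUTE q('foo%');
    -- the cached plan is built before $1 is known, so it must be valid
    -- for any pattern: a Seq Scan, even though 'foo%' alone would have
    -- gotten the Index Scan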
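
And a minimal sketch of the chunked queries mentioned above, under
assumed names (big_table, processed, and the chunk size are
hypothetical); each statement touches a bounded number of rows, and one
cached plan serves every chunk:

    -- plan once, execute repeatedly; the application re-issues the
    -- chunk until it reports DELETE 0, so no single query runs too long
    PREPARE purge_chunk(int) AS
      DELETE FROM big_table
      WHERE id IN (SELECT id FROM big_table
                   WHERE processed
                   ORDER BY id
                   LIMIT $1);

    EXECUTE purge_chunk(1000);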