Re: Cached Query Plans (was: global prepared statements) - Mailing list pgsql-hackers

From PFC
Subject Re: Cached Query Plans (was: global prepared statements)
Date
Msg-id op.t9ksc5mjcigqcu@apollo13.peufeu.com
In response to Re: Cached Query Plans (was: global prepared statements)  ("Dawid Kuroczko" <qnex42@gmail.com>)
Responses Re: Cached Query Plans (was: global prepared statements)  (Csaba Nagy <nagy@ecircle-ag.com>)
List pgsql-hackers
> Why limit ourselves to Oracle?  Here's how all major proprietary RDBMSs do it.

Thanks for the links. Very interesting.
The DB2 document especially mentions an important point: in order to
make their planner/optimizer smarter, they had to make it slower, hence
it became crucial to cache the plans.
Contrast this with MySQL, where using prepared statements gains
nothing: the "optimizer" does so little work that it actually doesn't
matter.

So, basically, Oracle:
- Parses the query every time (identifies tables, checks permissions,
etc.) (soft parse)
- From that parsed query, looks up a cached plan (the lookup key could
then be different depending on the schema, etc.)
- If no cached plan is found, plans the query (hard parse).
Also, the Oracle doc mentions that soft parsing should be avoided by
using prepared statements in the application (i.e. parse once and bind
lots of times).
So, Oracle will redo the parsing + permission check each time, unless
prepared statements are used, in which case it's direct execution.
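
To make the "parse once and bind lots of times" point concrete, here is
a minimal (untested) libpq sketch using the existing per-session
prepared statements; the connection string, table name and statement
name are made up for the example. This only avoids re-parsing and
re-planning within one connection, which is exactly what a shared cache
would extend across backends.

/*
 * Parse once, bind many times: the client-side pattern the Oracle doc
 * recommends, done with libpq's per-session prepared statements.
 * Hypothetical table "users" and statement name "get_user".
 */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn   *conn = PQconnectdb("dbname=test");
    PGresult *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* Parse + plan happen once, server side. */
    res = PQprepare(conn, "get_user",
                    "SELECT name FROM users WHERE id = $1", 1, NULL);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "prepare failed: %s", PQerrorMessage(conn));
    PQclear(res);

    /* Bind + execute many times: no re-parse, no re-plan. */
    for (int i = 1; i <= 3; i++)
    {
        char        id[12];
        const char *params[1] = { id };

        snprintf(id, sizeof(id), "%d", i);
        res = PQexecPrepared(conn, "get_user", 1, params, NULL, NULL, 0);
        if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0)
            printf("user %d: %s\n", i, PQgetvalue(res, 0, 0));
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}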

And DB2:
Er, the document is not very clear about what it actually does, but the
stats look nice ;)

> I liked your global prepared statements idea much better. Naming the
> statements is no problem: DB frontends do that for you anyway
> sometimes.

Hm. The "global statements" and the cache would complement each other  
actually. Why not.

When the user wants to name the statements, he can do so (and perhaps
control who can execute what, etc., like with stored procs).
Permission-checking overhead will be there at each execution.
Should the plan be cached locally? (RAM consumption times N backends...)
Cached per user once permissions have been checked? (avoids the
overhead of rechecking permissions)
What about the search path?
(I'd force the global statements to use the default search path no
matter what; being explicit is better than "why does it stop
working?")
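
To make those questions concrete, a hypothetical sketch of what the
lookup key of a shared plan cache would have to carry (none of these
names exist in the backend, this is illustration only): anything that
changes what the statement text means has to be part of the key,
otherwise a plan could leak across users or schemas. If the search path
were pinned to the default as I suggest above, the search_path field
would simply go away.

#include <stdint.h>

typedef uint32_t Oid;           /* stand-in for the backend's Oid type */

/*
 * Hypothetical key for a shared plan cache (illustration only).
 * A plan cached under one role or search_path must never be reused
 * under another, so both have to be part of the key.
 */
typedef struct GlobalPlanKey
{
    Oid      userid;            /* role the permissions were checked for */
    uint32_t search_path_hash;  /* hash of the normalized search_path */
    uint32_t query_hash;        /* hash of the parameterized query text */
} GlobalPlanKey;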

Can the application or the database library name the statements?
I'm not so sure. This could work for compiled languages (but what about
when you run several applications? Or several versions of the same
application? Do we need statement names to be unique across all
developers all over the world?) Solution: make each application use a
different user name, and make global prepared statements visible only
to the user that created them, perhaps. This conflicts with some
desirable features, though. It needs more thinking.

What about non-compiled languages? It will not be possible to generate
a list of statements beforehand... And queries are also constructed
dynamically by frameworks such as Rails, which makes naming them
impossible, but caching the plans would work well.

So, some situations would benefit from a plan cache, and others from
named global statements.

> Frankly, I think you're better off storing them in a table. Shared
> memory is a limited resource and you cannot change how much you've
> allocated after the server has started.

I'd say that unless you have a perverse application that will try all
the permutations of column names just to make sure the query is
different every time, how many different queries would you want to
cache?... Probably fewer than 1000, so it wouldn't take more than a
couple of megabytes...

> It does mean you'll have to serialise/deserialise them, but this will
> be cheaper than replanning, right?

What would be the overhead of a catalog lookup to get a cached plan for
a statement that returns one row? Would the catalog cache make it fast
enough? And what about deserialization?...

> I am not too sure that plans and statistical counters should be stored
> together...
Not sure either.

> Probably plans should go in one place, and statistics should go to the
> stats collector (I know he's not quite ready for this ;)).
That's the problem...

> Hm, a limit on how much memory can be used for plans
> (query_plan_cache_size GUC?), and a LRU/LFU expiration
> of old plans?
Now it gets hairy ;)
Yes, memory size should be limited. But how to make an LRU cleaner
which doesn't create lots of contention?... Luckily, with a hash having
a fixed number of buckets, it is easier (clean a bucket every N
seconds, for instance).
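
Something like this (single-process sketch, all names invented; the
lock comments mark where per-bucket locking would go in a shared-memory
version): the cleaner only ever touches one bucket per pass, so it
never blocks lookups in the other buckets for long.

#include <stdlib.h>
#include <time.h>

#define NBUCKETS 64

/* Hypothetical plan-cache entry, illustration only. */
typedef struct PlanEntry
{
    struct PlanEntry *next;
    time_t            last_used;    /* updated on every cache hit */
    /* ... cached plan payload ... */
} PlanEntry;

static PlanEntry *buckets[NBUCKETS];

/*
 * Sweep one bucket, evicting entries idle for more than max_idle
 * seconds.  Call this every N seconds; each call moves on to the next
 * bucket, so only one bucket is ever locked by the cleaner.
 */
static void
sweep_one_bucket(int *cursor, time_t max_idle)
{
    int         b = *cursor;
    time_t      now = time(NULL);
    PlanEntry **link = &buckets[b];

    *cursor = (b + 1) % NBUCKETS;   /* next call sweeps the next bucket */

    /* take bucket b's lock here */
    while (*link != NULL)
    {
        PlanEntry *e = *link;

        if (now - e->last_used > max_idle)
        {
            *link = e->next;        /* unlink and free the idle entry */
            free(e);
        }
        else
            link = &e->next;
    }
    /* release bucket b's lock here */
}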

> Perhaps a GUC for controlling query cache should heve three values:
>  none -- don't cache any statement
>  smart -- use heuristics for deciding whether to cache it
>  all -- force caching all queries -- for uncommon/statistical/testing  
> purposes.
I would not volunteer to write that heuristic ;)
Although there would be a very simple solution: if time to parse > some
percentage of time to execute, then cache.
The hairiness is in the plan's dependence (or independence) on
parameter values: ideally we only want to cache plans that would be
good for all parameter values, and only the user knows that precisely.
Although it could be possible to examine the column histograms...
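
The "very simple solution" above, written down (the 10% threshold is an
arbitrary placeholder and would presumably be a GUC; the plan and
execute timings would come from instrumentation):

#include <stdbool.h>

/*
 * Hypothetical "smart" mode decision: cache a plan only when planning
 * (the hard parse) is expensive relative to execution.
 */
static bool
worth_caching(double plan_ms, double exec_ms)
{
    return plan_ms > 0.10 * exec_ms;
}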

>>  (like mysql, /* flags */ SELECT blah )
>
> I don't like the hint flags.  They tend to haunt later on (when the  
> database gets smarter, but application forces it to be dumb).  I would  
> say a GUC.
I don't like them either... It needs a better solution, like a flag in
PQexecParams, but that would cause lots of trouble, so it's not really
possible...
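
For reference, the existing prototype (from libpq-fe.h): there is no
per-call argument left to hijack for a "cache this" flag, so it would
mean a new function or an API break, hence the trouble.

/* Current libpq prototype, for reference (see libpq-fe.h). */
PGresult *PQexecParams(PGconn *conn,
                       const char *command,
                       int nParams,
                       const Oid *paramTypes,
                       const char *const *paramValues,
                       const int *paramLengths,
                       const int *paramFormats,
                       int resultFormat);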



