Re: Avoiding bad prepared-statement plans. - Mailing list pgsql-hackers

From Pierre C
Subject Re: Avoiding bad prepared-statement plans.
Date
Msg-id op.u8bvvnwveorkce@localhost
Whole thread Raw
In response to Re: Avoiding bad prepared-statement plans.  (Greg Stark <gsstark@mit.edu>)
Responses Re: Avoiding bad prepared-statement plans.
List pgsql-hackers
On Tue, 16 Feb 2010 15:22:00 +0100, Greg Stark <gsstark@mit.edu> wrote:

> There's a second problem though. We don't actually know how long any
> given query is going to take to plan or execute. We could just
> remember how long it took to plan and execute last time or how long it
> took to plan last time and the average execution time since we cached
> that plan. Perhaps we should track the stddev of the execution plan,
> or the max execution time of the plan? Ie there are still unanswered
> questions about the precise heuristic to use but I bet we can come up
> with something reasonable.

This could be an occasion to implement plan caching...

Web 2.0 = AJAX means less need for heavy webpage reloads with (usually)  
lots of queries, and more small simple queries like selects returning 1 or  
a few rows every time the user clicks on something.

See benchmark here : (PG 8.4.2, MYSQL 5.1.37)
http://purity.bobfuck.net/posts/postgres/2010-02-Prep/

If prepared statements are used, MySQL is not faster for "small, simple  
selects"...
However, when not using prepared statements, most of the postmaster CPU  
time is spent parsing & planning.

Problem with prepared statements is they're a chore to use in web apps,  
especially PHP, since after grabbing a connection from the pool, you don't  
know if it has prepared plans in it or not.

The postmaster could keep a hash of already prepared plans, using the  
$-parameterized query as a hash key, and when it receives parse+bind  
message, look up in this cache and fetch plans for the query, avoiding  
planning entirely.

This could be done by the connection pooler too, but it doesn't have the  
information to decide wether it's wise to cache a plan or not.

Of course all the subtility is to determine if the plan is reusable with  
other parameters...

- after planning and executing the query, only cache it if the plan time  
is a significant part of the query time (as said previously).
- only simple queries should be automatically cached like this
- perhaps some measure of "plan volatility" ? For the examples I give in  
the link above, it's quite easy at least in 2 of the cases : searching  
UNIQUE columns can't return more than 1 row, so volatility is zero. It  
only depends on the table size.


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: A thought: should we run pgindent now?
Next
From: Tom Lane
Date:
Subject: Re: A thought: should we run pgindent now?