Re: Automatic function replanning - Mailing list pgsql-hackers

From Jim C. Nasby
Subject Re: Automatic function replanning
Date
Msg-id 20051222200343.GI72143@pervasive.com
In response to Re: Automatic function replanning  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Automatic function replanning  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Automatic function replanning  (Lukas Smith <smith@pooteeweet.org>)
List pgsql-hackers
On Wed, Dec 21, 2005 at 11:00:31PM -0500, Bruce Momjian wrote:
> > Track normal resource consumption (ie: tuples read) for planned queries
> > and record parameter values that result in drastically different
> > resource consumption.
> > 
> > This would at least make it easy for admins to identify prepared queries
> > that have a highly variable execution cost.
> 
> We have that TODO already:
> 
>     * Log statements where the optimizer row estimates were dramatically
>       different from the number of rows actually found?

Does the stored plan also save how many rows were expected? Otherwise
I'm not sure how that TODO covers it... If it does then please ignore my
ramblings below. :)

My idea has nothing to do with row estimates. It has to do with the
amount of work actually done to perform a query. Consider this example:

CREATE TABLE queue (status char NOT NULL, queue_item text NOT NULL);
CREATE INDEX queue__status ON queue (status);

Obviously, to process this you'll need a query like:
SELECT * FROM queue WHERE status = 'N'; -- N for New

Say you also occasionally need to see a list of items that have been
processed:
SELECT * FROM queue WHERE status = 'D'; -- D for Done

And let's say you need to keep done items around for 30 days.

Now, if both of these are done using a prepared statement, it's going to
look like:

SELECT * FROM queue WHERE status = ?;

If the first one to run is the queue-processing query, the planner will
probably choose the index, and that plan gets cached. This means that
searching on 'N' reads a fairly small number of tuples, but searching
for 'D' reads a very large number.

What I'm proposing is to keep track of the 'normal' number of tuples
read when executing a prepared query, and to log any execution that
differs substantially. So if finding all the 'N' records normally reads
50 tuples, and the query looking for 'D' records comes along and has to
read 5000 tuples instead, we want to log that. Probably the easiest way
to accomplish this is to store a moving average of tuples read with each
prepared-statement entry.
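To make the idea concrete, here is a minimal sketch in Python (not
anything resembling the backend's C code). The names
(PreparedStatementStats, record_execution), the smoothing factor, and
the 10x logging threshold are all illustrative assumptions, not
proposed values:

```python
# Hypothetical sketch of the proposal: keep an exponential moving
# average of tuples read per prepared statement, and flag any
# execution whose tuples-read count deviates wildly from it.

ALPHA = 0.2           # EMA smoothing factor (assumed, not proposed)
LOG_THRESHOLD = 10.0  # flag executions reading >10x the average (assumed)

class PreparedStatementStats:
    """Per-prepared-statement resource tracking."""

    def __init__(self):
        self.avg_tuples_read = None  # no execution history yet

    def record_execution(self, tuples_read):
        """Update the moving average for this statement.

        Returns True if this execution read substantially more tuples
        than normal and should be logged, False otherwise.
        """
        if self.avg_tuples_read is None:
            # First execution just seeds the average.
            self.avg_tuples_read = float(tuples_read)
            return False
        outlier = tuples_read > self.avg_tuples_read * LOG_THRESHOLD
        # Fold this execution into the moving average either way.
        self.avg_tuples_read = (ALPHA * tuples_read
                                + (1 - ALPHA) * self.avg_tuples_read)
        return outlier
```

With the queue example above, two executions reading ~50 tuples for
'N' would settle the average near 50, and a 'D' execution reading 5000
tuples would then trip the threshold and get logged.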
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

