From: Robert Haas
Subject: Re: [WIP] cache estimates, cache access cost
Msg-id: BANLkTi=bJTxii58qYekMkRr7+bhx20B4kA@mail.gmail.com
In response to: Re: [WIP] cache estimates, cache access cost (Cédric Villemain <cedric.villemain.debian@gmail.com>)
List: pgsql-hackers
On Tue, Jun 14, 2011 at 12:06 PM, Cédric Villemain
<cedric.villemain.debian@gmail.com> wrote:
>> 1. ANALYZE happens far too infrequently to believe that any data taken
>> at ANALYZE time will still be relevant at execution time.
>
> ANALYZE happens when people execute it; otherwise it is auto-analyze, and
> I am not providing auto-analyze-oscache.
> ANALYZE OSCACHE is just a very simple wrapper to update pg_class. The
> frequency is not important here, I believe.
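
As I understand it, the flow is roughly the following, where reloscache is a
placeholder for whatever the patch actually names the new pg_class column,
and the syntax is approximate:

    -- hypothetical sketch, not committed syntax or catalog structure
    ANALYZE OSCACHE pgbench_accounts;   -- probe the OS cache, store the fraction in pg_class
    SELECT relname, relpages, reloscache
      FROM pg_class
     WHERE relname = 'pgbench_accounts';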

Well, I'm not saying you have to have all the answers to post a WIP
patch, certainly.  But in terms of getting something committable, it
seems like we need to have at least an outline of what the long-term
plan is.  If ANALYZE OSCACHE is an infrequent operation, then the data
isn't going to be a reliable guide to what will happen at execution
time...

>> 2. Using data gathered by ANALYZE will make plans less stable, and our
>> users complain not infrequently about the plan instability we already
>> have, therefore we should not add more.

...and if it is a frequent operation then it's going to result in
unstable plans (and maybe pg_class bloat).  There's a fundamental
tension here that I don't think you can just wave your hands at.

> I was trying to split the patch size by group of features to reduce
> its size. The work is in progress.

Totally reasonable, but I can't see committing any of it without some
evidence that there's light at the end of the tunnel.  No performance
tests *whatsoever* have been done.  We can debate the exact amount of
evidence that should be required to prove that something is useful
from a performance perspective, but we at least need some.  I'm
beating on this point because I believe that the whole idea of trying
to feed this information back into the planner is going to turn out to
be something that we don't want to do.  I think it's going to turn out
to have downsides that are far larger than the upsides.  I am
completely willing to be proven wrong, but right now I think this
will make things worse and you think it will make things better and I
don't see any way to bridge that gap without doing some measurements.

For example, if you run this patch on a system and subject that system
to a relatively even workload, how much do the numbers bounce around
between runs?  What if you vary the workload, so that you blast it
with OLTP traffic at some times and then run reporting queries at
other times?  Or different tables become hot at different times?
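
Even something crude would tell us a lot: run pgbench against the table,
re-run the measurement every few minutes, and keep the samples so the spread
is visible.  (A sketch; oscache_samples and reloscache are names I made up
for illustration, reloscache standing in for whatever column the patch
actually adds to pg_class.)

    CREATE TABLE oscache_samples (ts timestamptz, relname name, cached float4);

    -- repeat while the workload runs; the variance across samples is the
    -- number I want to see before trusting this at plan time
    ANALYZE OSCACHE pgbench_accounts;
    INSERT INTO oscache_samples
    SELECT now(), relname, reloscache
      FROM pg_class
     WHERE relname = 'pgbench_accounts';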

Once you've written code to make the planner do something with the
caching % values, then you can start to explore other questions.  Can
you generate plan instability, especially on complex queries, which
are more prone to change quickly based on small changes in the cost
estimates?  Can you demonstrate a workload where bad performance is
inevitable with the current code, but with your code, the system
becomes self-tuning and ends up with good performance?  What happens
if you have a large cold table with a small hot end where all activity
is concentrated?
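
To make the sensitivity concrete, here is some toy arithmetic (illustrative
numbers only, not the patch's cost model): call a cached page 0.01 and an
uncached page seq_page_cost = 1.0, and blend them by the measured fraction.
I hard-code two fractions here rather than read the hypothetical reloscache
column.

    -- toy blend: per-page cost = frac * 0.01 + (1 - frac) * 1.0
    SELECT relpages,
           relpages * (0.99 * 0.01 + 0.01 * 1.0) AS io_cost_if_99pct_cached,
           relpages * (0.90 * 0.01 + 0.10 * 1.0) AS io_cost_if_90pct_cached
      FROM pg_class
     WHERE relname = 'pgbench_accounts';

A nine-point drop in the measured fraction makes the estimated I/O cost
roughly five times larger, which is the kind of swing that can flip join
orders on a complex query.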

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

