Re: cache estimates, cache access cost - Mailing list pgsql-hackers

From Cédric Villemain
Subject Re: cache estimates, cache access cost
Date
Msg-id BANLkTi=OpxUTfM9i8sw2uF4Bhx1J13CWSQ@mail.gmail.com
In response to Re: cache estimates, cache access cost  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
2011/5/19 Robert Haas <robertmhaas@gmail.com>:
> On Thu, May 19, 2011 at 8:19 AM, Cédric Villemain
> <cedric.villemain.debian@gmail.com> wrote:
>> 2011/5/19 Robert Haas <robertmhaas@gmail.com>:
>>> On Tue, May 17, 2011 at 6:11 PM, Cédric Villemain
>>> <cedric.villemain.debian@gmail.com> wrote:
>>>> The point is to get ratio in cache, not the distribution of the data
>>>> in cache (pgfincore also allows you to see this information).
>>>> I don't see how a stable (a server in production) system can have its
>>>> ratio moving up and down so fast without known pattern.
>>>
>>> Really?  It doesn't seem that hard to me.  For example, your nightly
>>> reports might use a different set of tables than are active during the
>>> day....
>>
>> yes, this is known pattern, I believe we can work with it.
>
> I guess the case where I agree that this would be relatively static is
> on something like a busy OLTP system.  If different users access
> different portions of the main tables, which parts of each relation
> are hot might move around, but overall the percentage of that relation
> in cache probably won't move around a ton, except perhaps just after
> running a one-off reporting query, or when the system is first
> starting up.

yes.

>
> But that's not everybody's workload.  Imagine a system that is
> relatively lightly used.  Every once in a while someone comes along
> and runs a big reporting query.  Well, the contents of the buffer
> caches might vary considerably depending on *which* big reporting
> queries ran most recently.

Yes, I agree. This scenario is one where oscache_percent and
pgcache_percent are subject to change, I guess. We can define 1/
whether the values can/need to be changed, and 2/ when to update them.
For 2/, database usage may help to trigger an ANALYZE when required.
But to be honest I'd like to hear more about the strategy suggested by
Greg here.

Those scenarios are good to keep in mind for building good indicators,
both for the plugin that does the ANALYZE and for solving 2/.
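
To illustrate 2/, here is a purely hypothetical sketch (the helper name
and the threshold are mine, not part of the patch) of the kind of
trigger I have in mind: refresh the stored ratio when the currently
observed one, e.g. from pgfincore, has drifted too far from what
pg_class remembers since the last ANALYZE.

#include <math.h>
#include <stdbool.h>

/* Hypothetical staleness check: return true when the cache ratio
 * observed now (e.g. via pgfincore) has drifted more than some
 * threshold away from the ratio recorded at the last ANALYZE. */
static bool
cache_ratio_is_stale(double observed_ratio,  /* 0.0 .. 1.0, measured now */
                     double stored_ratio,    /* 0.0 .. 1.0, from pg_class */
                     double drift_threshold) /* e.g. 0.10 */
{
    return fabs(observed_ratio - stored_ratio) > drift_threshold;
}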

>
> Also, even if we knew what was going to be in cache at the start of
> the query, the execution of the query might change things greatly as
> it runs.  For example, imagine a join between some table and itself.
> If we estimate that none of the data is in cache, we will almost
> certainly be wrong, because it's likely both sides of the join are
> going to access some of the same pages.  Exactly how many depends on
> the details of the join condition and whether we choose to implement
> it by merging, sorting, or hashing.  But it's likely going to be more
> than zero.  This problem can also arise in other contexts - for
> example, if a query accesses a bunch of large tables, the tables that
> are accessed later in the computation might be less cached than the
> ones accessed earlier in the computation, because the earlier accesses
> pushed parts of the tables accessed later out of cache.

Yes, I believe the Mackert and Lohman formula has been good so far,
and I never suggested removing it. It will need some rewriting to work
with the new GUC and the new pg_class columns, but the code is already
in place for that.
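
For reference, the approximation in question is roughly what
index_pages_fetched() in costsize.c computes today (simplified sketch
below; the real function also pro-rates effective_cache_size across
the tables and indexes competing for cache, which I leave out here):

#include <math.h>

/* Simplified Mackert-Lohman estimate of the distinct pages fetched when
 * tuples_fetched tuples are read from a relation of T pages with b
 * buffer pages available (cf. index_pages_fetched() in costsize.c). */
static double
ml_pages_fetched(double tuples_fetched, double T, double b)
{
    double pages_fetched;

    if (T <= b)
    {
        /* whole relation fits in cache: result approaches T */
        pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
        if (pages_fetched >= T)
            pages_fetched = T;
    }
    else
    {
        /* relation larger than cache */
        double lim = (2.0 * T * b) / (2.0 * T - b);

        if (tuples_fetched <= lim)
            pages_fetched = (2.0 * T * tuples_fetched) / (2.0 * T + tuples_fetched);
        else
            pages_fetched = b + (tuples_fetched - lim) * (T - b) / T;
    }
    return ceil(pages_fetched);
}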

> Or, if a
> query requires a large sort, and the value of work_mem is very high
> (say 1GB), the sort might evict data from cache.  Now maybe none of
> this matters a bit in practice, but it's something to think about.

Yes, I agree again.

>
> There was an interesting report on a problem along these lines from
> Kevin Grittner a while back.  He found he needed to set seq_page_cost
> and random_page_cost differently for the database user that ran the
> nightly reports, precisely because the degree of caching was very
> different than it was for the daily activity, and he got bad plans
> otherwise.

This is in fact a very interesting use case. I believe the same
strategy can be applied here: update cache_page_cost and pg_class.
But I would really like this to close the use case: seq_page_cost,
random_page_cost and cache_page_cost should not need to be changed;
they should be more 'hardware dependent'. What will need to change in
such a case is the frequency of ANALYZE CACHE (or arbitrarily set
values). That should give the planner and costsize functions accurate
values and let them provide the best plan (again, the cache estimation
for the running query remains in the hands of the Mackert and Lohman
formula).
OK, maybe the user will have to write some ANALYZE CACHE; between some
queries in his scenarios.
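
To make the cost side concrete, the adjustment I picture in the
costsize functions would look roughly like this (a sketch only:
cache_page_cost is the proposed GUC, and rel_cache_fraction stands for
whatever we end up deriving from the oscache_percent / pgcache_percent
columns that ANALYZE CACHE keeps up to date):

/* Sketch: blend cached and uncached page costs using the cache
 * fraction recorded at the last ANALYZE CACHE.  rel_cache_fraction is
 * a placeholder for a value derived from pg_class.oscache_percent /
 * pgcache_percent; cache_page_cost is the proposed GUC. */
static double
blended_page_cost(double pages_fetched,
                  double rel_cache_fraction, /* 0.0 .. 1.0 */
                  double base_page_cost,     /* seq_page_cost or random_page_cost */
                  double cache_page_cost)
{
    double cached_pages = pages_fetched * rel_cache_fraction;
    double uncached_pages = pages_fetched - cached_pages;

    return cached_pages * cache_page_cost + uncached_pages * base_page_cost;
}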

Maybe a good scenario to add to the performance farm? (Like the
others, but this one has the great value of being a production case.)

I'll write those scenarios up on a wiki page so they can be used to
review corner cases and possible issues (not now, it is late here).

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

