Re: Per-table random_page_cost for tables that we know are always cached - Mailing list pgsql-hackers

From Greg Smith
Subject Re: Per-table random_page_cost for tables that we know are always cached
Msg-id Pine.GSO.4.64.0804230111220.22837@westnet.com
In response to Per-table random_page_cost for tables that we know are always cached  (PFC <lists@peufeu.com>)
Responses Re: Per-table random_page_cost for tables that we know are always cached  (PFC <lists@peufeu.com>)
Re: Per-table random_page_cost for tables that we know are always cached  (Decibel! <decibel@decibel.org>)
List pgsql-hackers
On Tue, 22 Apr 2008, PFC wrote:

> Example : let's imagine a "cache priority" setting.

Which we can presume the DBA will set incorrectly because the tools needed 
to set that right aren't easy to use.

> An alternative would be for the background writer to keep some stats and do 
> the thing for us :
> - begin bgwriter scan
> - setup hashtable of [relid => page count]
> - at each page that is scanned, increment "page count" for this 
> relation...
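The quoted bgwriter idea — accumulate a per-relation page count while sweeping the buffer pool — can be sketched as a toy simulation (this is not PostgreSQL internals; the buffer-pool representation and relation OIDs here are made up for illustration):

```python
from collections import defaultdict

def bgwriter_scan(buffer_pool):
    """One simulated background-writer pass: tally how many cached
    pages belong to each relation (a toy model, not actual PG code)."""
    pages_per_rel = defaultdict(int)
    for page in buffer_pool:
        # Each buffer header records which relation the page belongs to,
        # so the count comes essentially for free during the sweep.
        pages_per_rel[page["relid"]] += 1
    return dict(pages_per_rel)

# Hypothetical pool: relation 16384 has 3 cached pages, 16385 has 1.
pool = [{"relid": 16384}, {"relid": 16385},
        {"relid": 16384}, {"relid": 16384}]
print(bgwriter_scan(pool))  # {16384: 3, 16385: 1}
```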

I've already got a plan sketched out that does this; I didn't manage to get 
it finished in time for 8.3.  What I wanted it for was not this purpose, 
but for instrumentation of what's in the cache that admins can look at. 
Right now you can get that out of pg_buffercache, but that's kind of 
intrusive because of the locks it takes.  In many cases I'd be perfectly 
happy with an approximation of what's inside the buffer cache, accumulated 
while the page header is being locked anyway as the BGW passes over it. 
And as you note, having this data available can be handy for internal 
self-tuning as well once it's out there.

Jim threw out that you can just look at the page hit percentages instead. 
That's not completely true.  If you've had some nasty query blow out your 
buffer cache, or if the server has been up a looong time and the total 
stats don't really reflect recent reality, what's in the buffer cache and 
what the stats say has historically been cached can diverge.
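A toy illustration of that divergence, with invented numbers: cumulative hit-rate statistics accumulated over a long uptime can still report a table as "hot" even though a recent cache blowout left none of its pages resident.

```python
# Accumulated over months of uptime (hypothetical counters):
total_hits, total_reads = 9_000_000, 10_000_000
# Current state right after a nasty query evicted the table's pages:
cached_now, table_pages = 0, 10_000

historical_hit_rate = total_hits / total_reads      # looks 90% cached
current_cached_fraction = cached_now / table_pages  # actually stone cold
print(historical_hit_rate, current_cached_fraction)
```

A planner trusting the historical number would still price this table's pages as cheap cached reads.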

> This would not examine whatever is in the OS' cache, though.

I don't know that it's too unrealistic to model the OS as just being an 
extrapolated bigger version of the buffer cache.  I can think of a couple 
of ways those can diverge:

1) Popular pages that get high usage counts can end up with a higher 
representation in shared_buffers than the OS

2) If you've been doing something like a bulk update, you can have lots 
of pages that have been written recently in the OS cache that aren't 
really accounted for fully in shared_buffers, because they never get a 
high enough usage count to stay there (only used once) but can fill the OS 
cache as they're spooled up to write.

I'm not sure that either of these cases is so strong it invalidates your 
basic idea, though.  There's a pending 8.4 TODO to investigate whether 
increasing the maximum usage count a buffer can get would be an 
improvement.  If that number got bumped up I could see (2) become more of 
a problem.
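The interaction between usage counts and case (2) above can be sketched with a simplified clock-sweep pass (a sketch of the general algorithm, not PostgreSQL's actual implementation): a bulk-loaded page touched only once reaches usage count zero long before a frequently hit page does, so it gets evicted from shared_buffers while it may still be sitting in the OS cache waiting to be written.

```python
def clock_sweep_evict(buffers):
    """Simplified clock sweep: circle the pool decrementing usage
    counts; the first buffer found at zero is the eviction victim."""
    i = 0
    while True:
        buf = buffers[i % len(buffers)]
        if buf["usage"] == 0:
            return buf  # victim: this slot gets reclaimed
        buf["usage"] -= 1
        i += 1

# A popular index page vs. a bulk-loaded heap page touched exactly once.
buffers = [{"name": "hot_index_page", "usage": 5},
           {"name": "bulk_load_page", "usage": 1}]
victim = clock_sweep_evict(buffers)
print(victim["name"])  # bulk_load_page
```

Raising the maximum usage count widens that gap: popular pages survive even more sweeps, while once-used bulk pages are still gone on the first full pass.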

I'd be somewhat concerned about turning this mechanism on by default 
though, at least at first.  A hybrid approach that gives the DBA some 
control might work well.  Maybe have an "adjust estimates for cache 
contents" knob that you can toggle on a per-session or per-table basis?

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

