Re: Per-table random_page_cost for tables that we know are always cached - Mailing list pgsql-hackers

From PFC
Subject Re: Per-table random_page_cost for tables that we know are always cached
Date
Msg-id op.t92ewrzscigqcu@apollo13.peufeu.com
Whole thread Raw
In response to Re: Per-table random_page_cost for tables that we know are always cached  (Greg Smith <gsmith@gregsmith.com>)
List pgsql-hackers
>> Example : let's imagine a "cache priority" setting.
>
> Which we can presume the DBA will set incorrectly because the tools  
> needed to set that right aren't easy to use.
LOL, yes.

> Jim threw out that you can just look at the page hit percentages  
> instead. That's not completely true.  If you've had some nasty query  
> blow out your buffer cache, or if the server has been up a looong time  
> and the total stas don't really reflect recent reality, what's in the  
> buffer cache and what the stats say have been historical cached can  
> diverge.
Yes :
- perform huge query on table A
- table A is now in cache
- perform huge query on table B
- table B is now in cache, A isn't
- perform huge query on table A again
- postgres still thinks table A is cached and chooses a bad plan

>> This would not examine whatever is in the OS' cache, though.
Yeah, but now that shared_buffers can be set to a large part of physical  
RAM, does it still matters ?Point is, postgres knows what is in the shared_buffers, so it can make a  
good decision. Postgres doesn't know what the OS has in cache, so it could  
only make a wild guess. I would rather err on the side of safety...

>
> I don't know that it's too unrealistic to model the OS as just being an  
> extrapolated bigger version of the buffer cache.  I can think of a  
> couple of ways those can diverge:
>
> 1) Popular pages that get high usage counts can end up with a higher  
> representation in shared_buffers than the OS
>
> 2) If you've being doing something like a bulk update, you can have lots  
> of pages that have been written recently in the OS cache that aren't  
> really accounted for fully in shared_buffers, because they never get a  
> high enough usage count to stay there (only used once) but can fill the  
> OS cache as they're spooled up to write.
Especially on CHECKPOINT


pgsql-hackers by date:

Previous
From: Zoltan Boszormenyi
Date:
Subject: Re: [RFC] Localized literals
Next
From: "Zeugswetter Andreas OSB SD"
Date:
Subject: Re: Per-table random_page_cost for tables that we know are always cached