Re: Priority table or Cache table - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Priority table or Cache table
Date
Msg-id CAFjFpRdz-6XB=XuU5CPSMe6MgQowmhwjcOPPdYWYpi3bXhchrg@mail.gmail.com
Whole thread Raw
In response to Re: Priority table or Cache table  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Responses Re: Priority table or Cache table
List pgsql-hackers



On Thu, Feb 20, 2014 at 10:23 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
On Thu, Feb 20, 2014 at 2:26 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 20, 2014 at 6:24 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > I want to propose a new feature called "priority table" or "cache
>> > table".
>> > This is same as regular table except the pages of these tables are
>> > having
>> > high priority than normal tables. These tables are very useful, where a
>> > faster query processing on some particular tables is expected.
>>
>> Why exactly does the existing LRU behavior of shared buffers not do
>> what you need?
>
>
> Lets assume a database having 3 tables, which are accessed regularly. The
> user is expecting a faster query results on one table.
> Because of LRU behavior which is not happening some times.

I think this will not be a problem for regularly accessed tables(pages),
as per current algorithm they will get more priority before getting
flushed out of shared buffer cache.
Have you come across any such case where regularly accessed pages
get lower priority than non-regularly accessed pages?

Because of other regularly accessed tables, some times the table which expects faster results is getting delayed.

The solution involving buffer pools partitions the buffer cache in separate pools explicitly. The way PostgreSQL buffer manager works, for a regular pattern table accesses the buffer cache automatically reaches a stable point where the number of buffers containing pages belonging to a particular table starts to stabilize. Thus at an equilibrium point for given access pattern, the buffer cache automatically gets partitioned by the tables, each using its share of buffers. So, solution using buffer pools seems useless.

PFA some scripts, which I used to verify the behaviour. The scripts create two tables, one large and other half it's size (buffer_usage_objects.sql). The other script contains few queries which will simulate a simple table access pattern by running select count(*) on these tables N times. The same script contains query of pg_buffercache view provided by pg_buffercache extension. This query counts the number of buffers uses by either of these tables. So, if you run three session in parallel, two querying either of the tables and the third taking snapshot of buffer usage per table, you would be able to see this partitioning.
 
 
However it might be required for cases where user wants to control
such behaviour and pass such hints through table level option or some
other way to indicate that he wants more priority for certain tables
irrespective
of their usage w.r.t other tables.

Now I think here important thing to find out is how much helpful it is for
users or why do they want to control such behaviour even when Database
already takes care of such thing based on access pattern.
 
Yes it is useful in cases where the application always expects the faster results whether the table is used regularly or not.

In such case, it might be valuable to see if we should play with the maximum usage parameter, which is set to 5 currently.
 54 #define BM_MAX_USAGE_COUNT  5

 
Regards,
Hari Babu
Fujitsu Australia



--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachment

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: inherit support for foreign tables
Next
From: "MauMau"
Date:
Subject: Re: [bug fix] pg_ctl fails with config-only directory