Priority table or Cache table - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Priority table or Cache table
Date
Msg-id CAJrrPGdz=RvBKWY8-hZWzOgjH5uHCiNFyQbLxiPVRhBDKPmi3g@mail.gmail.com
Whole thread Raw
Responses Re: Priority table or Cache table  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

I want to propose a new feature called "priority table" or "cache table". This is same as regular table except the pages of these tables are having high priority than normal tables. These tables are very useful, where a faster query processing on some particular tables is expected.

The same faster query processing can be achieved by placing the tables on a tablespace of ram disk. In this case there is a problem of data loss in case of system shutdown. To avoid this there is a need of continuous backup of this tablespace and WAL files is required. The priority table feature will solve these problems by providing the similar functionality.

User needs a careful decision in deciding how many tables which require a faster access, those can be declared as priority tables and also these tables should be in small in both number of columns and size.


New syntax:

create [priority] Table ...;

or

Create Table .. [ buffer_pool = priority | default ];

By adding a new storage parameter of buffer_pool to specify the type of buffer pool this table can use.

The same can be extended for index also.


Solution -1:

This solution may not be a proper one, but it is simple. So while placing these table pages into buffer pool, the usage count is changed to double max buffer usage count instead of 1 for normal tables. Because of this reason there is a less chance of these pages will be moved out of buffer pool. The queries which operates on these tables will be faster because of less I/O. In case if the tables are not used for a long time, then only the first query on the table will be slower and rest of the queries are faster.

Just for test, a new bool member can be added to RELFILENODE structure to indicate the table type is priority or not. Using this while loading the page the usage count can be modified.

The pg_buffercache output of a priority table:

postgres=# select * from pg_buffercache where relfilenode=16385;
 bufferid | relfilenode | reltablespace | reldatabase | relforknumber | relblocknumber | isdirty | usagecount
-----------+---------------+-------------------+-----------------+--------------------+---------------------+---------+------------
       270 |        16385 |               1663 |          12831 |                       0 |                          0 |    t       |         10


Solution - 2:

By keeping an extra flag in the buffer to know whether the buffer is used for a priority table or not? By using this flag while replacing a buffer used for priority table some extra steps needs to be taken care like
1. Only another page of priority table can replace this priority page.
2. Only after at least two complete cycles of clock sweep, a normal table page can replace this.

In this case the priority buffers are present in memory for long time as similar to the solution-1, but not guaranteed always.


Solution - 3:

Create an another buffer pool called "priority buffer pool" similar to shared buffer pool to place the priority table pages. A new guc parameter called "priority_buffers" can be added to the get the priority buffer pool size from the user. The Maximum limit of these buffers can be kept smaller value to make use of it properly.

As an extra care, whenever any page needs to move out of the priority buffer pool a warning is issued, so that user can check whether the configured the priority_buffers size is small or the priority tables are grown too much as not expected?

In this case all the pages are always loaded into memory thus the queries gets the faster processing.

IBM DB2 have the facility of creating one more buffer pools and fixing specific tables and indexes into them. Oracle is also having a facility to specify the buffer pool option as keep or recycle.

I am preferring syntax-2 and solution-3. please provide your suggestions/improvements.

Regards,
Hari Babu
Fujitsu Australia

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: should we add a XLogRecPtr/LSN SQL type?
Next
From: Tom Lane
Date:
Subject: Re: narwhal and PGDLLIMPORT