Re: patch for new feature: Buffer Cache Hibernation - Mailing list pgsql-hackers

From Mitsuru IWASAKI
Subject Re: patch for new feature: Buffer Cache Hibernation
Date
Msg-id 20110507.022228.83883502.iwasaki@jp.FreeBSD.org
In response to Re: patch for new feature: Buffer Cache Hibernation  (Cédric Villemain <cedric.villemain.debian@gmail.com>)
List pgsql-hackers
Hi, thanks for your comments!
I'm glad to discuss this topic.

>  * pgfadv_WILLNEED
>  * pgfadv_WILLNEED_snapshot
> 
> The former asks to load each segment of a relation, *but* the kernel can
> decide not to do that, or to load only part of each segment (so it is not
> as brutal as cat file > /dev/null).
> The latter reads *exactly* the blocks required in each segment, not all
> blocks, unless all of them were in cache when the snapshot was taken (this
> one is part of the snapshot/restore combo).

Sorry about that, I'm not so familiar with posix_fadvise().
I'll check posix_fadvise() later.
Actually, I used to run a 'cat database_file > /dev/null' script on
other DBMSs before starting them.
# or 'select /*+ INDEX(emp emp_pk) */ count(*) from emp;' to load
# index blocks
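
To make the difference between the two warm-up styles concrete, here is a
minimal sketch in Python, not taken from the patch or from pgfincore. It
assumes a Unix-like system where Python exposes os.posix_fadvise; the
function names read_through and advise_willneed are hypothetical.

```python
import os


def read_through(path, chunk_size=1 << 20):
    """Read the whole file and discard it, like `cat file > /dev/null`.

    Every byte is actually read, so the kernel has no choice in the matter.
    Returns the number of bytes read.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def advise_willneed(path):
    """Hint that the file will be needed soon; the kernel may ignore it.

    Returns True if the hint was issued, False if posix_fadvise is not
    available here or the call was rejected.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0 and len=0 cover the file from the start to its end.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
    except OSError:
        return False
    finally:
        os.close(fd)
    return True
```

The first function forces every block through a read, while the second only
gives advice, so the kernel may load all, part, or none of the file.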

> I may prefer the per relation approach (so you can snapshot and
> restore only the interesting tables/index). Given what I read in your
> patch it looks easy to do, isn't it ?

I would like to keep my patch as simple as possible, because
it is just a hibernation function, not complicated buffer management.
But I want to try improving buffer management during my next vacation.
# currently I'm on an 11-day vacation until Sunday.

My rough idea for improving buffer management is like this:
SQL> alter table table_name buffer pin priority 7;
SQL> alter index index_name buffer pin priority 10;

This DDL sets a 'buffer pin priority' property on the table/index and
also on the buffer descriptors related to that table/index.
Optionally, preloading database files into the FS cache and relation blocks
into the DB cache would be possible.

When a new buffer is required, the buffer manager refers to the priority of
each buffer and selects a victim buffer.

I think it helps batch jobs run under better buffer cache conditions
by giving hints to buffer management.
For example, job-A reads table_A and index_A and writes only table_B:
SQL> alter table table_A buffer pin priority 7;
SQL> alter index index_A buffer pin priority 10;
SQL> alter table table_B buffer pin priority 1;
This keeps the buffers of index_A and table_A (buffers of table_B will
become victims soon).
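
The job-A case can be simulated with a small Python sketch. This is only an
illustration of the rough idea, not PostgreSQL's buffer manager: the
BufferPool class, the default priority of 5, and the tie-break by least
recent use are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import count

DEFAULT_PRIORITY = 5  # assumed default for relations with no priority set


@dataclass
class Buffer:
    relation: str
    block: int
    last_used: int  # logical clock tick of the most recent access


class BufferPool:
    """Toy buffer pool that picks victims by per-relation pin priority."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffers = {}   # (relation, block) -> Buffer
        self.priority = {}  # relation -> pin priority
        self._clock = count(1)

    def set_priority(self, relation, priority):
        self.priority[relation] = priority

    def _victim(self):
        # Lowest priority first; among equals, the least recently used.
        return min(
            self.buffers.values(),
            key=lambda b: (
                self.priority.get(b.relation, DEFAULT_PRIORITY),
                b.last_used,
            ),
        )

    def access(self, relation, block):
        """Touch a block; return the evicted (relation, block) or None."""
        tick = next(self._clock)
        key = (relation, block)
        if key in self.buffers:
            self.buffers[key].last_used = tick
            return None
        evicted = None
        if len(self.buffers) >= self.capacity:
            victim = self._victim()
            evicted = (victim.relation, victim.block)
            del self.buffers[evicted]
        self.buffers[key] = Buffer(relation, block, tick)
        return evicted


def simulate_job_a():
    """Run the job-A access pattern on a 4-buffer pool."""
    pool = BufferPool(capacity=4)
    pool.set_priority("table_A", 7)
    pool.set_priority("index_A", 10)
    pool.set_priority("table_B", 1)
    accesses = [
        ("index_A", 0), ("table_A", 0), ("table_A", 1),
        ("table_B", 0), ("table_B", 1), ("table_B", 2),
    ]
    evicted = [e for e in (pool.access(r, b) for r, b in accesses) if e]
    return evicted, sorted(pool.buffers)
```

In this run only table_B blocks are evicted, while the index_A and table_A
buffers stay resident, which is the behavior the priorities are meant to
produce.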

Buffer pin priority can be reset like this:
SQL> alter system buffer pin priority 5;

Next, job-B reads and writes table_C and reads index_C with preloading:
SQL> alter table table_C buffer pin priority 5;
SQL> alter index index_C buffer pin priority 10 with preloading 50%;
or something like this.

> I also prefer the idea of keeping a map of the Buffer Cache (yes, like
> what I do with pgfincore) rather than storing the data directly and reading
> it directly. This latter part seems a bit dangerous to me, even if it
> looks sane from a normal postgresql stop/start process.

Never mind :)
I added enough validations and will add more.
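
For comparison, a map-based snapshot with validation on restore could look
roughly like the Python sketch below. It is not how the patch stores its
data; the file layout, the MAGIC string, and the function names dump_map and
load_map are hypothetical and exist only to show the kind of checks involved.

```python
import json
import zlib

MAGIC = "buffer-map-sketch-v1"  # hypothetical format identifier


def dump_map(tags):
    """Serialize (relation, block) tags with a header used for validation."""
    body = json.dumps(sorted(tags))
    return json.dumps({
        "magic": MAGIC,
        "count": len(tags),
        "crc32": zlib.crc32(body.encode()),
        "body": body,
    })


def load_map(text, known_relations):
    """Return the validated tags, or [] if the snapshot is unusable.

    Only block addresses are stored, so a bad snapshot costs a cold
    cache at worst; no page contents are trusted from the file.
    """
    try:
        doc = json.loads(text)
        body = doc["body"]
        if doc["magic"] != MAGIC:
            return []
        if zlib.crc32(body.encode()) != doc["crc32"]:
            return []
        tags = [tuple(t) for t in json.loads(body)]
        if len(tags) != doc["count"]:
            return []
        # Skip entries for relations that no longer exist.
        return [(r, b) for r, b in tags if r in known_relations]
    except (ValueError, KeyError, TypeError, AttributeError):
        return []
```

A restore step would then read each returned block through the normal
buffer manager path rather than copying saved page images into memory.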

> better than me, and anyway your patch remains very easy to read in any case.

Thanks a lot!  My policy for experimental implementations is to keep them
easy to read so that people understand my idea quickly.
That's why my first patch doesn't have enough error checking ;)

Thanks



