Proposing pg_hibernate - Mailing list pgsql-hackers

From Gurjeet Singh
Subject Proposing pg_hibernate
Date
Msg-id CABwTF4Ui_anAG+ybseFunAH5Z6DE9aw2NPdy4HryK+M5OdXCCA@mail.gmail.com
Whole thread Raw
Responses Re: Proposing pg_hibernate  (Gurjeet Singh <gurjeet@singh.im>)
Re: Proposing pg_hibernate  (Cédric Villemain <cedric@2ndquadrant.com>)
List pgsql-hackers
Please find attached the pg_hibernate extension. It is a set-it-and-forget-it solution to enable hibernation of Postgres shared-buffers. It can be thought of as an amalgam of pg_buffercache and pg_prewarm.

It uses the background worker infrastructure. It registers one worker process (BufferSaver) to save the shared-buffer metadata when server is shutting down, and one worker per database (BlockReader) when restoring the shared buffers.

It stores the buffer metadata under $PGDATA/pg_database/, one file per database, and one separate file for global objects. It sorts the list of buffers before storage, so that when it encounters a range of consecutive blocks of a relation's fork, it stores that range as just one entry, hence reducing the storage and I/O overhead.

On-disk binary format, which is used to create the per-database save-files, is defined as:
1. One or more relation filenodes; stored as r<relfilenode>.
2. Each realtion is followed by one or more fork number; stored as f<forknumber>
3. Each fork number is followed by one or more block numbers; stored as b<blocknumber>
4. Each block number is followed by zero or more range numbers; stored as N<number>

  {r {f {b N* }+ }+ }+

Possible enhancements:
- Ability to save/restore only specific databases.
- Control how many BlockReaders are active at a time; to avoid I/O storms.
- Be smart about lowered shared_buffers across the restart.
- Different modes of reading like pg_prewarm does.
- Include PgFincore functionality, at least for Linux platforms.

The extension currently works with PG 9.3, and may work on 9.4 without any changes; I haven't tested, though.  If not, I think it'd be easy to port to HEAD/PG 9.4. I see that 9.4 has put a cap on maximum background workers via a GUC, and since my aim is to provide a non-intrusive no-tuning-required extension, I'd like to use the new dynamic-background-worker infrastructure in 9.4, which doesn't seem to have any preset limits (I think it's limited by max_connections, but I may be wrong). I can work on 9.4 port, if there's interest in including this as a contrib/ module.

To see the extension in action:

.) Compile it.
.) Install it.
.) Add it to shared_preload_libraries.
.) Start/restart Postgres.
.) Install pg_buffercache extension, to inspect the shared buffers.
.) Note the result of pg_buffercache view.
.) Work on your database to fill up the shared buffers.
.) Note the result of pg_buffercache view, again; there should be more blocks than last time we checked.
.) Stop and start the Postgres server.
.) Note the output of pg_buffercache view; it should contain the blocks seen just before the shutdown.
.) Future server restarts will automatically save and restore the blocks in shared-buffers.

The code is also available as Git repository at https://github.com/gurjeet/pg_hibernate/

Demo:

$ make -C contrib/pg_hibernate/
$ make -C contrib/pg_hibernate/ install
$ vi $B/db/data/postgresql.conf
$ grep shared_preload_libraries $PGDATA/postgresql.conf
shared_preload_libraries = 'pg_hibernate'   # (change requires restart)

$ pgstart
waiting for server to start.... done
server started

$ pgsql -c 'create extension pg_buffercache;'
CREATE EXTENSION

$ pgsql -c 'select count(*) from pg_buffercache where relblocknumber is not null group by reldatabase;'
 count
-------
   163
    14
(2 rows)

$ pgsql -c 'create table test_hibernate as select s as a, s::char(1000) as b from generate_series(1, 100000) as s;'
SELECT 100000

$ pgsql -c 'create index on test_hibernate(a);'
CREATE INDEX

$ pgsql -c 'select count(*) from pg_buffercache where relblocknumber is not null group by reldatabase;'
 count
-------
  2254
    14
(2 rows)

$ pgstop
waiting for server to shut down....... done
server stopped

$ pgstart
waiting for server to start.... done
server started

$ pgsql -c 'select count(*) from pg_buffercache where relblocknumber is not null group by reldatabase;'
 count
-------
  2264
    17
(2 rows)

There are a few more blocks than the time they were saved, but all the blocks from before the restart are present in shared buffers after the restart.

Best regards,
Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: [COMMITTERS] pgsql: Include planning time in EXPLAIN ANALYZE output.
Next
From: Peter Geoghegan
Date:
Subject: Re: Failure while inserting parent tuple to B-tree is not fun