Re: Proposing pg_hibernate - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Proposing pg_hibernate
Date
Msg-id CAA4eK1Kue1iPWGnshJZtd5i0ub1M=g1mr-tfZbhdyuoOrHkUGA@mail.gmail.com
In response to Re: Proposing pg_hibernate  (Gurjeet Singh <gurjeet@singh.im>)
Responses Re: Proposing pg_hibernate
List pgsql-hackers
On Fri, Jun 6, 2014 at 5:31 PM, Gurjeet Singh <gurjeet@singh.im> wrote:
> On Thu, Jun 5, 2014 at 11:32 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

> > Buffer saver process itself can crash while saving or restoring
> > buffers.
>
> True. That may lead to a partial list of buffers being saved. And the
> code in the Reader process tries hard to read only valid data, and punts
> at the first sight of data that doesn't make sense or on an ERROR raised
> from a Postgres API call.

In spite of the Reader process trying hard, I think we should ensure by
some other means that the file saved by the buffer saver is valid (maybe
first write to a tmp file and then rename it, or something else).
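
To make the tmp-file-and-rename idea concrete, here is a minimal sketch in
plain POSIX C. It is not taken from pg_hibernate; the function name
save_file_atomically and the ".tmp" suffix are assumptions made for
illustration only. It relies on rename() replacing the target atomically
on POSIX filesystems, so a reader sees either the old file or the complete
new one.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to "<final_path>.tmp", flush them to
 * disk, then rename the temp file over final_path.  A crash before the
 * rename leaves any previous final_path untouched.
 * Returns 0 on success, -1 on failure with errno set.
 * Note: for full durability the containing directory should be fsync'ed
 * as well; that step is omitted here to keep the sketch short.
 */
int
save_file_atomically(const char *final_path, const void *data, size_t len)
{
    char        tmp_path[1024];
    const char *p = data;
    size_t      remaining = len;
    int         fd;
    int         saved_errno;

    if (snprintf(tmp_path, sizeof(tmp_path), "%s.tmp", final_path)
        >= (int) sizeof(tmp_path))
    {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* write() may be partial or interrupted, so loop until all is written */
    while (remaining > 0)
    {
        ssize_t     n = write(fd, p, remaining);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        remaining -= (size_t) n;
    }

    /* make sure the contents are on disk before the file becomes visible */
    if (fsync(fd) != 0)
        goto fail;
    if (close(fd) != 0)
    {
        fd = -1;
        goto fail;
    }
    fd = -1;

    if (rename(tmp_path, final_path) != 0)
        goto fail;
    return 0;

fail:
    saved_errno = errno;
    if (fd >= 0)
        close(fd);
    unlink(tmp_path);           /* never leave a half-written temp file */
    errno = saved_errno;
    return -1;
}
```

With something like this, the Reader process would still validate what it
reads, but it would no longer have to cope with a file truncated by a
crash in the middle of saving.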

> > IIUC on shutdown request, postmaster will send signal to BG Saver
> > and BG Saver will save the buffers and then postmaster will send
> > signal to checkpointer to shutdown.  So before writing Checkpoint
> > record, BG Saver can crash (it might have saved half the buffers)
>
> Case handled as described above.
>
> > or maybe BG Saver saves buffers, but checkpointer crashes (due to
> > power outage or any such thing).
>
> Checkpointer process' crash seems to be irrelevant to Postgres
> Hibernator's  workings.

Yes, but if it crashes before writing the checkpoint record, it will lead
to recovery, which is the case we are considering.

> I think you are trying to argue the wording in my claim "save-files
> are created only on clean shutdowns; not on a crash or immediate
> shutdown", by implying that a crash may occur at any time during and
> after the BufferSaver processing. I agree the wording can be improved.

Not only the wording; in your mail above, Case 2 and Case 1b would need
to load buffers and perform recovery as well, so we need to decide which
one to give preference.

So if you agree that we should take recovery data into consideration
along with the saved-file data, then I think we have the options below
to consider:

1. Have a provision for the user to specify which data (recovery or
previously cached blocks) should be considered more important,
and then load buffers before or after recovery based on that
input.

2. Always load buffers before recovery, and mention in the docs, along
with the advantages of the extension, that users can expect servers to
take more time to start when it is enabled.

3. Always load buffers after recovery, and mention in the docs that
enabling this extension might discard data cached by recovery or by the
first few operations done by the user.

4. Have the BufMgr module expose an API such that the Buffer loader
will only use buffers on the freelist to load saved blocks.
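
As a rough illustration of option 4, here is a toy model in C. It is not
PostgreSQL's BufMgr: ToyBufferPool, toy_load_if_free and
toy_restore_saved_blocks are hypothetical names, and the freelist is just a
stack of slot indexes. The only point is the behaviour: the loader takes
buffers from the freelist and stops as soon as it is empty, so it never
evicts anything that recovery (or a user) has already cached.

```c
#define NBUFFERS        8
#define INVALID_BLOCK   (-1)

/* Toy buffer pool; every name here is made up for illustration. */
typedef struct ToyBufferPool
{
    int         block[NBUFFERS];    /* cached block number per slot */
    int         freelist[NBUFFERS]; /* stack of free slot indexes */
    int         nfree;              /* number of entries in freelist */
} ToyBufferPool;

void
toy_pool_init(ToyBufferPool *pool)
{
    int         i;

    for (i = 0; i < NBUFFERS; i++)
    {
        pool->block[i] = INVALID_BLOCK;
        pool->freelist[i] = i;
    }
    pool->nfree = NBUFFERS;
}

/*
 * Hypothetical "freelist only" API: put blockno into a free slot and
 * return the slot index, or return -1 if no free slot is left.
 * It never evicts an occupied slot.
 */
int
toy_load_if_free(ToyBufferPool *pool, int blockno)
{
    int         slot;

    if (pool->nfree == 0)
        return -1;
    slot = pool->freelist[--pool->nfree];
    pool->block[slot] = blockno;
    return slot;
}

/*
 * Buffer loader: restore saved blocks until the freelist runs out.
 * Returns the number of blocks actually loaded.
 */
int
toy_restore_saved_blocks(ToyBufferPool *pool, const int *saved, int nsaved)
{
    int         loaded = 0;
    int         i;

    for (i = 0; i < nsaved; i++)
    {
        if (toy_load_if_free(pool, saved[i]) < 0)
            break;              /* pool is full; leave the rest alone */
        loaded++;
    }
    return loaded;
}
```

In this model, if recovery has already filled three of the eight slots,
restoring a ten-block save file loads only five blocks and leaves the
three recovery blocks in place.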

Based on the opinions of others, I think we can decide on one of these,
or on some other better way if there is one.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
