Re: PITR, checkpoint, and local relations - Mailing list pgsql-hackers

From Tom Lane
Subject Re: PITR, checkpoint, and local relations
Date
Msg-id 14296.1028315110@sss.pgh.pa.us
In response to Re: PITR, checkpoint, and local relations  ("J. R. Nield" <jrnield@usol.com>)
List pgsql-hackers
"J. R. Nield" <jrnield@usol.com> writes:
> What would happen if a transaction with a local relation commits during
> backup, and there are log entries inserting the catalog tuples into
> pg_class. Should I not apply those on restore? How do I know?

This is certainly a non-problem.  You see a WAL log entry, you apply it.
Whether the transaction actually commits later is not your concern (at
least not at that point).

> This problem is subtle, and I'm maybe having difficulty explaining it
> properly. Do you understand the issue I'm raising? Have I made some kind
> of blunder, so that this is really not a problem? 

After thinking more, I think you are right, but you didn't explain it
well.  The problem is not really relevant to PITR at all, but is a hole
in the initial design of WAL.  Consider
        transaction starts
        transaction creates local rel
        transaction writes in local rel...
                                                CHECKPOINT
        transaction writes in local rel...
                                                CHECKPOINT
        transaction writes in local rel...
        transaction flushes local rel pages to disk
        transaction commits
                                                system crash
 

We'll try to replay the log from the latest checkpoint.  This works only
if all the local-rel page flushes actually made it to disk, otherwise
the updates of the local rel that happened before the last checkpoint
may be lost.  (I think there is still an fsync in local-rel commit to
ensure the flushes happen, but it's sure messy to do it that way.)
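
The sequence above can be sketched as a toy model. This is purely illustrative and is not PostgreSQL code: the names (simulate, write_local, redo_start, flush_is_durable) are hypothetical, pages are reduced to single strings, and the only thing modeled is that a checkpoint moves the replay start point without writing out backend-local buffers.

```python
def simulate(flush_is_durable):
    """Toy model: replay after a crash starts at the latest checkpoint."""
    disk = {}        # page_no -> contents that survive a crash
    local_buf = {}   # backend-private buffers; CHECKPOINT never sees them
    wal = []         # ordered (page_no, contents) records
    redo_start = 0   # index into wal where crash recovery begins

    def write_local(page_no, contents):
        # Every change is logged, but the page itself stays in local_buf.
        wal.append((page_no, contents))
        local_buf[page_no] = contents

    def checkpoint():
        nonlocal redo_start
        # Assumption: a checkpoint only covers shared buffers, so local
        # pages are not written; the replay start point still advances.
        redo_start = len(wal)

    write_local(0, "a")
    checkpoint()
    write_local(1, "b")
    checkpoint()
    write_local(2, "c")

    # Commit-time flush of the local rel. If it does not reach disk
    # (no fsync), the pages exist only in memory at crash time.
    if flush_is_durable:
        disk.update(local_buf)

    # Transaction commits, then the system crashes: memory is gone.
    local_buf.clear()

    # Recovery: replay only the records after the latest checkpoint.
    for page_no, contents in wal[redo_start:]:
        disk[page_no] = contents
    return disk


print(simulate(flush_is_durable=True))
print(simulate(flush_is_durable=False))
```

With a durable flush all three pages are on disk after recovery. Without it, only the page written after the last checkpoint comes back, because the earlier records lie before the replay start point.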

We could possibly fix this by logging the local-rel-flush page writes
themselves in the WAL log, but that'd probably more than ruin the
efficiency advantage of the local bufmgr.  So I'm back to the idea
that removing it is the way to go.  Certainly that would provide
nontrivial simplifications in a number of places (no tests on local vs
global buffer anymore, no special cases for local rel commit, etc).
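
For contrast, a minimal sketch of the same workload when every page goes through one shared buffer pool. Again this is a toy model under stated assumptions, not PostgreSQL code: simulate_shared_only and its variables are hypothetical, the WAL is assumed to be durable at commit, and a checkpoint is assumed to write out every dirty shared page before advancing the replay start point.

```python
def simulate_shared_only():
    """Toy model: new-relation pages live in the shared buffer pool."""
    disk = {}         # page_no -> contents that survive a crash
    shared_buf = {}   # shared buffers, visible to CHECKPOINT
    dirty = set()     # page numbers modified since the last checkpoint
    wal = []          # ordered (page_no, contents) records
    redo_start = 0    # index into wal where crash recovery begins

    def write(page_no, contents):
        wal.append((page_no, contents))
        shared_buf[page_no] = contents
        dirty.add(page_no)

    def checkpoint():
        nonlocal redo_start
        # Assumption: the checkpoint writes all dirty shared pages
        # before it moves the replay start point forward.
        for page_no in sorted(dirty):
            disk[page_no] = shared_buf[page_no]
        dirty.clear()
        redo_start = len(wal)

    write(0, "a")
    checkpoint()
    write(1, "b")
    checkpoint()
    write(2, "c")

    # Commit needs no special relation flush here; then the crash
    # wipes out everything held in memory.
    shared_buf.clear()
    dirty.clear()

    # Recovery: replay the records after the latest checkpoint.
    for page_no, contents in wal[redo_start:]:
        disk[page_no] = contents
    return disk


print(simulate_shared_only())
```

In this model no commit-time flush is needed for correctness: pages written before a checkpoint were flushed by it, and the rest are rebuilt from the log. What the model does not capture is the cost side, which is the open question.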

Might be useful to temporarily dike it out and see what the penalty
for building a large index is.
        regards, tom lane

