Re: Extra XLOG in Checkpoint for StandbySnapshot - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Extra XLOG in Checkpoint for StandbySnapshot
Date
Msg-id 00b801cdee6f$a488c620$ed9a5260$@kapila@huawei.com
Whole thread Raw
In response to Re: Extra XLOG in Checkpoint for StandbySnapshot  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Extra XLOG in Checkpoint for StandbySnapshot
List pgsql-hackers
On Wednesday, January 09, 2013 5:49 PM Andres Freund wrote:
> On 2013-01-09 15:06:04 +0530, Amit Kapila wrote:
> > On Wednesday, January 09, 2013 2:28 PM Andres Freund wrote:
> > > On 2013-01-09 14:04:32 +0530, Amit Kapila wrote:
> > > > On Tuesday, January 08, 2013 8:57 PM Andres Freund wrote:
> > > > > On 2013-01-08 20:33:28 +0530, Amit Kapila wrote:
> > > > > > On Tuesday, January 08, 2013 8:01 PM Andres Freund wrote:
> > > > > > > On 2013-01-08 19:51:39 +0530, Amit Kapila wrote:
> > > > > > > > On Monday, January 07, 2013 7:15 PM Andres Freund wrote:
> > > > > > > > > On 2013-01-07 19:03:35 +0530, Amit Kapila wrote:
> > > > > > > > > > On Monday, January 07, 2013 6:30 PM Simon Riggs
> wrote:
> > > > > > > > > > > On 7 January 2013 12:39, Amit Kapila
> > > > > <amit.kapila@huawei.com>
> > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The information that no transactions are currently
> running
> > > > > allows
> > > > > > > you
> > > > > > > > > to
> > > > > > > > > build a recovery snapshot, without that information the
> > > standby
> > > > > > > won't
> > > > > > > > > start answering queries. Now that doesn't matter if all
> > > > > standbys
> > > > > > > > > already
> > > > > > > > > have built a snapshot, but the primary cannot know
> that.
> > > > > > > >
> > > > > > > > Can't we make sure that checkpoint operation doesn't
> happen
> > > for
> > > > > below
> > > > > > > conds.
> > > > > > > > a. nothing has happened during or after last checkpoint
> > > > > > > > OR
> > > > > > > > b. nothing except snapshotstanby WAL has happened
> > > > > > > >
> > > > > > > > Currently it is done for point a.
> > > > > > > >
> > > > > > > > > Having to issue a checkpoint while ensuring
> transactions
> > > are
> > > > > > > running
> > > > > > > > > just to get a standby up doesn't seem like a good idea
> to
> > > me :)
> > > > > > > >
> > > > > > > > Simon:
> > > > > > > > > If you make the correct test, I'd be more inclined to
> > > accept
> > > > > the
> > > > > > > premise.
> > > > > > > >
> > > > > > > > Not sure, what exact you are expecting from test?
> > > > > > > > The test is do any one operation on system and then keep
> the
> > > > > system
> > > > > > > idle.
> > > > > > > > Now at each checkpoint interval, it logs WAL for
> > > SnapshotStandby.
> > > > > > >
> > > > > > > I can't really follow what you want to do here. The
> snapshot is
> > > > > only
> > > > > > > logged if a checkpoint is performed anyway?  As recovery
> starts
> > > at
> > > > > (the
> > > > > > > logical) checkpoint's location we need to log a snapshot
> > > exactly
> > > > > > > there. If you want to avoid activity when the system is
> idle
> > > you
> > > > > need
> > > > > > > to
> > > > > > > prevent checkpoints from occurring itself.
> > > > > >
> > > > > > Even if the checkpoint is scheduled, it doesn't perform
> actual
> > > > > operation if
> > > > > > there's nothing logged between
> > > > > > current and previous checkpoint due to below check in
> > > > > CreateCheckPoint()
> > > > > > function.
> > > > > > if (curInsert == ControlFile->checkPoint +
> > > > > >                         MAXALIGN(SizeOfXLogRecord +
> > > > > sizeof(CheckPoint)) &&
> > > > > >                         ControlFile->checkPoint ==
> > > > > > ControlFile->checkPointCopy.redo)
> > > > > >
> > > > > > But if we set the wal_level as hot_standby, it will log
> snapshot,
> > > now
> > > > > next
> > > > > > time again when function CreateCheckPoint()
> > > > > > will get called due to scheduled checkpoint, the above check
> will
> > > > > fail and
> > > > > > it will again log snapshot, so this will continue, even if
> the
> > > system
> > > > > is
> > > > > > totally idle.
> > > > > > I understand that it doesn't cause any problem, but I think
> it is
> > > > > better if
> > > > > > the repeated log of snapshot in this scenario can be avoided.
> > > > >
> > > > > ISTM in that case you "just" need a way to cope with the
> > > additionally
> > > > > logged record in the above piece of code. Not logging seems to
> be
> > > the
> > > > > entirely wrong way to go at this.
> > > >
> > > > I think one of the ways code can be modified is as below:
> > > >
> > > > +            /*size of running transactions log when there is
> no
> > > > active transation*/
> > > > +                if (!shutdown && XLogStandbyInfoActive())
> > > > +                {
> > > > +                        runningXactXLog =
> > > > MAXALIGN(MinSizeOfXactRunningXacts) + SizeOfXLogRecord;
> > > > +                }
> > > >
> > > > !                if (curInsert == ControlFile->checkPoint +
> > > > !                        MAXALIGN(SizeOfXLogRecord +
> > > sizeof(CheckPoint)) &&
> > > > !                        ControlFile->checkPoint ==
> > > > ControlFile->checkPointCopy.redo)
> > > >
> > > > !                if (curInsert == ControlFile->checkPoint +
> > > > !                        MAXALIGN(SizeOfXLogRecord +
> > > sizeof(CheckPoint)) &&
> > > > !                        ControlFile->checkPoint ==
> > > > ControlFile->checkPointCopy.redo + runningXactXLog)
> > > >
> > > > Second condition is checking the last checkpoint WAL position
> with
> > > the
> > > > current one.
> > > > Since  ControlFile->checkPointCopy.redo holds the value before
> > > "running
> > > > Xact" WAL was inserted
> > > > and ControlFile->checkPoint holds the value after "running Xact"
> WAL
> > > got
> > > > inserted, so if no new WAL was inserted apart from "running
> Xacts"
> > > and
> > > > "Checkpoint" WAL, then this condition will be true.
> > > >
> > >
> > > I don't think thats safe, there could have been another record
> inserted
> > > that happens to be MinSizeOfXactRunningXacts big and we would still
> > > skip the checkpoint.
> >
> > I think such can happen only for when first time checkpoint is
> triggered,
> > and even then the first part of the check (curInsert ==
> > ControlFile->checkPoint + MAXALIGN(SizeOfXLogRecord +
> sizeof(CheckPoint))
> > will fail.
> >
> > Value to runningXactXLog will be assigned only if wal_level is
> hot_stanby.
> > In that case if checkpoint is getting scheduled for 2nd or
> consecutive time,
> > it will include WAL for "running Xact" along with WAL for any other
> data.
> > So now even if the other data is of size MinSizeOfXactRunningXacts,
> the
> > check should fail and skip the checkpoint.
> 
> Hm. The locking around checkpoints probably prevents the case I was
> worried about in combination with the wal_level not changing while
> running.

In that case, can we consider this as a patch for Commit.
If there is no objection, shall I prepare bug-fix patch for the scenario
reported.

With Regards,
Amit Kapila.




pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [PATCH 1/2] Provide a common malloc wrappers and palloc et al. emulation for frontend'ish environs
Next
From: Andres Freund
Date:
Subject: Re: [PATCH 1/2] Provide a common malloc wrappers and palloc et al. emulation for frontend'ish environs