Re: Extra XLOG in Checkpoint for StandbySnapshot - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Extra XLOG in Checkpoint for StandbySnapshot |
Date | |
Msg-id | 00b801cdee6f$a488c620$ed9a5260$@kapila@huawei.com Whole thread Raw |
In response to | Re: Extra XLOG in Checkpoint for StandbySnapshot (Andres Freund <andres@2ndquadrant.com>) |
Responses |
Re: Extra XLOG in Checkpoint for StandbySnapshot
|
List | pgsql-hackers |
On Wednesday, January 09, 2013 5:49 PM Andres Freund wrote: > On 2013-01-09 15:06:04 +0530, Amit Kapila wrote: > > On Wednesday, January 09, 2013 2:28 PM Andres Freund wrote: > > > On 2013-01-09 14:04:32 +0530, Amit Kapila wrote: > > > > On Tuesday, January 08, 2013 8:57 PM Andres Freund wrote: > > > > > On 2013-01-08 20:33:28 +0530, Amit Kapila wrote: > > > > > > On Tuesday, January 08, 2013 8:01 PM Andres Freund wrote: > > > > > > > On 2013-01-08 19:51:39 +0530, Amit Kapila wrote: > > > > > > > > On Monday, January 07, 2013 7:15 PM Andres Freund wrote: > > > > > > > > > On 2013-01-07 19:03:35 +0530, Amit Kapila wrote: > > > > > > > > > > On Monday, January 07, 2013 6:30 PM Simon Riggs > wrote: > > > > > > > > > > > On 7 January 2013 12:39, Amit Kapila > > > > > <amit.kapila@huawei.com> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The information that no transactions are currently > running > > > > > allows > > > > > > > you > > > > > > > > > to > > > > > > > > > build a recovery snapshot, without that information the > > > standby > > > > > > > won't > > > > > > > > > start answering queries. Now that doesn't matter if all > > > > > standbys > > > > > > > > > already > > > > > > > > > have built a snapshot, but the primary cannot know > that. > > > > > > > > > > > > > > > > Can't we make sure that checkpoint operation doesn't > happen > > > for > > > > > below > > > > > > > conds. > > > > > > > > a. nothing has happened during or after last checkpoint > > > > > > > > OR > > > > > > > > b. nothing except snapshotstanby WAL has happened > > > > > > > > > > > > > > > > Currently it is done for point a. > > > > > > > > > > > > > > > > > Having to issue a checkpoint while ensuring > transactions > > > are > > > > > > > running > > > > > > > > > just to get a standby up doesn't seem like a good idea > to > > > me :) > > > > > > > > > > > > > > > > Simon: > > > > > > > > > If you make the correct test, I'd be more inclined to > > > accept > > > > > the > > > > > > > premise. > > > > > > > > > > > > > > > > Not sure, what exact you are expecting from test? > > > > > > > > The test is do any one operation on system and then keep > the > > > > > system > > > > > > > idle. > > > > > > > > Now at each checkpoint interval, it logs WAL for > > > SnapshotStandby. > > > > > > > > > > > > > > I can't really follow what you want to do here. The > snapshot is > > > > > only > > > > > > > logged if a checkpoint is performed anyway? As recovery > starts > > > at > > > > > (the > > > > > > > logical) checkpoint's location we need to log a snapshot > > > exactly > > > > > > > there. If you want to avoid activity when the system is > idle > > > you > > > > > need > > > > > > > to > > > > > > > prevent checkpoints from occurring itself. > > > > > > > > > > > > Even if the checkpoint is scheduled, it doesn't perform > actual > > > > > operation if > > > > > > there's nothing logged between > > > > > > current and previous checkpoint due to below check in > > > > > CreateCheckPoint() > > > > > > function. > > > > > > if (curInsert == ControlFile->checkPoint + > > > > > > MAXALIGN(SizeOfXLogRecord + > > > > > sizeof(CheckPoint)) && > > > > > > ControlFile->checkPoint == > > > > > > ControlFile->checkPointCopy.redo) > > > > > > > > > > > > But if we set the wal_level as hot_standby, it will log > snapshot, > > > now > > > > > next > > > > > > time again when function CreateCheckPoint() > > > > > > will get called due to scheduled checkpoint, the above check > will > > > > > fail and > > > > > > it will again log snapshot, so this will continue, even if > the > > > system > > > > > is > > > > > > totally idle. > > > > > > I understand that it doesn't cause any problem, but I think > it is > > > > > better if > > > > > > the repeated log of snapshot in this scenario can be avoided. > > > > > > > > > > ISTM in that case you "just" need a way to cope with the > > > additionally > > > > > logged record in the above piece of code. Not logging seems to > be > > > the > > > > > entirely wrong way to go at this. > > > > > > > > I think one of the ways code can be modified is as below: > > > > > > > > + /*size of running transactions log when there is > no > > > > active transation*/ > > > > + if (!shutdown && XLogStandbyInfoActive()) > > > > + { > > > > + runningXactXLog = > > > > MAXALIGN(MinSizeOfXactRunningXacts) + SizeOfXLogRecord; > > > > + } > > > > > > > > ! if (curInsert == ControlFile->checkPoint + > > > > ! MAXALIGN(SizeOfXLogRecord + > > > sizeof(CheckPoint)) && > > > > ! ControlFile->checkPoint == > > > > ControlFile->checkPointCopy.redo) > > > > > > > > ! if (curInsert == ControlFile->checkPoint + > > > > ! MAXALIGN(SizeOfXLogRecord + > > > sizeof(CheckPoint)) && > > > > ! ControlFile->checkPoint == > > > > ControlFile->checkPointCopy.redo + runningXactXLog) > > > > > > > > Second condition is checking the last checkpoint WAL position > with > > > the > > > > current one. > > > > Since ControlFile->checkPointCopy.redo holds the value before > > > "running > > > > Xact" WAL was inserted > > > > and ControlFile->checkPoint holds the value after "running Xact" > WAL > > > got > > > > inserted, so if no new WAL was inserted apart from "running > Xacts" > > > and > > > > "Checkpoint" WAL, then this condition will be true. > > > > > > > > > > I don't think thats safe, there could have been another record > inserted > > > that happens to be MinSizeOfXactRunningXacts big and we would still > > > skip the checkpoint. > > > > I think such can happen only for when first time checkpoint is > triggered, > > and even then the first part of the check (curInsert == > > ControlFile->checkPoint + MAXALIGN(SizeOfXLogRecord + > sizeof(CheckPoint)) > > will fail. > > > > Value to runningXactXLog will be assigned only if wal_level is > hot_stanby. > > In that case if checkpoint is getting scheduled for 2nd or > consecutive time, > > it will include WAL for "running Xact" along with WAL for any other > data. > > So now even if the other data is of size MinSizeOfXactRunningXacts, > the > > check should fail and skip the checkpoint. > > Hm. The locking around checkpoints probably prevents the case I was > worried about in combination with the wal_level not changing while > running. In that case, can we consider this as a patch for Commit. If there is no objection, shall I prepare bug-fix patch for the scenario reported. With Regards, Amit Kapila.
pgsql-hackers by date: