Thread: Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Simon Riggs
Date:
On 5 April 2016 at 01:18, Michael Paquier <michael.paquier@gmail.com> wrote:
On Mon, Apr 4, 2016 at 4:50 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-04-04 08:44:47 +0100, Simon Riggs wrote:
>> That patch does exactly the same thing as the patch you prefer, just
>> does it differently;
>
> No, it doesn't; as explained above.

I think these few changes are all we need. (attached)
 

FWIW, I vote also for reverting this patch. This has been committed
without any real discussions..

Michael, its a shame to hear you say that, so let me give full context.

The patches under review in the CF are too invasive and not worth the trouble for such a minor problem. After full review, I would simply reject those patches (already discussed on list).

Rather than take that option, I went to the trouble of writing a patch that does the same thing but simpler, less invasive and more maintainable. Primarily, I did that for you, to avoid you having wasted your time and to allow you to backpatch a solution.

We can, if you wish, revert this patch. If we do, we will have nothing, since I object to the other patch(es).

My recommendation is that we apply the attached patch and leave it there.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment

Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Andres Freund
Date:
On 2016-04-06 09:18:54 +0100, Simon Riggs wrote:
> Rather than take that option, I went to the trouble of writing a patch that
> does the same thing but simpler, less invasive and more maintainable.
> Primarily, I did that for you, to avoid you having wasted your time and to
> allow you to backpatch a solution.

But it doesn't. It doesn't solve the longstanding problem of checkpoints
needlessly being repeated due to standby snapshots. It doesn't fix the
issue for for wal_level=logical. We now log more WAL with
XLogArchiveTimeout > 0 than without.

The other was an architectural fix, this is a selectively applied
bandaid.

Andres


Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Simon Riggs
Date:
On 6 April 2016 at 09:45, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-06 09:18:54 +0100, Simon Riggs wrote:
> Rather than take that option, I went to the trouble of writing a patch that
> does the same thing but simpler, less invasive and more maintainable.
> Primarily, I did that for you, to avoid you having wasted your time and to
> allow you to backpatch a solution.

But it doesn't. It doesn't solve the longstanding problem of checkpoints
needlessly being repeated due to standby snapshots.

<sigh> I can't see why you say this. I am willing to listen, but this appears to be wrong.
 
It doesn't fix the
issue for for wal_level=logical.

What issue is that? Previously you said it must not skip it at all for logical.
 
We now log more WAL with
XLogArchiveTimeout > 0 than without.

And the problem with that is what?
 
The other was an architectural fix, this is a selectively applied
bandaid.

It was an attempt at an architectural fix, which went wrong by being too much code and too invasive for such a minor issue.

I'm not much concerned with what emotive language you choose to support your arguments, but I am concerned about clear, maintainable code and I object to the previous patch.

There are other problems worthy of our attention and I will attend to those now.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Andres Freund
Date:
On 2016-04-06 10:04:42 +0100, Simon Riggs wrote:
> On 6 April 2016 at 09:45, Andres Freund <andres@anarazel.de> wrote:
>
> > On 2016-04-06 09:18:54 +0100, Simon Riggs wrote:
> > > Rather than take that option, I went to the trouble of writing a patch
> > that
> > > does the same thing but simpler, less invasive and more maintainable.
> > > Primarily, I did that for you, to avoid you having wasted your time and
> > to
> > > allow you to backpatch a solution.
> >
> > But it doesn't. It doesn't solve the longstanding problem of checkpoints
> > needlessly being repeated due to standby snapshots.

> <sigh> I can't see why you say this. I am willing to listen, but this
> appears to be wrong.

The issue there is that we continue to issue checkpoints if the only
activity since the last checkpoint was emitting a standby
snapshot. That's because:
    /*
     * If this isn't a shutdown or forced checkpoint, and we have not inserted
     * any XLOG records since the start of the last checkpoint, skip the
     * checkpoint.  The idea here is to avoid inserting duplicate checkpoints
     * when the system is idle. That wastes log space, and more importantly it
     * exposes us to possible loss of both current and previous checkpoint
     * records if the machine crashes just as we're writing the update.
     * (Perhaps it'd make even more sense to checkpoint only when the previous
     * checkpoint record is in a different xlog page?)
     *
     * If the previous checkpoint crossed a WAL segment, however, we create
     * the checkpoint anyway, to have the latest checkpoint fully contained in
     * the new segment. This is for a little bit of extra robustness: it's
     * better if you don't need to keep two WAL segments around to recover the
     * checkpoint.
     */
    if ((flags & (CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_END_OF_RECOVERY |
                  CHECKPOINT_FORCE)) == 0)
    {
        if (prevPtr == ControlFile->checkPointCopy.redo &&
            prevPtr / XLOG_SEG_SIZE == curInsert / XLOG_SEG_SIZE)
        {
            WALInsertLockRelease();
            LWLockRelease(CheckpointLock);
            END_CRIT_SECTION();
            return;
        }
    }
doesn't trigger anymore.

The proposed patch allows to fix that in a more principled manner,
because we can simply check that no "important" records have been
emitted since the last checkpoint, and skip if that's the case.


> What issue is that? Previously you said it must not skip it at all for
> logical.

It's fine to skip the records iff nothing important has happened since
the last time a snapshot has been logged. Again, the proposed approach
allowed to detect that.


> > We now log more WAL with
> > XLogArchiveTimeout > 0 than without.

> And the problem with that is what?

That an idle system unnecessarily produces WAL? Waking up disks and
everything?


> I'm not much concerned with what emotive language you choose to support
> your arguments

Err.  You're side-tracking the discussion.

Greetings,

Andres Freund


Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Robert Haas
Date:
On Wed, Apr 6, 2016 at 4:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> FWIW, I vote also for reverting this patch. This has been committed
>> without any real discussions..
>
> Michael, its a shame to hear you say that, so let me give full context.
>
> The patches under review in the CF are too invasive and not worth the
> trouble for such a minor problem. After full review, I would simply reject
> those patches (already discussed on list).
>
> Rather than take that option, I went to the trouble of writing a patch that
> does the same thing but simpler, less invasive and more maintainable.
> Primarily, I did that for you, to avoid you having wasted your time and to
> allow you to backpatch a solution.
>
> We can, if you wish, revert this patch. If we do, we will have nothing,
> since I object to the other patch(es).

I don't think you have an absolute veto over other patches, though you
certainly have the right to object, and you certainly don't have to
commit them yourself.  But even more than that, the fact that you
don't like those other patches does not mean that you can commit
something without discussion.  Even if every argument you are making
here is correct, which I'm not sure about, other people obviously
don't think so.  That stuff should be worked out, as far as possible,
in pre-commit review, which is only possible when you post the patch
before committing it.  I think it is fine to commit things
occasionally without posting them ahead of time if they are obviously
uncontroversial, but that isn't the case here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Simon Riggs
Date:
On 6 April 2016 at 12:24, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Apr 6, 2016 at 4:18 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> FWIW, I vote also for reverting this patch. This has been committed
>> without any real discussions..
>
> Michael, its a shame to hear you say that, so let me give full context.
>
> The patches under review in the CF are too invasive and not worth the
> trouble for such a minor problem. After full review, I would simply reject
> those patches (already discussed on list).
>
> Rather than take that option, I went to the trouble of writing a patch that
> does the same thing but simpler, less invasive and more maintainable.
> Primarily, I did that for you, to avoid you having wasted your time and to
> allow you to backpatch a solution.
>
> We can, if you wish, revert this patch. If we do, we will have nothing,
> since I object to the other patch(es).

I don't think you have an absolute veto over other patches

Huh? My understanding is I have the same powers as other committers, no more but also, no less. If you've seen me claim otherwise, please point where that happened.

Me saying "I object" seems to attract more attention than others for some reason. Why is it a discussion point that I object to a patch, whereas if you do it, thats fine?

, though you
certainly have the right to object, and you certainly don't have to
commit them yourself.  But even more than that, the fact that you
don't like those other patches does not mean that you can commit
something without discussion.  Even if every argument you are making
here is correct, which I'm not sure about, other people obviously
don't think so.  That stuff should be worked out, as far as possible,
in pre-commit review, which is only possible when you post the patch
before committing it.  I think it is fine to commit things
occasionally without posting them ahead of time if they are obviously
uncontroversial, but that isn't the case here.

All very strange. People commit changes they didn't post all the time, especially on minor bugs such as this.

Obviously if I knew that it would be controversial I would not have done it. We are discussing it now, so I don't see any problem.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Simon Riggs
Date:
On 6 April 2016 at 10:09, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-06 10:04:42 +0100, Simon Riggs wrote:
> On 6 April 2016 at 09:45, Andres Freund <andres@anarazel.de> wrote:
>
> > On 2016-04-06 09:18:54 +0100, Simon Riggs wrote:
> > > Rather than take that option, I went to the trouble of writing a patch
> > that
> > > does the same thing but simpler, less invasive and more maintainable.
> > > Primarily, I did that for you, to avoid you having wasted your time and
> > to
> > > allow you to backpatch a solution.
> >
> > But it doesn't. It doesn't solve the longstanding problem of checkpoints
> > needlessly being repeated due to standby snapshots.

> <sigh> I can't see why you say this. I am willing to listen, but this
> appears to be wrong.

The issue there is that we continue to issue checkpoints if the only
activity since the last checkpoint was emitting a standby
snapshot. That's because:

I agree this is the current situation in 9.4 and 9.5, hence the bug report.

This no longer occurs with the patches I have proposed. The snapshot is skipped, so a further checkpoint never triggers.
 
The proposed patch allows to fix that in a more principled manner,
because we can simply check that no "important" records have been
emitted since the last checkpoint, and skip if that's the case.

I understand the proposal you have made. The patch to implement it is what I object to; my comments made last Sunday.
 
> What issue is that? Previously you said it must not skip it at all for
> logical.

It's fine to skip the records iff nothing important has happened since
the last time a snapshot has been logged. Again, the proposed approach
allowed to detect that.

Agreed, both proposals do that.
 
> > We now log more WAL with
> > XLogArchiveTimeout > 0 than without.

> And the problem with that is what?

That an idle system unnecessarily produces WAL? Waking up disks and
everything?

The OP discussed a problem with archive_timeout > 0. Are you saying we should put in a fix that applies whatever the setting of archive_timeout?

Are we concerned that a master sends a small amount of data every few minutes to a standby it is connected to? I hadn't read that in the thread.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Robert Haas
Date:
On Wed, Apr 6, 2016 at 8:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > We can, if you wish, revert this patch. If we do, we will have nothing,
>> > since I object to the other patch(es).
>>
>> I don't think you have an absolute veto over other patches
>
> Huh? My understanding is I have the same powers as other committers, no more
> but also, no less. If you've seen me claim otherwise, please point where
> that happened.

Uh, that would be in the portion that is still quoted.  "If we do, we
will have nothing, since I object to the other patches."

> Me saying "I object" seems to attract more attention than others for some
> reason. Why is it a discussion point that I object to a patch, whereas if
> you do it, thats fine?

You have every right to object to the patch.  You don't have a right,
nor do I, to say that it won't be committed without your agreement.

> All very strange. People commit changes they didn't post all the time,
> especially on minor bugs such as this.

No, they really don't.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Andres Freund
Date:
On 2016-04-06 13:11:40 +0100, Simon Riggs wrote:
> On 6 April 2016 at 10:09, Andres Freund <andres@anarazel.de> wrote:
> > On 2016-04-06 10:04:42 +0100, Simon Riggs wrote:
> > The issue there is that we continue to issue checkpoints if the only
> > activity since the last checkpoint was emitting a standby
> > snapshot. That's because:
> >
>
> I agree this is the current situation in 9.4 and 9.5, hence the bug report.
>
> This no longer occurs with the patches I have proposed. The snapshot is
> skipped, so a further checkpoint never triggers.

Not if there's a longrunning/idle transaction.

Note that skipping the snapshot is actually a *problem* in some
cases. As I've brought up upthread, to which you never replied. A
xl_running_xacts->xcnt == 0/!overflowed snapshot can be very important
for hot standby, because it allows ProcArrayApplyRecoveryInfo() to
switch to INITIALIZED state:
    if (standbyState == STANDBY_SNAPSHOT_PENDING)
    {
        /*
         * If the snapshot isn't overflowed or if its empty we can reset our
         * pending state and use this snapshot instead.
         */
        if (!running->subxid_overflow || running->xcnt == 0)
        {
            /*
             * If we have already collected known assigned xids, we need to
             * throw them away before we apply the recovery snapshot.
             */
            KnownAssignedXidsReset();
            standbyState = STANDBY_INITIALIZED;
        }



> > > What issue is that? Previously you said it must not skip it at all for
> > > logical.
> >
> > It's fine to skip the records iff nothing important has happened since
> > the last time a snapshot has been logged. Again, the proposed approach
> > allowed to detect that.

> Agreed, both proposals do that.

No, the currently committed patch, even including your proposed followup
patch, doesn't do that. As you just commented about as quoted above. The
code currently reads like:

    if (wal_level < WAL_LEVEL_LOGICAL)
    {
        LWLockRelease(ProcArrayLock);

        /*
         * Don't bother to log anything if nothing is happening, if we are
         * using archive_timeout > 0 and we didn't overflow snapshot last time.
         *
         * This ensures that we don't issue an empty WAL record, which can
         * be annoying when used in conjunction with archive timeout.
         */
        if (running->xcnt == 0 &&
            nlocks == 0 &&
            XLogArchiveTimeout > 0 &&
            !last_snapshot_overflowed)
        {
            LWLockRelease(XidGenLock);
            return InvalidXLogRecPtr;
        }

        last_snapshot_overflowed = running->subxid_overflow;
    }

This obviously doesn't apply to WAL_LEVEL_LOGICAL as is (the if). And it
also obviously repeats to log the same snapshot in a system where the
state hasn't changed, but where running->xcnt != 0 or nlocks != 0.

> > > > We now log more WAL with
> > > > XLogArchiveTimeout > 0 than without.
> >
> > > And the problem with that is what?
> >
> > That an idle system unnecessarily produces WAL? Waking up disks and
> > everything?
> >
>
> The OP discussed a problem with archive_timeout > 0. Are you saying we
> should put in a fix that applies whatever the setting of archive_timeout?

Yes. We should strive to fix the full scope of an issue; not just the
part complained about.


You seem to ignore valid points made against your approach, apparently
because you dismiss them as "emotive language". Stop.


Greetings,

Andres Freund


Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Michael Paquier
Date:
On Wed, Apr 6, 2016 at 9:27 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-04-06 13:11:40 +0100, Simon Riggs wrote:
>> On 6 April 2016 at 10:09, Andres Freund <andres@anarazel.de> wrote:
>> > On 2016-04-06 10:04:42 +0100, Simon Riggs wrote:
>> > The issue there is that we continue to issue checkpoints if the only
>> > activity since the last checkpoint was emitting a standby
>> > snapshot. That's because:
>> >
>>
>> I agree this is the current situation in 9.4 and 9.5, hence the bug report.
>>
>> This no longer occurs with the patches I have proposed. The snapshot is
>> skipped, so a further checkpoint never triggers.
>
> Not if there's a longrunning/idle transaction.
>
> Note that skipping the snapshot is actually a *problem* in some
> cases. As I've brought up upthread, to which you never replied. A
> xl_running_xacts->xcnt == 0/!overflowed snapshot can be very important
> for hot standby, because it allows ProcArrayApplyRecoveryInfo() to
> switch to INITIALIZED state:
>         if (standbyState == STANDBY_SNAPSHOT_PENDING)
>         {
>                 /*
>                  * If the snapshot isn't overflowed or if its empty we can reset our
>                  * pending state and use this snapshot instead.
>                  */
>                 if (!running->subxid_overflow || running->xcnt == 0)
>                 {
>                         /*
>                          * If we have already collected known assigned xids, we need to
>                          * throw them away before we apply the recovery snapshot.
>                          */
>                         KnownAssignedXidsReset();
>                         standbyState = STANDBY_INITIALIZED;
>                 }

Yes. Such snapshot allows to initialize more quickly a hot standby...
And that's useful in some cases if the recovery is on a pending
snapshot (bgwriter standby snapshots help a lot with that btw).

>> > > > We now log more WAL with
>> > > > XLogArchiveTimeout > 0 than without.
>> >
>> > > And the problem with that is what?
>> >
>> > That an idle system unnecessarily produces WAL? Waking up disks and
>> > everything?
>> >
>>
>> The OP discussed a problem with archive_timeout > 0. Are you saying we
>> should put in a fix that applies whatever the setting of archive_timeout?
>
> Yes. We should strive to fix the full scope of an issue; not just the
> part complained about.
>
> You seem to ignore valid points made against your approach, apparently
> because you dismiss them as "emotive language". Stop.

Yeah... We have reached a clear consensus about the way things should
be done after quite a lot of discussions that has gone for a couple of
months. And Andres' design on the matter is what is getting approval
from everybody who has worked toward addressing this issue while
really taking care of the problem at its root. The patch, as currently
proposed, is a bandaid, and has been committed at the surprise of
everybody without any discussion.

Please let's revert this patch and really move toward fixing this
problem... Which is a way in short a way to fix the decision-making
for checkpoint skipping.
--
Michael


Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Simon Riggs
Date:
On 6 April 2016 at 13:27, Andres Freund <andres@anarazel.de> wrote:
On 2016-04-06 13:11:40 +0100, Simon Riggs wrote:
> On 6 April 2016 at 10:09, Andres Freund <andres@anarazel.de> wrote:
> > On 2016-04-06 10:04:42 +0100, Simon Riggs wrote:
> > The issue there is that we continue to issue checkpoints if the only
> > activity since the last checkpoint was emitting a standby
> > snapshot. That's because:
> >
>
> I agree this is the current situation in 9.4 and 9.5, hence the bug report.
>
> This no longer occurs with the patches I have proposed. The snapshot is
> skipped, so a further checkpoint never triggers.

Not if there's a longrunning/idle transaction.

Note that skipping the snapshot is actually a *problem* in some
cases. As I've brought up upthread, to which you never replied. A
xl_running_xacts->xcnt == 0/!overflowed snapshot can be very important
for hot standby, because it allows ProcArrayApplyRecoveryInfo() to
switch to INITIALIZED state:

I replied by posting a patch to address your concern, how is that non-reply?
 
> > > What issue is that? Previously you said it must not skip it at all for
> > > logical.
> >
> > It's fine to skip the records iff nothing important has happened since
> > the last time a snapshot has been logged. Again, the proposed approach
> > allowed to detect that.

> Agreed, both proposals do that.

No, the currently committed patch, even including your proposed followup
patch, doesn't do that. As you just commented about as quoted above. The
code currently reads like:

        if (wal_level < WAL_LEVEL_LOGICAL)
        {
                LWLockRelease(ProcArrayLock);

                /*
                 * Don't bother to log anything if nothing is happening, if we are
                 * using archive_timeout > 0 and we didn't overflow snapshot last time.
                 *
                 * This ensures that we don't issue an empty WAL record, which can
                 * be annoying when used in conjunction with archive timeout.
                 */
                if (running->xcnt == 0 &&
                        nlocks == 0 &&
                        XLogArchiveTimeout > 0 &&
                        !last_snapshot_overflowed)
                {
                        LWLockRelease(XidGenLock);
                        return InvalidXLogRecPtr;
                }

                last_snapshot_overflowed = running->subxid_overflow;
        }

This obviously doesn't apply to WAL_LEVEL_LOGICAL as is (the if). And it
also obviously repeats to log the same snapshot in a system where the
state hasn't changed, but where running->xcnt != 0 or nlocks != 0.

My understanding from your previous comments was that it would be incorrect to do that.
 
> > > > We now log more WAL with
> > > > XLogArchiveTimeout > 0 than without.
> >
> > > And the problem with that is what?
> >
> > That an idle system unnecessarily produces WAL? Waking up disks and
> > everything?
> >
>
> The OP discussed a problem with archive_timeout > 0. Are you saying we
> should put in a fix that applies whatever the setting of archive_timeout?

Yes. We should strive to fix the full scope of an issue; not just the
part complained about.

I believe that's what I did. You didn't mention that point earlier, nor do I now think it important.
 
You seem to ignore valid points made against your approach, apparently
because you dismiss them as "emotive language". Stop.

Not true. I have listened to everything you've said and been patient with the high number of mistakes in your replies.

Calling something a "bandaid" is not a "valid point" against it.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Simon Riggs
Date:
On 6 April 2016 at 13:48, Michael Paquier <michael.paquier@gmail.com> wrote:

Yeah... We have reached a clear consensus about the way things should
be done after quite a lot of discussions that has gone for a couple of
months. And Andres' design on the matter is what is getting approval
from everybody who has worked toward addressing this issue while
really taking care of the problem at its root.

Clearly I am not included in your use of the words "we" and "everybody" !

What you mean is that you and Andres agree, so using the word consensus is not appropriate and certainly not clear. We did all agree on the point that the earlier fix cannot be backpatched, so if it is as grevious a problem as described many users will not now benefit.

I will revert my earlier patch now.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Andres Freund
Date:
On 2016-04-06 13:50:24 +0100, Simon Riggs wrote:
> On 6 April 2016 at 13:27, Andres Freund <andres@anarazel.de> wrote:
>
> > On 2016-04-06 13:11:40 +0100, Simon Riggs wrote:
> > > On 6 April 2016 at 10:09, Andres Freund <andres@anarazel.de> wrote:
> > > > On 2016-04-06 10:04:42 +0100, Simon Riggs wrote:
> > > > The issue there is that we continue to issue checkpoints if the only
> > > > activity since the last checkpoint was emitting a standby
> > > > snapshot. That's because:
> > > >
> > >
> > > I agree this is the current situation in 9.4 and 9.5, hence the bug
> > report.
> > >
> > > This no longer occurs with the patches I have proposed. The snapshot is
> > > skipped, so a further checkpoint never triggers.
> >
> > Not if there's a longrunning/idle transaction.
> >
> > Note that skipping the snapshot is actually a *problem* in some
> > cases. As I've brought up upthread, to which you never replied. A
> > xl_running_xacts->xcnt == 0/!overflowed snapshot can be very important
> > for hot standby, because it allows ProcArrayApplyRecoveryInfo() to
> > switch to INITIALIZED state:
> >
>
> I replied by posting a patch to address your concern, how is that non-reply?

It doesn't address the problem? It's irrelevant that the last snapshot
had 0 xacts, if you start recovery from a later check/restartpoint;
recovery won't process earlier running_xacts records.


> > This obviously doesn't apply to WAL_LEVEL_LOGICAL as is (the if). And it
> > also obviously repeats to log the same snapshot in a system where the
> > state hasn't changed, but where running->xcnt != 0 or nlocks != 0.

> My understanding from your previous comments was that it would be incorrect
> to do that.

I said:
> For one it breaks cleanup with logical decoding which does *NEED* to
> know that nothing is happening. Although only once, not repeatedly.

The salient point is "Although only once, not repeatedly.". Which is
pretty much same thing as for HS; to become consistent after a
checkpoint.


> Not true. I have listened to everything you've said and been patient with
> the high number of mistakes in your replies.

Simon, this is utterly ridiculous. Missing an if in a post-commit
review, of a hastily committed patch, which hasn't previously been
posted for review, is entirely normal.

Greetings,

Andres Freund


Re: [COMMITTERS] pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server

From
Andres Freund
Date:
On 2016-04-06 14:00:20 +0100, Simon Riggs wrote:
> On 6 April 2016 at 13:48, Michael Paquier <michael.paquier@gmail.com> wrote:
>
> >
> > Yeah... We have reached a clear consensus about the way things should
> > be done after quite a lot of discussions that has gone for a couple of
> > months. And Andres' design on the matter is what is getting approval
> > from everybody who has worked toward addressing this issue while
> > really taking care of the problem at its root.

> Clearly I am not included in your use of the words "we" and "everybody" !

Uh. You're not serious, right? Obviously that refers to the months of
discussions that already had happened. Where you didn't participate,
hence obviously weren't included in "we".

Andres