Thread: issue with track_commit_timestamp and server restart

issue with track_commit_timestamp and server restart

From
Julien Rouhaud
Date:
I just noticed that if track_commit_timestamp is enabled, the
oldestCommitTsXid and newestCommitTsXid don't persist after a server
restart, so you can't ask for the commit ts of a transaction that
committed before the last server start, although the information is
still available (unless of course if a freeze occured).  AFAICT it
always behave this way.

I'm not familiar with that code, but it looks like these counters are
supposed to be restored in StartupXLOG(), with a call to
SetCommitTsLimit(). However, this function doesn't store the new value
if ShmemVariableCache->oldestCommitTsXid is InvalidTransactionId, which
is the initial value.

So the counters are initialized on a subsequent call of
ActivateCommitTs(), which does:

    if (ShmemVariableCache->oldestCommitTsXid == InvalidTransactionId)
    {
        ShmemVariableCache->oldestCommitTsXid =
            ShmemVariableCache->newestCommitTsXid = ReadNewTransactionId();
    }

(but doesn't try to cleanup the SLRU directory btw)

leaving any previous transaction unreachable.  Simple attached patch
seems to fix the issue.  I tried on a master and a replica, enabling and
disabling track_commit_timestamp, and everything seemed to work as intended.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Attachment

Re: issue with track_commit_timestamp and server restart

From
Craig Ringer
Date:
On 22 October 2016 at 19:51, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:
> I just noticed that if track_commit_timestamp is enabled, the
> oldestCommitTsXid and newestCommitTsXid don't persist after a server
> restart, so you can't ask for the commit ts of a transaction that
> committed before the last server start, although the information is
> still available (unless of course if a freeze occured).  AFAICT it
> always behave this way.

I initially could'n't see this when tested on 8f1fb7d with a
src/test/recovery/t test script. But it turns out it's OK on immediate
shutdown and crash recovery, but not on clean shutdown and restart.

The attached patch adds a TAP test to src/test/recovery to show this.
If you run the TAP test before recompiling with the fix it'll fail.
"make" to apply the fix, then re-run and it'll succeed. Or just
temporarily roll back the fix with:

    git checkout HEAD^1 -- src/backend/access/transam/commit_ts.c
    git reset src/backend/access/transam/commit_ts.c

and rebuild to see it fail.

To save time running the recovery suite, just

   rm src/test/recovery/00[0-8]*.pl

(It'd be nice to have a prove_check target to run just one test file).

This would explain a few issues I've seen reported with BDR from the
community which I've so far been unable to reproduce, so thanks very
much for the report.

Álvaro, would you mind checking this and committing to HEAD and 9.6?
The commits.c change only should also be committed to 9.5, but not the
TAP test.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: issue with track_commit_timestamp and server restart

From
Michael Paquier
Date:
(thread hijacking)

On Mon, Oct 24, 2016 at 1:58 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> To save time running the recovery suite, just
>
>    rm src/test/recovery/00[0-8]*.pl
>
> (It'd be nice to have a prove_check target to run just one test file).

Agreed! Or multiple chosen files. I usually remove those files
manually from time to time when working on a fix just to run one
dedicated test repeatedly.
-- 
Michael



Re: issue with track_commit_timestamp and server restart

From
Craig Ringer
Date:
On 24 October 2016 at 12:58, Craig Ringer <craig@2ndquadrant.com> wrote:

> The attached patch adds a TAP test to src/test/recovery to show this.

Added to CF.

https://commitfest.postgresql.org/11/834/

... but IMO this should go in the next bugfix release.

I've also applied nearly the same fix for the same bug in the original
commit timestamp implementation in the BDR Postgres tree.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: issue with track_commit_timestamp and server restart

From
Julien Rouhaud
Date:
On 24/10/2016 06:58, Craig Ringer wrote:
> On 22 October 2016 at 19:51, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:
>> I just noticed that if track_commit_timestamp is enabled, the
>> oldestCommitTsXid and newestCommitTsXid don't persist after a server
>> restart, so you can't ask for the commit ts of a transaction that
>> committed before the last server start, although the information is
>> still available (unless of course if a freeze occured).  AFAICT it
>> always behave this way.
> 
> I initially could'n't see this when tested on 8f1fb7d with a
> src/test/recovery/t test script. But it turns out it's OK on immediate
> shutdown and crash recovery, but not on clean shutdown and restart.
> 
> The attached patch adds a TAP test to src/test/recovery to show this.
> If you run the TAP test before recompiling with the fix it'll fail.
> "make" to apply the fix, then re-run and it'll succeed. Or just
> temporarily roll back the fix with:
> 
>     git checkout HEAD^1 -- src/backend/access/transam/commit_ts.c
>     git reset src/backend/access/transam/commit_ts.c
> 
> and rebuild to see it fail.
> 
> To save time running the recovery suite, just
> 
>    rm src/test/recovery/00[0-8]*.pl
> 
> (It'd be nice to have a prove_check target to run just one test file).
> 
> This would explain a few issues I've seen reported with BDR from the
> community which I've so far been unable to reproduce, so thanks very
> much for the report.
> 
> Álvaro, would you mind checking this and committing to HEAD and 9.6?
> The commits.c change only should also be committed to 9.5, but not the
> TAP test.
> 

Thanks a lot for the review, and adding the tests!


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org



Re: issue with track_commit_timestamp and server restart

From
Alvaro Herrera
Date:
Craig Ringer wrote:

> I initially could'n't see this when tested on 8f1fb7d with a
> src/test/recovery/t test script. But it turns out it's OK on immediate
> shutdown and crash recovery, but not on clean shutdown and restart.

Oh, oops.

> The attached patch adds a TAP test to src/test/recovery to show this.

I moved it to src/test/modules/commit_ts and pushed.  src/test/recovery
is not currently run by buildfarm, except hamster, unless the bf client
has been patched since I last looked.  But that's not the reason I moved
it, but rather because the other location seemed better.  I also fixed
the wrongly pasted comment at the top of the file while at it.

If this disrupts the buildfarm in any way I will just revert it
immediately.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: issue with track_commit_timestamp and server restart

From
Alvaro Herrera
Date:
Now let's discuss a release note entry for 9.5 and 9.6, shall we?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: issue with track_commit_timestamp and server restart

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Now let's discuss a release note entry for 9.5 and 9.6, shall we?

The text in your commit message seems sufficient from here.
On it now (although if somebody doesn't fix anon git PDQ, we're
not releasing today anyway).
        regards, tom lane