Re: tracking commit timestamps - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: tracking commit timestamps
Date
Msg-id CAB7nPqTW3LfHvhzw3TXOh6JMHnTUhSLcK85+vqUNrvkTeoD7ag@mail.gmail.com
Whole thread Raw
In response to Re: tracking commit timestamps  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: tracking commit timestamps
Re: tracking commit timestamps
List pgsql-hackers


On Sat, Nov 1, 2014 at 10:00 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
More comments:
 
I have done a couple of tests on my laptop with pgbench like that to generate a maximum of transaction commits:
$ pgbench --no-vacuum -f ~/Desktop/commit.sql -T 60 -c 24 -j 24
$ cat ~/Desktop/commit.sql
SELECT txid_current()
Here is an average of 5 runs:
- master: 49842.44
- GUC off, patched = 49688.71
- GUC on, patched = 49459.73
So there is little noise.

Here are also more comments about the code that I found while focusing on committs.c:
1) When the GUC is not enabled, and when the transaction ID provided is not in a correct range, a dummy value based on InvalidTransactionId (?!) is returned, like that:
+       /* Return empty if module not enabled */
+       if (!commit_ts_enabled)
+       {
+               if (ts)
+                       *ts = InvalidTransactionId;
+               if (data)
+                       *data = (CommitExtraData) 0;
+               return;
+       }
This leads to some incorrect results:
=# select pg_get_transaction_committime('1');
 pg_get_transaction_committime
-------------------------------
 2000-01-01 09:00:00+09
(1 row)
Or for example:
=# SELECT txid_current();
 txid_current
--------------
         1006
(1 row)
=# select pg_get_transaction_committime('1006');
 pg_get_transaction_committime
-------------------------------
 2014-11-04 14:54:37.589381+09
(1 row)
=# select pg_get_transaction_committime('1007');
 pg_get_transaction_committime
-------------------------------
 2000-01-01 09:00:00+09
(1 row)
=# SELECT txid_current();
 txid_current
--------------
         1007
(1 row)
And at other times error is not that helpful for user:
=# select pg_get_transaction_committime('10000');
ERROR:  XX000: could not access status of transaction 10000
DETAIL:  Could not read from file "pg_committs/0000" at offset 114688: Undefined error: 0.
LOCATION:  SlruReportIOError, slru.c:896
I think that you should simply return an error in TransactionIdGetCommitTsData when xid is not in the range supported. That will be more transparent for the user.
2) You may as well want to return a different error if the GUC track_commit_timestamps is disabled.
3) This comment should be updated in committs.c, we are not talking about CLOG here:
+/*
+ * Link to shared-memory data structures for CLOG control
+ */
4) Similarly, I think more comments should be put in here. It is OK to truncate the commit timestamp stuff similarly to CLOG to have a consistent status context available, but let's explain it.
         * multixacts; that will be done by the next checkpoint.
         */
        TruncateCLOG(frozenXID);
+       TruncateCommitTs(frozenXID)
5) Reading the code, TransactionTreeSetCommitTimestamp is always called with do_xlog = false, making that actually no timestamps are WAL'd... Hence WriteSetTimestampXlogRec is just dead code with the latest version of the patch. IMO, this should be always WAL-logged when track_commit_timestamp is on.
6) Shouldn't any value update of track_commit_timestamp be tracked in XLogReportParameters? That's thinking about making the commit timestamp available on standbys as well..
7) pg_xlogdump has no support for RM_COMMITTS_ID, something that would be useful for developers.
8) The redo and xlog routines of committs should be out in a dedicated file, like committsxlog.c or similar. The other rmgr do so, so let's be consistent.

Regards,
--
Michael

pgsql-hackers by date:

Previous
From: Craig Ringer
Date:
Subject: Re: Repeatable read and serializable transactions see data committed after tx start
Next
From: Andres Freund
Date:
Subject: Re: tracking commit timestamps