Thread: Do I understand commit timestamps correctly?

Do I understand commit timestamps correctly?

From

Chapman Flack

Date:

23 March 2018, 21:29:35

Hi,

Can somebody confirm or correct what I (think I)'ve gleaned from
the code?

- Commit timestamps are always WAL logged, and so in principle
  determinable after the fact (with some amount of effort), regardless
  of the track_commit_timestamp setting. (I guess this must have long
  been true, to support recovery_target_time.)

- The extra machinery turned on by track_commit_timestamp maintains
  a cache of recent ones so they can be efficiently queried from SQL
  in normal operation.

? Given a base backup and a bunch of WAL from a cluster that had
  track_commit_timestamps turned off, is it possible (in principle?)
  to do a PITR with the switch turned on, and have the commit_ts
  cache get populated (at least from the transactions encountered
  in the WAL)? Could that be done by changing the setting in
  postgresql.conf for the recovery, or would it take something more
  invasive, like poking the value in pg_control? Or would that just
  make something fall over? Would it require dummying up some commit_ts
  files first?

Thanks,
-Chap

Re: Do I understand commit timestamps correctly?

From

Alvaro Herrera

Date:

23 March 2018, 21:40:37

Chapman Flack wrote:
> Hi,
> 
> Can somebody confirm or correct what I (think I)'ve gleaned from
> the code?
> 
> - Commit timestamps are always WAL logged, and so in principle
>   determinable after the fact (with some amount of effort), regardless
>   of the track_commit_timestamp setting. (I guess this must have long
>   been true, to support recovery_target_time.)

Right.

> - The extra machinery turned on by track_commit_timestamp maintains
>   a cache of recent ones so they can be efficiently queried from SQL
>   in normal operation.

Yes.

> ? Given a base backup and a bunch of WAL from a cluster that had
>   track_commit_timestamps turned off, is it possible (in principle?)
>   to do a PITR with the switch turned on, and have the commit_ts
>   cache get populated (at least from the transactions encountered
>   in the WAL)? Could that be done by changing the setting in
>   postgresql.conf for the recovery, or would it take something more
>   invasive, like poking the value in pg_control? Or would that just
>   make something fall over? Would it require dummying up some commit_ts
>   files first?

I don't remember if this is explicitly supported, but yeah AFAIR it
should work to just start the "promoted standby" (in your case just a
recovered backup) on after setting the option in postgresql.conf.  This
is because StartupCommitTs() activates the commit_ts module just before
starting recovery.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Do I understand commit timestamps correctly?

From

Chapman Flack

Date:

13 May 2020, 07:28:20

On 03/23/18 11:40, Alvaro Herrera wrote:
> Chapman Flack wrote:
>> ? Given a base backup and a bunch of WAL from a cluster that had
>>   track_commit_timestamps turned off, is it possible (in principle?)
>>   to do a PITR with the switch turned on, and have the commit_ts
>>   cache get populated (at least from the transactions encountered
>>   in the WAL)? Could that be done by changing the setting in
>>   postgresql.conf for the recovery, or would it take something more
>>   invasive, like poking the value in pg_control? Or would that just
>>   make something fall over? Would it require dummying up some commit_ts
>>   files first?
> 
> I don't remember if this is explicitly supported, but yeah AFAIR it
> should work to just start the "promoted standby" (in your case just a
> recovered backup) on after setting the option in postgresql.conf.  This
> is because StartupCommitTs() activates the commit_ts module just before
> starting recovery.

Getting around to trying it out, simply changing the setting in
postgresql.conf before starting the server does not seem sufficient:
once it comes online, it has track_commit_timestamp on, but has not
populated the cache from transactions it applied during recovery.

On the other hand, changing the setting in postgresql.conf *and*
poking a 1 in the track_commit_timestamp byte in pg_control,
and fudging the CRC accordingly, *then* starting the server, does
seem to do just as I had hoped. Nothing seems to complain or fall over,
and the transactions recovered from WAL now have timestamps visible
with pg_xact_commit_timestamp().

Regards,
-Chap

Re: Do I understand commit timestamps correctly?

From

Alvaro Herrera

Date:

13 May 2020, 18:28:40

On 2020-May-13, Chapman Flack wrote:

> Getting around to trying it out, simply changing the setting in
> postgresql.conf before starting the server does not seem sufficient:
> once it comes online, it has track_commit_timestamp on, but has not
> populated the cache from transactions it applied during recovery.

Ah.  I had not realized that that was what you wanted to do.

> On the other hand, changing the setting in postgresql.conf *and*
> poking a 1 in the track_commit_timestamp byte in pg_control,
> and fudging the CRC accordingly, *then* starting the server, does
> seem to do just as I had hoped. Nothing seems to complain or fall over,
> and the transactions recovered from WAL now have timestamps visible
> with pg_xact_commit_timestamp().

Nice hack.  I guess you'd sooner have a supported way to enable the bit
in pg_control.  Is there a use case for this?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

document? Re: Do I understand commit timestamps correctly?

From

Chapman Flack

Date:

14 May 2020, 00:17:42

On 03/23/18 11:29, Chapman Flack wrote:
> Can somebody confirm or correct what I (think I)'ve gleaned from
> the code?
> ...
> - The extra machinery turned on by track_commit_timestamp maintains
>   a cache of recent ones so they can be efficiently queried from SQL
>   in normal operation.

I guess the depth of this cache is determined by the setting of
autovacuum_freeze_max_age. This is covered (including arithmetic
for sizing the cache) in a paragraph part-way down the
vacuum-for-wraparound sect2 in the maintenance chapter.

It isn't mentioned at the description of track_commit_timestamp,
pg_xact_commit_timestamp(), or autovacuum_freeze_max_age.

Should it be?

A minimalist doc addition could be to add a link from the description
of track_commit_timestamp to that of autovacuum_freeze_max_age.

-> The autovacuum_freeze_max_age description already has a link
to the maintenance section (but could stand a few words added saying
it also matters for commit timestamps).

The description for pg_xact_commit_timestamp() and pg_last_committed_xact()
already has a link to track_commit_timestamp (and as that defaults to off,
anyone interested in the functionality likely to follow the link, so the
trail of breadcrumbs would be complete). The language "only for transactions
that were committed after it was enabled" could perhaps also say "and
have not yet been frozen".

I could send a patch if this seems worth documenting.

Regards,
-Chap