Thread: Surprising results from tests intended to show the difference in semantics between transaction_timestamp(), statement_timestamp(), and clock_timestamp()

Surprising results from tests intended to show the difference in semantics between transaction_timestamp(), statement_timestamp(), and clock_timestamp()

From

Bryn Llewellyn

Date:

22 September 2021, 02:23:22

I'm surprised by the results that I describe below. Please help me interpret them.

Briefly, why does transaction_timestamp() report a later value than statement_timestamp() when they're both invoked in the very first statement after "start transaction". (They report identical values in an implicitly started txn.)
— — — — — — — — — — — — — — — — — — — — — — — — —

*Introduction*

I'm presently documenting (in the YugabyteDB YSQL section) the various PG date-time functions that report the "current" moment. There are two degrees of freedom: the return data type; and the kind of moment. This would suggest that there's eighteen differently-named functions (considering also timetz—which the PG doc recommends against using). There are, in fact, far fewer. Never mind, timestamptz is almost always the best choice for the data type of the moment; and the three functions that I name in my "subject" return that data type.

I thought that this demo would be simple and convincing:

-- SINGLE STATEMENT TEST (IMPLICIT TXN)

select
  transaction_timestamp() as txn_11,
  statement_timestamp()   as stm_11,
  clock_timestamp()     as clk_11,
  pg_sleep(:slp_1)   as dummy,
  transaction_timestamp() as txn_12,
  statement_timestamp()   as stm_12,
  clock_timestamp()     as clk_12;

-- TWO STATEMENT TEST (EXPLICIT TXN)

start transaction;

select
  transaction_timestamp() as txn_11,
  statement_timestamp()   as stm_11,
  clock_timestamp()     as clk_11,
  pg_sleep(:slp_1)   as dummy,
  transaction_timestamp() as txn_12,
  statement_timestamp()   as stm_12,
  clock_timestamp()     as clk_12;

select pg_sleep(:slp_2);

select
  transaction_timestamp() as txn_2,
  statement_timestamp()   as stm_2,
  clock_timestamp()     as clk_2;

commit;

The output from this takes a huge mental effort to comprehend. So I wrote a naive reporting function:

function expected_tstz_diff_error(
  caption   in text,
  t_finish     in timestamptz,
  t_start   in timestamptz,
  expected_diff  in double precision)
  returns     text

I test what I get like this:

-- SINGLE STATEMENT TEST (IMPLICIT TXN)
select expected_tstz_diff_error('txn_12 - txn_11 = 0',   :s1_txn_12, :s1_txn_12, :zero);
select expected_tstz_diff_error('stm_12 - stm_11 = 0',   :s1_stm_12, :s1_stm_12, :zero);
select expected_tstz_diff_error('txn_11 - stm_11 = 0',   :s1_txn_11, :s1_stm_11, :zero);
select expected_tstz_diff_error('clk_12 - clk_11 = slp_1',   :s1_clk_12, :s1_clk_11, :slp_1);

and this:

-- TWO STATEMENT TEST (EXPLICIT TXN)

-- Within the first "select".
select expected_tstz_diff_error('txn_12 - txn_11 = 0',   :s1_txn_12, :s1_txn_12, :zero);
select expected_tstz_diff_error('stm_12 - stm_11 = 0',   :s1_stm_12, :s1_stm_12, :zero);
select expected_tstz_diff_error('txn_11 - stm_11 = 0',   :s1_txn_11, :s1_stm_11, :zero);
select expected_tstz_diff_error('clk_12 - clk_11 = slp_1',   :s1_clk_12, :s1_clk_11, :slp_1);

-- Within the trasaction but across the first and the second "select".
select expected_tstz_diff_error('txn_2 - txn_11 = 0',     :s2_txn_2,  :s1_txn_11, :zero);
select expected_tstz_diff_error('stm_2 - stm_11 = slp_1 + slp_2', :s2_stm_2,  :s1_stm_11, (:slp_1 + :slp_2));
select expected_tstz_diff_error('clk_2 - clk_11 = slp_1 + slp_2', :s2_clk_2,  :s1_clk_11, (:slp_1 + :slp_2));

The function reports either "exact" or the error in milliseconds.

The runnable code is copied at the very end, after my account of the problem. You can guess that there's a malarkey of quoting with \gset to populate all those psql "substitution variables". But it all works. The design concept is informed by these ideas:

— I can't use a PL/pgSQL encapsulation because, with this, it's all regarded as a single top-level statement and so I can't easily demonstrate the difference in semantics between transaction_timestamp() and statement_timestamp().

— I want to get my results back with the minimum of time-consuming fuss to get the most reliable timings. So I do the reporting after I have all the results to hand in psql "substitution variables".

*Results*

Here's what I get with my PG Version 13.4 installation on my MacBook. (I removed blank lines to improve the readability.)

SINGLE STATEMENT TEST (IMPLICIT TXN)

txn_12 - txn_11 = 0   : exact
stm_12 - stm_11 = 0   : exact
txn_11 - stm_11 = 0   : exact
clk_12 - clk_11 = slp_1   : error:  0.1 ms

TWO STATEMENT TEST (EXPLICIT TXN)

txn_12 - txn_11 = 0   : exact
stm_12 - stm_11 = 0   : exact
txn_11 - stm_11 = 0   : error: -0.1 ms
clk_12 - clk_11 = slp_1   : error:  0.6 ms

txn_2 - txn_11 = 0     : exact
stm_2 - stm_11 = slp_1 + slp_2 : error:  1.6 ms
clk_2 - clk_11 = slp_1 + slp_2 : error:  1.6 ms

*Discussion*

The stand-alone “select" that implicitly starts the txn gives the results that I'd expect.

I expected that, using the explicitly started txn, the values returned by transaction_timestamp() and statement_timestamp() in the first statement would be identical (just as they are with the implicitly started txn) because (as I believe) it's the first substantive SQL statement, and not "set transaction", that actually starts the txn. Else, if "set transaction" really does start the txn, then statement_timestamp() should return a later value than transaction_timestamp(). In fact, it's the other way round.

*Any thoughts on why?*

If I can explain why, with the explicitly started txn, the values from transaction_timestamp() and statement_timestamp() differ in the first "select", then, I suppose, there are no more mysteries in the results from the second "select" because "commit" is bound to take time and I'm not using an estimate for this in my value for "expected_diff".
— — — — — — — — — — — — — — — — — — — — — — — — —

create function expected_tstz_diff_error(
  caption   in text,
  t_finish     in timestamptz,
  t_start   in timestamptz,
  expected_diff  in double precision)
  returns     text
  language plpgsql
as $body$
declare
  dp_zero  constant double precision not null := 0;
  dp_1000  constant double precision not null := 1000;
  err   constant double precision not null :=
( (extract(epoch from t_finish) - extract(epoch from t_start)) - expected_diff )*dp_1000;
  x   constant text not null := rpad(caption, 30)||' : ';
  r     text not null := x||'exact';
begin
  if err <> dp_zero then
  r := x||'error:'||to_char(err, '90.9')||' ms';
  end if;
  return r;
end;
$body$;

-- self-doc
\set AUTOCOMMIT 'on'

\set zero  0
\set slp_1 1
\set slp_2 2

\t on
\o pg-local.txt

-- ===================================================================
select 'SINGLE STATEMENT TEST (IMPLICIT TXN)';
----------------------------------------------------------------------

select
  ''''''''||transaction_timestamp() ::text||'''''''' as txn_11,
  ''''''''||statement_timestamp()   ::text||'''''''' as stm_11,
  ''''''''||clock_timestamp()     ::text||'''''''' as clk_11,
  ''''''''||pg_sleep(:slp_1)   ::text||'''''''' as dummy,
  ''''''''||transaction_timestamp() ::text||'''''''' as txn_12,
  ''''''''||statement_timestamp()   ::text||'''''''' as stm_12,
  ''''''''||clock_timestamp()     ::text||'''''''' as clk_12
\gset s1_

select expected_tstz_diff_error('txn_12 - txn_11 = 0',   :s1_txn_12, :s1_txn_12, :zero);
select expected_tstz_diff_error('stm_12 - stm_11 = 0',   :s1_stm_12, :s1_stm_12, :zero);
select expected_tstz_diff_error('txn_11 - stm_11 = 0',   :s1_txn_11, :s1_stm_11, :zero);
select expected_tstz_diff_error('clk_12 - clk_11 = slp_1',   :s1_clk_12, :s1_clk_11, :slp_1);

-- ===================================================================
select 'TWO STATEMENT TEST (EXPLICIT TXN)';
----------------------------------------------------------------------

start transaction;

select
  ''''''''||transaction_timestamp() ::text||'''''''' as txn_11,
  ''''''''||statement_timestamp()   ::text||'''''''' as stm_11,
  ''''''''||clock_timestamp()     ::text||'''''''' as clk_11,
  ''''''''||pg_sleep(:slp_1)   ::text||'''''''' as dummy,
  ''''''''||transaction_timestamp() ::text||'''''''' as txn_12,
  ''''''''||statement_timestamp()   ::text||'''''''' as stm_12,
  ''''''''||clock_timestamp()     ::text||'''''''' as clk_12
\gset s1_

select pg_sleep(:slp_2);

select
  ''''''''||transaction_timestamp() ::text||'''''''' as txn_2,
  ''''''''||statement_timestamp()   ::text||'''''''' as stm_2,
  ''''''''||clock_timestamp()     ::text||'''''''' as clk_2
\gset s2_

commit;

-- Within the first "select".
select expected_tstz_diff_error('txn_12 - txn_11 = 0',   :s1_txn_12, :s1_txn_12, :zero);
select expected_tstz_diff_error('stm_12 - stm_11 = 0',   :s1_stm_12, :s1_stm_12, :zero);
select expected_tstz_diff_error('txn_11 - stm_11 = 0',   :s1_txn_11, :s1_stm_11, :zero);
select expected_tstz_diff_error('clk_12 - clk_11 = slp_1',   :s1_clk_12, :s1_clk_11, :slp_1);

-- Within the trasaction but across the first and the second "select".
select expected_tstz_diff_error('txn_2 - txn_11 = 0',     :s2_txn_2,  :s1_txn_11, :zero);
select expected_tstz_diff_error('stm_2 - stm_11 = slp_1 + slp_2', :s2_stm_2,  :s1_stm_11, (:slp_1 + :slp_2));
select expected_tstz_diff_error('clk_2 - clk_11 = slp_1 + slp_2', :s2_clk_2,  :s1_clk_11, (:slp_1 + :slp_2));

\o
\t off

Re: Surprising results from tests intended to show the difference in semantics between transaction_timestamp(), statement_timestamp(), and clock_timestamp()

From

"David G. Johnston"

Date:

22 September 2021, 03:10:40

On Tuesday, September 21, 2021, Bryn Llewellyn <bryn@yugabyte.com> wrote:

I'm surprised by the results that I describe below. Please help me interpret them.

Briefly, why does transaction_timestamp() report a later value than statement_timestamp() when they're both invoked in the very first statement after "start transaction". (They report identical values in an implicitly started txn.)

  transaction_timestamp() as txn_11,
  statement_timestamp()   as stm_11,

  t_finish     in timestamptz,
  t_start   in timestamptz,
  expected_diff  in double precision)

select expected_tstz_diff_error('txn_11 - stm_11 = 0',   :s1_txn_11, :s1_stm_11, :zero);

txn_11 - stm_11 = 0   : error: -0.1 ms

err constant double precision not null :=
( (extract(epoch from t_finish) - extract(epoch from t_start)) - expected_diff )*dp_1000;

So, finish is the txn, start is the stm, and expected is 0 - thus if finish (txn) is earlier than start (stm) your error will be negative. Which it is, and since transaction should be earlier than statement this would seem to be correct. In short, you seem to have mis-interpreted the error sign’s meaning.

David J.

Re: Surprising results from tests intended to show the difference in semantics between transaction_timestamp(), statement_timestamp(), and clock_timestamp()

From

Adrian Klaver

Date:

22 September 2021, 03:38:12

On 9/21/21 4:23 PM, Bryn Llewellyn wrote:
> I'm surprised by the results that I describe below. Please help me 
> interpret them.
> 
> Briefly, why does transaction_timestamp() report a later value 
> than statement_timestamp() when they're both invoked in the very first 
> statement after "start transaction". (They report identical values in an 
> implicitly started txn.)

I'm not seeing it:

test(5432)=# begin;

select
   transaction_timestamp() as txn_11,
   statement_timestamp()   as stm_11,
   clock_timestamp()       as clk_11,
   pg_sleep(0.5)        as dummy,
   transaction_timestamp() as txn_12,
   statement_timestamp()   as stm_12,
   clock_timestamp()       as clk_12;

select pg_sleep(1);

select
   transaction_timestamp() as txn_2,
   statement_timestamp()   as stm_2,
   clock_timestamp()       as clk_2;

commit;

BEGIN
-[ RECORD 1 ]-------------------------
txn_11 | 2021-09-21 17:29:18.272042-07
stm_11 | 2021-09-21 17:29:18.2722-07
clk_11 | 2021-09-21 17:29:18.272297-07
dummy  |
txn_12 | 2021-09-21 17:29:18.272042-07
stm_12 | 2021-09-21 17:29:18.2722-07
clk_12 | 2021-09-21 17:29:18.772863-07

-[ RECORD 1 ]
pg_sleep |

-[ RECORD 1 ]------------------------
txn_2 | 2021-09-21 17:29:18.272042-07
stm_2 | 2021-09-21 17:29:19.774436-07
clk_2 | 2021-09-21 17:29:19.77456-07


No matter how many times I run this the transaction_timestamp() is less 
then statement_timestamp().  This is as it should be, as 
transaction_timestamp() fires with the begin; and statement_timestamp() 
fires at select ... and then updates with the subsequent select .... and 
clock_timestamp() just keeps marching on through the whole process.

-- 
Adrian Klaver
adrian.klaver@aklaver.com

Thanks, David! Re: Surprising results from tests intended to show the difference in semantics between transaction_timestamp(), statement_timestamp(), and clock_timestamp()

From

Bryn Llewellyn

Date:

22 September 2021, 03:42:55

David G. Johnston wrote:

Bryn wrote:

I'm surprised by the results that I describe below. Please help me interpret them.

Briefly, why does transaction_timestamp() report a later value than statement_timestamp() when they're both invoked in the very first statement after "start transaction". (They report identical values in an implicitly started txn.)

  transaction_timestamp() as txn_11,
  statement_timestamp()   as stm_11,

  t_finish     in timestamptz,
  t_start   in timestamptz,
  expected_diff  in double precision)

select expected_tstz_diff_error('txn_11 - stm_11 = 0',   :s1_txn_11, :s1_stm_11, :zero);

txn_11 - stm_11 = 0   : error: -0.1 ms

  err   constant double precision not null :=
( (extract(epoch from t_finish) - extract(epoch from t_start)) - expected_diff )*dp_1000;

So, finish is the txn, start is the stm, and expected is 0 - thus if finish (txn) is earlier than start (stm) your error will be negative. Which it is, and since transaction should be earlier than statement this would seem to be correct. In short, you seem to have mis-interpreted the error sign’s meaning.

Thanks, David.

Damn... I'm so embarrassed. I'd been staring at my code for too long and was blind to my stupidity. Just goes to show how important it is to choose names sensibly. "Start" and "finish" were silly (a left-over from a previous use of that code). I changed the names to "t_expected_later" and "t_expected_sooner" — and twizzled my arguments where needed. All is clear now.

In summary, the first statement after the explicit "start transaction" is a bit later than the moment that the explicitly started txn starts—in line with intuition. And, of course, the round trip to invoke pg_sleep() takes time too.

I was talking nonsense when I wrote "commit is bound to take time and I'm not using an estimate for this". Doubtless "commit" does take some time. But it happens after my last recorded time measurement. That, too, was a left-over from a different test. Sigh...

Case closed.

Re: Surprising results from tests intended to show the difference in semantics between transaction_timestamp(), statement_timestamp(), and clock_timestamp()

From

Bryn Llewellyn

Date:

22 September 2021, 03:46:27

Adrian Klaver wrote:

On 9/21/21 4:23 PM, Bryn Llewellyn wrote:
I'm surprised by the results that I describe below. Please help me interpret them.
Briefly, why does transaction_timestamp() report a later value than statement_timestamp() when they're both invoked in the very first statement after "start transaction". (They report identical values in an implicitly started txn.)

I'm not seeing it:

...

No matter how many times I run this the transaction_timestamp() is less then statement_timestamp(). This is as it should be, as transaction_timestamp() fires with the begin; and statement_timestamp() fires at select ... and then updates with the subsequent select .... and clock_timestamp() just keeps marching on through the whole process.

Oh dear...

David Johnson pointed out that I was misinterpreting my results. Sigh... It all makes sense now.

Sorry to have troubled you all with this.