Commit Timestamp and LSN Inversion issue - Mailing list pgsql-hackers

From shveta malik
Subject Commit Timestamp and LSN Inversion issue
Date
Msg-id CAJpy0uBxEJnabEp3JS=n9X19Vx2ZK3k5AR7N0h-cSMtOwYV3fA@mail.gmail.com
Whole thread Raw
Responses Re: Commit Timestamp and LSN Inversion issue
List pgsql-hackers
Hello hackers,
(Cc people involved in the earlier discussion)

I would like to discuss the $Subject.

While discussing Logical Replication's Conflict Detection and
Resolution (CDR) design in [1] , it came to  our notice that the
commit LSN and timestamp may not correlate perfectly i.e. commits may
happen with LSN1 < LSN2 but with Ts1 > Ts2. This issue may arise
because, during the commit process, the timestamp (xactStopTimestamp)
is captured slightly earlier than when space is reserved in the WAL.

 ~~

 Reproducibility of conflict-resolution problem due to the timestamp inversion
------------------------------------------------
It was suggested that timestamp inversion *may* impact the time-based
resolutions such as last_update_wins (targeted to be implemented in
[1]) as we may end up making wrong decisions if timestamps and LSNs
are not correctly ordered. And thus we tried some tests but failed to
find any practical scenario where it could be a problem.

Basically, the proposed conflict resolution is a row-level resolution,
and to cause the row value to be inconsistent, we need to modify the
same row in concurrent transactions and commit the changes
concurrently. But this doesn't seem possible because concurrent
updates on the same row are disallowed (e.g., the later update will be
blocked due to the row lock).  See [2] for the details.

We tried to give some thoughts on multi table cases as well e.g.,
update table A with foreign key and update the table B that table A
refers to. But update on table A will block the update on table B as
well, so we could not reproduce data-divergence due to the
LSN/timestamp mismatch issue there.

 ~~

Idea proposed to fix the timestamp inversion issue
------------------------------------------------
There was a suggestion in [3] to acquire the timestamp while reserving
the space (because that happens in LSN order). The clock would need to
be monotonic (easy enough with CLOCK_MONOTONIC), but also cheap. The
main problem why it's being done outside the critical section, because
gettimeofday() may be quite expensive. There's a concept of hybrid
clock, combining "time" and logical counter, which might be useful
independently of CDR.

On further analyzing this idea, we found that CLOCK_MONOTONIC can be
accepted only by clock_gettime() which has more precision than
gettimeofday() and thus is equally or more expensive theoretically (we
plan to test it and post the results). It does not look like a good
idea to call any of these when holding spinlock to reserve the wal
position.  As for the suggested solution "hybrid clock", it might not
help here because the logical counter is only used to order the
transactions with the same timestamp. The problem here is how to get
the timestamp along with wal position
reservation(ReserveXLogInsertLocation).

 ~~

 We can explore further but as we are not able to see any real-time
scenario where this could actually be problem, it may or may not be
worth to spend time on this. Thoughts?


[1]:
(See: "How is this going to deal with the fact that commit LSN and
timestamps may not correlate perfectly?").
https://www.postgresql.org/message-id/CAJpy0uBWBEveM8LO2b7wNZ47raZ9tVJw3D2_WXd8-b6LSqP6HA%40mail.gmail.com

[2]:
https://www.postgresql.org/message-id/CAA4eK1JTMiBOoGqkt%3DaLPLU8Rs45ihbLhXaGHsz8XC76%2BOG3%2BQ%40mail.gmail.com

[3]:
(See: "The clock would need to be monotonic (easy enough with
CLOCK_MONOTONIC").
https://www.postgresql.org/message-id/a3a70a19-a35e-426c-8646-0898cdc207c8%40enterprisedb.com

thanks
Shveta



pgsql-hackers by date:

Previous
From: Bertrand Drouvot
Date:
Subject: Re: Parallel workers stats in pg_stat_database
Next
From: Dean Rasheed
Date:
Subject: Re: Optimize mul_var() for var1ndigits >= 8