On 3/17/23 18:55, Tomas Vondra wrote:
>
> ...
>
> This however made me realize the initial sync of sequences may not be
> correct. I mean, the idea of tablesync is syncing the data in REPEATABLE
> READ transaction, and then applying decoded changes. But sequences are
> not transactional in this way - if you select from a sequence, you'll
> always see the latest data, even in REPEATABLE READ.
>
> I wonder if this might result in losing some of the sequence increments,
> and/or applying them in the wrong order (so that the sequence goes
> backward for a while).
>
Yeah, I think my suspicion was warranted - it's pretty easy to make the
sequence go backwards for a while by adding a sleep between the slot
creation and the copy_sequence() call, and increment the sequence in
between (enough to do some WAL logging).
The copy_sequence() then reads the current on-disk state (because of the
non-transactional nature w.r.t. REPEATABLE READ), applies it, and then
we start processing the WAL added since the slot creation. But those are
older, so stuff like this happens:
21:52:54.147 CET [35404] WARNING: copy_sequence 1222 0 1
21:52:54.163 CET [35404] WARNING: apply_handle_sequence 990 0 1
21:52:54.163 CET [35404] WARNING: apply_handle_sequence 1023 0 1
21:52:54.163 CET [35404] WARNING: apply_handle_sequence 1056 0 1
21:52:54.174 CET [35404] WARNING: apply_handle_sequence 1089 0 1
21:52:54.174 CET [35404] WARNING: apply_handle_sequence 1122 0 1
21:52:54.174 CET [35404] WARNING: apply_handle_sequence 1155 0 1
21:52:54.174 CET [35404] WARNING: apply_handle_sequence 1188 0 1
21:52:54.175 CET [35404] WARNING: apply_handle_sequence 1221 0 1
21:52:54.898 CET [35402] WARNING: apply_handle_sequence 1254 0 1
Clearly, for sequences we can't quite rely on snapshots/slots, we need
to get the LSN to decide what changes to apply/skip from somewhere else.
I wonder if we can just ignore the queued changes in tablesync, but I
guess not - there can be queued increments after reading the sequence
state, and we need to apply those. But maybe we could use the page LSN
from the relfilenode - that should be the LSN of the last WAL record.
Or maybe we could simply add pg_current_wal_insert_lsn() into the SQL we
use to read the sequence state ...
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company