Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: logical decoding and replication of sequences, take 2
Date
Msg-id f5a9d63d-a6fe-59a9-d1ed-38f6a5582c13@enterprisedb.com
Whole thread Raw
In response to Re: logical decoding and replication of sequences, take 2  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: logical decoding and replication of sequences, take 2
List pgsql-hackers

On 3/17/23 06:53, John Naylor wrote:
> On Wed, Mar 15, 2023 at 7:51 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
> wrote:
>>
>>
>>
>> On 3/14/23 08:30, John Naylor wrote:
>> > I tried a couple toy examples with various combinations of use styles.
>> >
>> > Three with "automatic" reading from sequences:
>> >
>> > create table test(i serial);
>> > create table test(i int GENERATED BY DEFAULT AS IDENTITY);
>> > create table test(i int default nextval('s1'));
>> >
>> > ...where s1 has some non-default parameters:
>> >
>> > CREATE SEQUENCE s1 START 100 MAXVALUE 100 INCREMENT BY -1;
>> >
>> > ...and then two with explicit use of s1, one inserting the 'nextval'
>> > into a table with no default, and one with no table at all, just
>> > selecting from the sequence.
>> >
>> > The last two seem to work similarly to the first three, so it seems like
>> > FOR ALL TABLES adds all sequences as well. Is that expected?
>>
>> Yeah, that's a bug - we shouldn't replicate the sequence changes, unless
>> the sequence is actually added to the publication. I tracked this down
>> to a thinko in get_rel_sync_entry() which failed to check the object
>> type when puballtables or puballsequences was set.
>>
>> Attached is a patch fixing this.
> 
> Okay, I can verify that with 0001-0006, sequences don't replicate unless
> specified. I do see an additional change that doesn't make sense: On the
> subscriber I no longer see a jump to the logged 32 increment, I see the
> very next value:
> 
> # alter system set wal_level='logical';
> # port 7777 is subscriber
> 
> echo
> echo "PUB:"
> psql -c "drop table if exists test;"
> psql -c "drop publication if exists pub1;"
> 
> echo
> echo "SUB:"
> psql -p 7777 -c "drop table if exists test;"
> psql -p 7777 -c "drop subscription if exists sub1 ;"
> 
> echo
> echo "PUB:"
> psql -c "create table test(i int GENERATED BY DEFAULT AS IDENTITY);"
> psql -c "CREATE PUBLICATION pub1 FOR ALL TABLES;"
> psql -c "CREATE PUBLICATION pub2 FOR ALL SEQUENCES;"
> 
> echo
> echo "SUB:"
> psql -p 7777 -c "create table test(i int GENERATED BY DEFAULT AS IDENTITY);"
> psql -p 7777 -c "CREATE SUBSCRIPTION sub1 CONNECTION 'host=localhost
> dbname=postgres application_name=sub1 port=5432' PUBLICATION pub1;"
> psql -p 7777 -c "CREATE SUBSCRIPTION sub2 CONNECTION 'host=localhost
> dbname=postgres application_name=sub2 port=5432' PUBLICATION pub2;"
> 
> echo
> echo "PUB:"
> psql -c "insert into test default values;"
> psql -c "insert into test default values;"
> psql -c "select * from test;"
> psql -c "select * from test_i_seq;"
> 
> sleep 1
> 
> echo
> echo "SUB:"
> psql -p 7777 -c "select * from test;"
> psql -p 7777 -c "select * from test_i_seq;"
> 
> psql -p 7777 -c "drop subscription sub1 ;"
> psql -p 7777 -c "drop subscription sub2 ;"
> 
> psql -p 7777 -c "insert into test default values;"
> psql -p 7777 -c "select * from test;"
> psql -p 7777 -c "select * from test_i_seq;"
> 
> The last two queries on the subscriber show:
> 
>  i
> ---
>  1
>  2
>  3
> (3 rows)
> 
>  last_value | log_cnt | is_called
> ------------+---------+-----------
>           3 |      30 | t
> (1 row)
> 
> ...whereas before with 0001-0003 I saw:
> 
>  i  
> ----
>   1
>   2
>  34
> (3 rows)
> 
>  last_value | log_cnt | is_called
> ------------+---------+-----------
>          34 |      32 | t
> 

Oh, this is a silly thinko in how sequences are synced at the beginning
(or maybe a combination of two issues).

fetch_sequence_data() simply runs a select from the sequence

    SELECT last_value, log_cnt, is_called

but that's wrong, because that's the *current* state of the sequence, at
the moment it's initially synced. We to make this "correct" with respect
to the decoding, we'd need to deduce what was the last WAL record, so
something like

    last_value += log_cnt + 1

That should produce 34 again.

FWIW the older patch has this issue too, I believe the difference is
merely due to a slightly different timing between the sync and decoding
the first insert. If you insert a sleep after the CREATE SUBSCRIPTION
commands, it should disappear.


This however made me realize the initial sync of sequences may not be
correct. I mean, the idea of tablesync is syncing the data in REPEATABLE
READ transaction, and then applying decoded changes. But sequences are
not transactional in this way - if you select from a sequence, you'll
always see the latest data, even in REPEATABLE READ.

I wonder if this might result in losing some of the sequence increments,
and/or applying them in the wrong order (so that the sequence goes
backward for a while).


>> > The documentation for CREATE PUBLICATION mentions sequence options,
>> > but doesn't really say how these options should be used.
>> Good point. The idea is that we handle tables and sequences the same
>> way, i.e. if you specify 'sequence' then we'll replicate increments for
>> sequences explicitly added to the publication.
>>
>> If this is not clear, the docs may need some improvements.
> 
> Aside from docs, I'm not clear what some of the tests are doing:
> 
> +CREATE PUBLICATION testpub_forallsequences FOR ALL SEQUENCES WITH
> (publish = 'sequence');
> +RESET client_min_messages;
> +ALTER PUBLICATION testpub_forallsequences SET (publish = 'insert,
> sequence');
> 
> What does it mean to add 'insert' to a sequence publication?
> 

I don't recall why this particular test exists, but you can still add
tables to "for all sequences" publication. IMO it's fine to allow adding
actions that are irrelevant for currently published objects, we don't
have a cross-check to prevent that (how would you even do that e.g. for
FOR ALL TABLES publications?).

> Likewise, from a brief change in my test above, 'sequence' seems to be a
> noise word for table publications. I'm not fully read up on the
> background of this topic, but wanted to make sure I understood the
> design of the syntax.
> 

I think it's fine, for the same reason as above.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_dump versus hash partitioning
Next
From: Jeff Davis
Date:
Subject: Re: ICU locale validation / canonicalization