Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: logical decoding and replication of sequences, take 2
Date
Msg-id 0cebc9e8-addf-ac50-474a-071a52731c33@enterprisedb.com
Whole thread Raw
In response to Re: logical decoding and replication of sequences, take 2  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: logical decoding and replication of sequences, take 2  (vignesh C <vignesh21@gmail.com>)
Re: logical decoding and replication of sequences, take 2  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
Hi!

On 3/16/23 08:38, Masahiko Sawada wrote:
> Hi,
> 
> On Wed, Mar 15, 2023 at 9:52 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>>
>>
>> On 3/14/23 08:30, John Naylor wrote:
>>> I tried a couple toy examples with various combinations of use styles.
>>>
>>> Three with "automatic" reading from sequences:
>>>
>>> create table test(i serial);
>>> create table test(i int GENERATED BY DEFAULT AS IDENTITY);
>>> create table test(i int default nextval('s1'));
>>>
>>> ...where s1 has some non-default parameters:
>>>
>>> CREATE SEQUENCE s1 START 100 MAXVALUE 100 INCREMENT BY -1;
>>>
>>> ...and then two with explicit use of s1, one inserting the 'nextval'
>>> into a table with no default, and one with no table at all, just
>>> selecting from the sequence.
>>>
>>> The last two seem to work similarly to the first three, so it seems like
>>> FOR ALL TABLES adds all sequences as well. Is that expected?
>>
>> Yeah, that's a bug - we shouldn't replicate the sequence changes, unless
>> the sequence is actually added to the publication. I tracked this down
>> to a thinko in get_rel_sync_entry() which failed to check the object
>> type when puballtables or puballsequences was set.
>>
>> Attached is a patch fixing this.
>>
>>> The documentation for CREATE PUBLICATION mentions sequence options,
>>> but doesn't really say how these options should be used.
>> Good point. The idea is that we handle tables and sequences the same
>> way, i.e. if you specify 'sequence' then we'll replicate increments for
>> sequences explicitly added to the publication.
>>
>> If this is not clear, the docs may need some improvements.
>>
> 
> I'm late to this thread, but I have some questions and review comments.
> 
> Regarding sequence logical replication, it seems that changes of
> sequence created after CREATE SUBSCRIPTION are applied on the
> subscriber even without REFRESH PUBLICATION command on the subscriber.
> Which is a different behavior than tables. For example, I set both
> publisher and subscriber as follows:
> 
> 1. On publisher
> create publication test_pub for all sequences;
> 
> 2. On subscriber
> create subscription test_sub connection 'dbname=postgres port=5551'
> publication test_pub; -- port=5551 is the publisher
> 
> 3. On publisher
> create sequence s1;
> select nextval('s1');
> 
> I got the error "ERROR:  relation "public.s1" does not exist on the
> subscriber". Probably we need to do should_apply_changes_for_rel()
> check in apply_handle_sequence().
> 

Yes, you're right - the sequence handling should have been calling the
should_apply_changes_for_rel() etc.

The attached 0005 patch should fix that - I still need to test it a bit
more and maybe clean it up a bit, but hopefully it'll allow you to
continue the review.

I had to tweak the protocol a bit, so that this uses the same cache as
tables. I wonder if maybe we should make it even more similar, by
essentially treating sequences as tables with (last_value, log_cnt,
called) columns.

> If my understanding is correct, is there any case where the subscriber
> needs to apply transactional sequence changes? The commit message of
> 0001 patch says:
> 
>     * Changes for sequences created in the same top-level transaction are
>       treated as transactional, i.e. just like any other change from that
>       transaction, and discarded in case of a rollback.
> 
> IIUC such sequences are not visible to the subscriber, so it cannot
> subscribe to them until the commit.
> 

The comment is slightly misleading, as it talks about creation of
sequences, but it should be talking about relfilenodes. For example, if
you create a sequence, add it to publication, and then in a later
transaction you do

   ALTER SEQUENCE x RESTART

or something else that creates a new relfilenode, then the subsequent
increments are visible only in that transaction. But we still need to
apply those on the subscriber, but only as part of the transaction,
because it might roll back.

> ---
> I got an assertion failure. The reproducible steps are:
> 

I do believe this was due to a thinko in apply_handle_sequence, which
sometimes started transaction and didn't terminate it correctly. I've
changed it to use the begin_replication_step() etc. and it seems to be
working fine now.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: gcc 13 warnings
Next
From: Nathan Bossart
Date:
Subject: Re: improving user.c error messages