Re: Logical Replication of sequences - Mailing list pgsql-hackers

From vignesh C
Subject Re: Logical Replication of sequences
Date
Msg-id CALDaNm0Tj+zn1_RyxRQMxRBtBuO9zbipKJ4o0u0vXc97+pJFzQ@mail.gmail.com
Whole thread Raw
In response to Re: Logical Replication of sequences  (shveta malik <shveta.malik@gmail.com>)
List pgsql-hackers
On Wed, 7 Aug 2024 at 10:27, shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Aug 5, 2024 at 10:26 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Thu, 1 Aug 2024 at 04:25, Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> > > Hi Vignesh,
> > >
> > > I noticed that when replicating sequences (using the latest patches
> > > 0730_2*)  the subscriber-side checks the *existence* of the sequence,
> > > but apparently it is not checking other sequence attributes.
> > >
> > > For example, consider:
> > >
> > > Publisher: "CREATE SEQUENCE s1 START 1 INCREMENT 2;" should be a
> > > sequence of only odd numbers.
> > > Subscriber: "CREATE SEQUENCE s1 START 2 INCREMENT 2;" should be a
> > > sequence of only even numbers.
> > >
> > > Because the names match, currently the patch allows replication of the
> > > s1 sequence. I think that might lead to unexpected results on the
> > > subscriber. IMO it might be safer to report ERROR unless the sequences
> > > match properly (i.e. not just a name check).
> > >
> > > Below is a demonstration the problem:
> > >
> > > ==========
> > > Publisher:
> > > ==========
> > >
> > > (publisher sequence is odd numbers)
> > >
> > > test_pub=# create sequence s1 start 1 increment 2;
> > > CREATE SEQUENCE
> > > test_pub=# select * from nextval('s1');
> > >  nextval
> > > ---------
> > >        1
> > > (1 row)
> > >
> > > test_pub=# select * from nextval('s1');
> > >  nextval
> > > ---------
> > >        3
> > > (1 row)
> > >
> > > test_pub=# select * from nextval('s1');
> > >  nextval
> > > ---------
> > >        5
> > > (1 row)
> > >
> > > test_pub=# CREATE PUBLICATION pub1 FOR ALL SEQUENCES;
> > > CREATE PUBLICATION
> > > test_pub=#
> > >
> > > ==========
> > > Subscriber:
> > > ==========
> > >
> > > (subscriber sequence is even numbers)
> > >
> > > test_sub=# create sequence s1 start 2 increment 2;
> > > CREATE SEQUENCE
> > > test_sub=# SELECT * FROM nextval('s1');
> > >  nextval
> > > ---------
> > >        2
> > > (1 row)
> > >
> > > test_sub=# SELECT * FROM nextval('s1');
> > >  nextval
> > > ---------
> > >        4
> > > (1 row)
> > >
> > > test_sub=# SELECT * FROM nextval('s1');
> > >  nextval
> > > ---------
> > >        6
> > > (1 row)
> > >
> > > test_sub=# CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=test_pub'
> > > PUBLICATION pub1;
> > > 2024-08-01 08:43:04.198 AEST [24325] WARNING:  subscriptions created
> > > by regression test cases should have names starting with "regress_"
> > > WARNING:  subscriptions created by regression test cases should have
> > > names starting with "regress_"
> > > NOTICE:  created replication slot "sub1" on publisher
> > > CREATE SUBSCRIPTION
> > > test_sub=# 2024-08-01 08:43:04.294 AEST [26240] LOG:  logical
> > > replication apply worker for subscription "sub1" has started
> > > 2024-08-01 08:43:04.309 AEST [26244] LOG:  logical replication
> > > sequence synchronization worker for subscription "sub1" has started
> > > 2024-08-01 08:43:04.323 AEST [26244] LOG:  logical replication
> > > synchronization for subscription "sub1", sequence "s1" has finished
> > > 2024-08-01 08:43:04.323 AEST [26244] LOG:  logical replication
> > > sequence synchronization worker for subscription "sub1" has finished
> > >
> > > (after the CREATE SUBSCRIPTION we are getting replicated odd values
> > > from the publisher, even though the subscriber side sequence was
> > > supposed to be even numbers)
> > >
> > > test_sub=# SELECT * FROM nextval('s1');
> > >  nextval
> > > ---------
> > >        7
> > > (1 row)
> > >
> > > test_sub=# SELECT * FROM nextval('s1');
> > >  nextval
> > > ---------
> > >        9
> > > (1 row)
> > >
> > > test_sub=# SELECT * FROM nextval('s1');
> > >  nextval
> > > ---------
> > >       11
> > > (1 row)
> > >
> > > (Looking at the description you would expect odd values for this
> > > sequence to be impossible)
>
> I see that for such even sequences, user can still do 'setval'  to a
> odd number and then nextval will keep on returning odd value.
>
> postgres=# SELECT nextval('s1');
>        6
>
> postgres=SELECT setval('s1', 43);
>      43
>
> postgres=# SELECT nextval('s1');
>      45
>
> > > test_sub=# \dS+ s1
> > >                              Sequence "public.s1"
> > >   Type  | Start | Minimum |       Maximum       | Increment | Cycles? | Cache
> > > --------+-------+---------+---------------------+-----------+---------+-------
> > >  bigint |     2 |       1 | 9223372036854775807 |         2 | no      |     1
> >
> > Even if we check the sequence definition during the CREATE
> > SUBSCRIPTION/ALTER SUBSCRIPTION ... REFRESH PUBLICATION or ALTER
> > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES commands, there's still
> > a chance that the sequence definition might change after the command
> > has been executed. Currently, there's no mechanism to lock a sequence,
> > and we also permit replication of table data even if the table
> > structures differ, such as mismatched data types like int and
> > smallint. I have modified it to log a warning to inform users that the
> > sequence options on the publisher and subscriber are not the same and
> > advise them to ensure that the sequence definitions are consistent
> > between both.
> > The v20240805 version patch attached at [1] has the changes for the same.
> > [1] - https://www.postgresql.org/message-id/CALDaNm1Y_ot-jFRfmtwDuwmFrgSSYHjVuy28RspSopTtwzXy8w%40mail.gmail.com
>
> The behavior for applying is no different from setval. Having said
> that, I agree that sequence definition can change even after the
> subscription creation, but earlier we were not syncing sequences and
> thus the value of a particular sequence was going to remain in the
> range/pattern defined by its attributes unless user sets it manually
> using setval. But now, it is being changed in the background without
> user's knowledge.
> The table case is different. In case of table replication, if we have
> CHECK constraint or say primary-key etc, then the value which violates
> these constraints will never be  inserted to a table even during
> replication on sub. For sequences, parameters (MIN,MAX, START,
> INCREMENT) can be considered similar to check-constraints, the only
> difference is during apply, we are still overriding these and copying
> pub's value. May be such inconsistencies detection can be targeted
> later in next project. But for the time being, it will be good to add
> a 'caveat' section in doc mentioning all such cases. The scope of this
> project should be clearly documented.

I have added a Caveats section and mentioned it.
The changes for the same are available at v20240808 version attached at [1].
[1] - https://www.postgresql.org/message-id/CALDaNm1QQK_Pgx35LrJGuRxBzzYSO8rm1YGJF4w8hYc3Gm%2B5NQ%40mail.gmail.com

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Logical Replication of sequences
Next
From: Nathan Bossart
Date:
Subject: Re: [PATCH] Add crc32(text) & crc32(bytea)