Re: logical decoding and replication of sequences - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: logical decoding and replication of sequences
Date
Msg-id e890448d-00b5-18d0-e894-85d17660d484@enterprisedb.com
Whole thread Raw
In response to Re: logical decoding and replication of sequences  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses Re: logical decoding and replication of sequences  (Amit Kapila <amit.kapila16@gmail.com>)
Re: logical decoding and replication of sequences  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On 2/15/22 10:00, Peter Eisentraut wrote:
> On 13.02.22 14:10, Tomas Vondra wrote:
>> Here's the remaining part, rebased, with a small tweak in the TAP test
>> to eliminate the issue with not waiting for sequence increments. I've
>> kept the tweak in a separate patch, so that we can throw it away easily
>> if we happen to resolve the issue.
> 
> This basically looks fine to me.  You have identified a few XXX and
> FIXME spots that should be addressed.
> 
> Here are a few more comments:
> 
> * general
> 
> Handling of owned sequences, as previously discussed.  This would
> probably be a localized change somewhere in get_rel_sync_entry(), so it
> doesn't affect the overall structure of the patch.
> 

So you're suggesting not to track owned sequences in pg_publication_rel
explicitly, and handle them dynamically in output plugin? So when
calling get_rel_sync_entry on the sequence, we'd check if it's owned by
a table that is replicated.

We'd want a way to enable/disable this for each publication, but that
makes the lookups more complicated. Also, we'd probably need the same
logic elsewhere (e.g. in psql, when listing sequences in a publication).

I'm not sure we want this complexity, maybe we should simply deal with
this in the ALTER PUBLICATION and track all sequences (owned or not) in
pg_publication_rel.

> pg_dump support is missing.
> 

Will fix.

> Some psql \dxxx support should probably be there.  Check where existing
> publication-table relationships are displayed.
> 

Yeah, I noticed this while adding regression tests. Currently, \dRp+
shows something like this:

                         Publication testpub_fortbl
            Owner           | All tables | Inserts | Updates  ...
  --------------------------+------------+---------+--------- ...
   regress_publication_user | f          | t       | t        ...
  Tables:
      "pub_test.testpub_nopk"
      "public.testpub_tbl1"

or

                     Publication testpub5_forschema
            Owner           | All tables | Inserts | Updates |  ...
  --------------------------+------------+---------+---------+- ...
   regress_publication_user | f          | t       | t       |  ...
  Tables from schemas:
      "CURRENT_SCHEMA"
      "public"

I wonder if we should copy the same block for sequences, so

                         Publication testpub_fortbl
            Owner           | All tables | Inserts | Updates  ...
  --------------------------+------------+---------+--------- ...
   regress_publication_user | f          | t       | t        ...
  Tables:
      "pub_test.testpub_nopk"
      "public.testpub_tbl1"
  Sequences:
      "pub_test.sequence_s1"
      "public.sequence_s2"

And same for "tables from schemas" etc.


> * src/backend/catalog/system_views.sql
> 
> +         LATERAL pg_get_publication_sequences(P.pubname) GPT,
> 
> The GPT presumably stood for "get publication tables", so should be
> changed.
> 
> (There might be a few more copy-and-paste occasions like this around.  I
> have not written down all of them.)
> 

Will fix.

> * src/backend/commands/publicationcmds.c
> 
> This adds a new publication option called "sequence".  I would have
> expected it to be named "sequences", but that's just my opinion.
> 
> But in any case, the option is not documented at all.
> 
> Some of the new functions added in this file are almost exact duplicates
> of the analogous functions for tables.  For example,
> PublicationAddSequences()/PublicationDropSequences() are almost
> exactly the same as PublicationAddTables()/PublicationDropTables(). This
> could be handled with less duplication by just adding an ObjectType
> argument to the existing functions.
> 

Yes, I noticed that too, and I'll review this later, after making sure
the behavior is correct.

> * src/backend/commands/sequence.c
> 
> Could use some refactoring of ResetSequence()/ResetSequence2().  Maybe
> call the latter ResetSequenceData() and have the former call it internally.
> 

Will check.

> * src/backend/commands/subscriptioncmds.c
> 
> Also lots of duplication between tables and sequences in this file.
> 

Same as the case above.

> * src/backend/replication/logical/tablesync.c
> 
> The comment says it needs sequence support, but there appears to be
> support for initial sequence syncing.  What exactly is missing here?
> 

I think that's just obsolete comment.

> * src/test/subscription/t/028_sequences.pl
> 
> Change to use done_testing() (see
> 549ec201d6132b7c7ee11ee90a4e02119259ba5b).

Will fix.


Do we need to handle both FOR ALL TABLES and FOR ALL SEQUENCES at the
same time? That seems like a reasonable thing people might want.

The patch probably also needs to modify pg_publication_namespace to
track whether the schema is FOR TABLES IN SCHEMA or FOR SEQUENCES.



regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Mark Wong
Date:
Subject: Re: Time to drop plpython2?
Next
From: Noah Misch
Date:
Subject: Re: Timeout control within tests