Re: logical decoding and replication of sequences - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: logical decoding and replication of sequences
Date
Msg-id 9defb937-20a9-1d9f-6972-f1c4b4da4f73@enterprisedb.com
Whole thread Raw
In response to Re: logical decoding and replication of sequences  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: logical decoding and replication of sequences  (Amit Kapila <amit.kapila16@gmail.com>)
Re: logical decoding and replication of sequences  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-hackers
Hi,

Here's an updated version of the patch, fixing most of the issues from
reviews so far. There's still a couple FIXME comments, but I think those
are minor and/or straightforward to deal with.

The main improvements:

1) Elimination of a lot of redundant code - one function handling
tables, and an almost exact copy handling sequences. Now a single
function handles both, possibly with "sequences" flag to tweak the behavior.

2) A couple other functions gained "sequences" flag, determining which
objects are "interesting". For example PublicationAddSchemas needs to
know whether it's FOR ALL SEQUENCES or FOR ALL TABLES IN SCHEMA. I don't
think we can just use relkind here easily, because for tables we need to
handle various types of tables (regular, partitioned, ...).

3) I also renamed a couple functions with "tables" in the name, which
are now used for sequences too. So for example OpenTablesList() is
renamed to OpenRelationList() and so on.

4) Addition of a number of regression tests to "publication.sql", which
showed a lot of issues, mostly related to not distinguishing tables and
sequences when handling "FOR ALL TABLES [IN SCHEMA]" and "FOR ALL
SEQUENCES [IN SCHEMA]".

5) Proper tracking of FOR ALL [TABLES|SEQUENCES] IN SCHEMA in a catalog.
The pg_publication_namespace gained a pnsequences flag, which determines
which case it is. So for example if you do

  ALTER PUBLICATION p ADD ALL TABLES IN SCHEMA s;
  ALTER PUBLICATION p ADD ALL SEQUENCES IN SCHEMA s;

there will be two rows in the catalog, one with 't' and one with 'f' in
the new column. I'm not sure this is the best way to track this - maybe
it'd be better to have two flags, and keep a single row. Or maybe we
should have an array of relkinds (but that has the issue with tables
having multiple relkinds mentioned before). Multiple rows make it more
convenient to add/remove publication schemas - with a single row it'd be
necessary to either insert a new row or update an existing one when
adding the schema, and similarly for dropping it.

But maybe there are reasons / precedent to design this differently ...

6) I'm not entirely sure the object_address changes (handling of the
pnsequences flag) are correct.

7) This updates psql to do roughly the same thing as for tables, so \dRp
now list sequences added either directly or through schema, so you might
get footer like this:

  \dRp+ testpub_mix
  ...
  Tables:
      "public.testpub_tbl1"
  Tables from schemas:
      "pub_test"
  Sequences:
      "public.testpub_seq1"
  Sequences from schemas:
      "pub_test"

Maybe it's a bit too verbose, though. It also addes "All sequences" and
"Sequences" columns into the publication description, but I don't think
that can be done much differently.

FWIW I had to switch the describeOneTableDetails() chunk dealing with
sequences from printQuery() to printTable() in order to handle dynamic
footers.

There's also a change in \dn+ because a schema may be included in one
publication as "FOR ALL SEQUENCES IN SCHEMA" and in another publication
with "FOR ALL TABLES IN SCHEMA". So I modified the output to

  \dn+ pub_test1
  ...
  Publications:
      "testpub_schemas" (sequences)
      "testpub_schemas" (tables)

But maybe it'd be better to aggregate this into a single line like

  \dn+ pub_test1
  ...
  Publications:
      "testpub_schemas" (tables, sequences)

Opinions?

8) There's a shortcoming in the current grammar, because you can specify
either

  CREATE PUBLICATION p FOR ALL TABLES;

or

  CREATE PUBLICATION p FOR ALL SEQUENCES;

but it's not possible to do

  CREATE PUBLICATION p FOR ALL TABLES AND FOR ALL SEQUENCES;

which seems like a fairly reasonable thing users might want to do.

The problem is that "FOR ALL TABLES" (and same for sequences) is
hard-coded in the grammar, not defined as PublicationObjSpec. This also
means you can't do

  ALTER PUBLICATION p ADD ALL TABLES;

AFAICS there are two ways to fix this - adding the combinations into the
definition of CreatePublicationStmt, or adding FOR ALL TABLES (and
sequences) to PublicationObjSpec.

9) Another grammar limitation is that we don't cross-check the relkind,
so for example

  ALTER PUBLICATION p ADD TABLE sequence;

might actually work. Should be easy to fix, though.

10) Added pg_dump support (including tests). I'll add more tests, to
check more grammar combinations.

11) I need to test more grammar combinations in the TAP test too, to
verify the output plugin interprets the stuff correctly.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Frontend error logging style
Next
From: Tomas Vondra
Date:
Subject: Re: logical decoding and replication of sequences