Re: Add support for specifying tables in pg_createsubscriber. - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Add support for specifying tables in pg_createsubscriber.
Date
Msg-id CAHut+Ps72=5=AQPPrQ04mbLmFu4cnQVTCRDn9qwTbfCY6WhCGQ@mail.gmail.com
Whole thread Raw
In response to Re: Add support for specifying tables in pg_createsubscriber.  (Shubham Khanna <khannashubham1197@gmail.com>)
List pgsql-hackers
Hi Shubham,

My first impression of patch v4-0001 was that the new design is adding
unnecessary complexity by introducing a new --existing-publication
option, instead of just allowing reinterpretation of the --publication
by allowing it to name/reuse an existing publication.

e.g.

Q. What if there are multiple databases, but there's a publication on
only one of them? How can the user specify this, given that the rules
say there must be the same number of --existing-publication as
--database? The rules also say --existing-publication and
--publication cannot coexist. So what is this user supposed to do in
this scenario?

Q. What if there are multiple databases with publications, but the
user wants to use the existing publication for only one of those and
FOR ALL TABLES for the other one? The rules say there must be the same
number of --existing-publication as --database, so how can the user
achieve what they want?

~~

AFAICT, these problems don't happen if you allow a reinterpretation of
--publication switch like Euler had suggested [1]. The dry-run logs
should make it clear whether an existing publication will be reused or
not, so I'm not sure even that the validation code (checking if the
publication exists) is needed.

Finally, I think the implementation might be simpler too -- e.g. fewer
rules/docs needed, no option conflict code needed.

Anyway, maybe you disagree. But, even if you still decide an
--existing-publication option is needed, IMO the rules should be
relaxed to permit this to co-exist with the --publication option, so
that the scenarios described above can have solutions.

~

Below are some more review comments for patch v4-0001, but since I had
doubts about the basic design, I did not review it in much detail.

======
Commit message

1.
The number of existing publication names must match the number of database
names, following the same pattern as other pg_createsubscriber options.

~

This assumes that if one database has existing publications, then they
all will. That does not seem like a valid assumption.

~~~

2.
This design aligns with the principle of other PostgreSQL tools that
allow both creation of new objects and reuse of existing ones.

~

What tools? Can you provide examples? I'd like to check the design
alignment for myself.

======
doc/src/sgml/ref/pg_createsubscriber.sgml

3.
+      <para>
+       This option cannot be used together with <option>--publication</option>.
+       Use either <option>--publication</option> to create new publications
+       or <option>--existing-publication</option> to use existing ones, but
+       not both.
+      </para>

Why not? How else can a user say to use an existing publication for
one database when another database does not have any publications?

~~~

4.
+      <para>
+       The number of existing publication names specified must equal the number
+       of database names when multiple databases are being configured. Each
+       existing publication will be used for its corresponding database in
+       the order specified.
+      </para>

Ditto above. What about when the second database does not even have an
existing publication?

======
src/bin/pg_basebackup/pg_createsubscriber.c

5.
-
 /*
  * Cleanup objects that were created by pg_createsubscriber if there is an

Does this whitespace change belong be in this patch?

~~~

6a.
- if (dbinfo->made_publication)
+ if (dbinfo->made_publication && !dbinfo->is_existing_publication)

6b.
- if (dbinfo->made_publication)
+ if (dbinfo->made_publication && !dbinfo->is_existing_publication)

~

Why do you need to check both flags in these places? IIUC, it's the
publication can be "made" or "existing"; it cannot be both. Why not
just check "made"?

~~~

7.
+ if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
+ exists = true;
+ else if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ pg_fatal("could not check for publication \"%s\" in database \"%s\": %s",
+ pubname, dbname, PQerrorMessage(conn));

This logic should not need to check PQresultStatus(res) multiple times.

======
.../t/040_pg_createsubscriber.pl

8.
+# Run pg_createsubscriber with invalid --existing-publication option
+# (conflict with --publication)
+command_fails_like(

Why not allow these to coexist?

======
[1] Euler - https://www.postgresql.org/message-id/30cc34eb-07a0-4b55-b4fe-6c526886b2c4%40app.fastmail.com

Kind Regards,
Peter Smith
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: Raw parse tree is not dumped to log
Next
From: Alexandra Wang
Date:
Subject: Re: SQL:2023 JSON simplified accessor support