Re: Pgoutput not capturing the generated columns - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Pgoutput not capturing the generated columns
Date
Msg-id CAHut+Ps43FY8zK1x1826x3cwwCTMcXYjrjwqEdT2LuT3D9zChw@mail.gmail.com
Whole thread Raw
In response to Re: Pgoutput not capturing the generated columns  (Shlok Kyal <shlok.kyal.oss@gmail.com>)
Responses Re: Pgoutput not capturing the generated columns
List pgsql-hackers
Hi, here are some review comments for patch v9-0002.

======
src/backend/replication/logical/relation.c

1. logicalrep_rel_open

- if (attr->attisdropped)
+ if (attr->attisdropped ||
+ (!MySubscription->includegencols && attr->attgenerated))

You replied to my question from the previous review [1, #2] as follows:
In case 'include_generated_columns' is 'true'. column list in
remoterel will have an entry for generated columns. So, in this case
if we skip every attr->attgenerated, we will get a missing column
error.

~

TBH, the reason seems very subtle to me. Perhaps that
"(!MySubscription->includegencols && attr->attgenerated))" condition
should be coded as a separate "if", so then you can include a comment
similar to your answer, to explain it.

======
src/backend/replication/logical/tablesync.c

make_copy_attnamelist:

NITPICK - punctuation in function comment
NITPICK - add/reword some more comments
NITPICK - rearrange comments to be closer to the code they are commenting

~

2. make_copy_attnamelist.

+ /*
+ * Construct column list for COPY.
+ */
+ for (int i = 0; i < rel->remoterel.natts; i++)
+ {
+ if(!gencollist[i])
+ attnamelist = lappend(attnamelist,
+   makeString(rel->remoterel.attnames[i]));
+ }

IIUC isn't this assuming that the attribute number (aka column order)
is the same on the subscriber side (e.g. gencollist idx) and on the
publisher side (e.g. remoterel.attnames[i]).  AFAIK logical
replication does not require this ordering must be match like that,
therefore I am suspicious this new logic is accidentally imposing new
unwanted assumptions/restrictions. I had asked the same question
before [1-#4] about this code, but there was no reply.

Ideally, there would be more test cases for when the columns
(including the generated ones) are all in different orders on the
pub/sub tables.

~~~

3. General - varnames.

It would help with understanding if the 'attgenlist' variables in all
these functions are re-named to make it very clear that this is
referring to the *remote* (publisher-side) table genlist, not the
subscriber table genlist.

~~~

4.
+ int i = 0;
+
  appendStringInfoString(&cmd, "COPY (SELECT ");
- for (int i = 0; i < lrel.natts; i++)
+ foreach_ptr(ListCell, l, attnamelist)
  {
- appendStringInfoString(&cmd, quote_identifier(lrel.attnames[i]));
- if (i < lrel.natts - 1)
+ appendStringInfoString(&cmd, quote_identifier(strVal(l)));
+ if (i < attnamelist->length - 1)
  appendStringInfoString(&cmd, ", ");
+ i++;
  }

4a.
I think the purpose of the new macros is to avoid using ListCell, and
also 'l' is an unhelpful variable name. Shouldn't this code be more
like:
foreach_node(String, att_name, attnamelist)

~

4b.
The code can be far simpler if you just put the comma (", ") always
up-front except the *first* iteration, so you can avoid checking the
list length every time. For example:

if (i++)
  appendStringInfoString(&cmd, ", ");

======
src/test/subscription/t/011_generated.pl

5. General.

Hmm. This patch 0002 included many formatting changes to tables tab2
and tab3 and subscriptions sub2 and sub3 but they do not belong here.
The proper formatting for those needs to be done back in patch 0001
where they were introduced. Patch 0002 should just concentrate only on
the new stuff for patch 0002.

~

6. CREATE TABLES would be better in pairs

IMO it will be better if the matching CREATE TABLE for pub and sub are
kept together, instead of separating them by doing all pub then all
sub. I previously made the same comment for patch 0001, so maybe it
will be addressed next time...

~

7. SELECT *

+$result =
+  $node_subscriber->safe_psql('postgres', "SELECT * FROM tab4 ORDER BY a");

It will be prudent to do explicit "SELECT a,b,c" instead of "SELECT
*", for readability and so there are no surprises.

======

99.
Please also refer to my attached nitpicks diff for numerous cosmetic
changes, and apply if you agree.

======
[1] https://www.postgresql.org/message-id/CAHut%2BPtAsEc3PEB1KUk1kFF5tcCrDCCTcbboougO29vP1B4E2Q%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add pg_get_acl() function get the ACL for a database object
Next
From: Tom Lane
Date:
Subject: Re: Add pg_get_acl() function get the ACL for a database object