Re: Pgoutput not capturing the generated columns - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Pgoutput not capturing the generated columns
Date
Msg-id CAHut+PuYOE3C_uv1MAD+k7K7wO0Aj1ffOtRAAeoy4FrhD-2T9g@mail.gmail.com
Whole thread Raw
In response to Re: Pgoutput not capturing the generated columns  (Shlok Kyal <shlok.kyal.oss@gmail.com>)
List pgsql-hackers
Here are some review comments for the patch v10-0002.

======
Commit Message

1.
Note that we don't copy columns when the subscriber-side column is also
generated. Those will be filled as normal with the subscriber-side computed or
default data.

~

Now this patch also introduced some errors etc, so I think that patch
comment should be written differently to explicitly spell out
behaviour of every combination, something like the below:

Summary

when (include_generated_column = true)

* publisher not-generated column => subscriber not-generated column:
This is just normal logical replication (not changed by this patch).

* publisher not-generated column => subscriber generated column: This
will give ERROR.

* publisher generated column => subscriber not-generated column: The
publisher generated column value is copied.

* publisher generated column => subscriber generated column: The
publisher generated column value is not copied. The subscriber
generated column will be filled with the subscriber-side computed or
default data.

when (include_generated_columns = false)

* publisher not-generated column => subscriber not-generated column:
This is just normal logical replication (not changed by this patch).

* publisher not-generated column => subscriber generated column: This
will give ERROR.

* publisher generated column => subscriber not-generated column: This
will replicate nothing. Publisher generate-column is not replicated.
The subscriber column will be filled with the subscriber-side default
data.

* publisher generated column => subscriber generated column:  This
will replicate nothing. Publisher generate-column is not replicated.
The subscriber generated column will be filled with the
subscriber-side computed or default data.

======
src/backend/replication/logical/relation.c

2.
logicalrep_rel_open:

I tested some of the "missing column" logic, and got the following results:

Scenario A:
PUB
test_pub=# create table t2(a int, b int);
test_pub=# create publication pub2 for table t2;
SUB
test_sub=# create table t2(a int, b int generated always as (a*2) stored);
test_sub=# create subscription sub2 connection 'dbname=test_pub'
publication pub2 with (include_generated_columns = false);
Result:
ERROR:  logical replication target relation "public.t2" is missing
replicated column: "b"

~

Scenario B:
PUB/SUB identical to above, but subscription sub2 created "with
(include_generated_columns = true);"
Result:
ERROR:  logical replication target relation "public.t2" has a
generated column "b" but corresponding column on source relation is
not a generated column

~~~

2a. Question

Why should we get 2 different error messages for what is essentially
the same problem according to whether the 'include_generated_columns'
is false or true? Isn't the 2nd error message the more correct and
useful one for scenarios like this involving generated columns?

Thoughts?

~

2b. Missing tests?

I also noticed there seems no TAP test for the current "missing
replicated column" message. IMO there should be a new test introduced
for this because the loop involved too much bms logic to go
untested...

======
src/backend/replication/logical/tablesync.c

make_copy_attnamelist:
NITPICK - minor comment tweak
NITPICK - add some spaces after "if" code

3.
Should you pfree the gencollist at the bottom of this function when
you no longer need it, for tidiness?

~~~

4.
 static void
-fetch_remote_table_info(char *nspname, char *relname,
+fetch_remote_table_info(char *nspname, char *relname, bool **remotegenlist,
  LogicalRepRelation *lrel, List **qual)
 {
  WalRcvExecResult *res;
  StringInfoData cmd;
  TupleTableSlot *slot;
  Oid tableRow[] = {OIDOID, CHAROID, CHAROID};
- Oid attrRow[] = {INT2OID, TEXTOID, OIDOID, BOOLOID};
+ Oid attrRow[] = {INT2OID, TEXTOID, OIDOID, BOOLOID, BOOLOID};
  Oid qualRow[] = {TEXTOID};
  bool isnull;
+ bool    *remotegenlist_res;

IMO the names 'remotegenlist' and 'remotegenlist_res' should be
swapped the other way around, because it is the function parameter
that is the "result", whereas the 'remotegenlist_res' is just the
local working var for it.

~~~

5. fetch_remote_table_info

Now walrcv_server_version(LogRepWorkerWalRcvConn) is used in multiple
places, I think it will be better to assign this to a 'server_version'
variable to be used everywhere instead of having multiple function
calls.

~~~

6.
  "SELECT a.attnum,"
  "       a.attname,"
  "       a.atttypid,"
- "       a.attnum = ANY(i.indkey)"
+ "       a.attnum = ANY(i.indkey),"
+ " a.attgenerated != ''"
  "  FROM pg_catalog.pg_attribute a"
  "  LEFT JOIN pg_catalog.pg_index i"
  "       ON (i.indexrelid = pg_get_replica_identity_index(%u))"
  " WHERE a.attnum > 0::pg_catalog.int2"
- "   AND NOT a.attisdropped %s"
+ "   AND NOT a.attisdropped", lrel->remoteid);
+
+ if ((walrcv_server_version(LogRepWorkerWalRcvConn) >= 120000 &&
+ walrcv_server_version(LogRepWorkerWalRcvConn) <= 160000) ||
+ !MySubscription->includegencols)
+ appendStringInfo(&cmd, " AND a.attgenerated = ''");
+

If the server version is < PG12 then AFAIK there was no such thing as
"a.attgenerated", so shouldn't that SELECT " a.attgenerated != ''"
part also be guarded by some version checking condition like in the
WHERE? Otherwise won't it cause an ERROR for old servers?

~~~

7.
  /*
- * For non-tables and tables with row filters, we need to do COPY
- * (SELECT ...), but we can't just do SELECT * because we need to not
- * copy generated columns. For tables with any row filters, build a
- * SELECT query with OR'ed row filters for COPY.
+ * For non-tables and tables with row filters and when
+ * 'include_generated_columns' is specified as 'true', we need to do
+ * COPY (SELECT ...), as normal COPY of generated column is not
+ * supported. For tables with any row filters, build a SELECT query
+ * with OR'ed row filters for COPY.
  */

NITPICK. I felt this was not quite right. AFAIK the reasons for using
this COPY (SELECT ...) syntax is different for row-filters and
generated-columns. Anyway, I updated the comment slightly in my
nitpicks attachment. Please have a look at it to see if you agree with
the suggestions. Maybe I am wrong.

~~~

8.
- for (int i = 0; i < lrel.natts; i++)
+ foreach_ptr(String, att_name, attnamelist)

I'm not 100% sure, but isn't foreach_node the macro to use here,
rather than foreach_ptr?
======
src/test/subscription/t/011_generated.pl

9.
Please discuss with Shubham how to make all the tab1, tab2, tab3,
tab4, tab5, tab6 comments use the same kind of style/wording.
Currently, the patches 0001 and 0002 test comments are a bit
inconsistent.

~~~

10.
Related to above -- now that patch 0002 supports copy_data=true I
don't see why we need to test generated columns *both* for
copy_data=false and also for copy_data=true. IOW, is it really
necessary to have so many tables/tests? For example, I am thinking
some of those tests from patch 0001 can be re-used or just removed now
that copy_data=true works.

~~~

NITPICK - minor comment tweak

~~~

11.
For tab4 and tab6 I saw the initial sync and normal replication data
tests are all merged together, but I had expected to see the initial
sync and normal replication data tests separated so it would be
consistent with the earlier tab1, tab2, tab3 tests.

======

99.
Also, I have attached a nitpicks diff for some of the cosmetic review
comments mentioned above. Please apply whatever of these that you
agree with.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: speed up a logical replica setup
Next
From: "Fujii.Yuki@df.MitsubishiElectric.co.jp"
Date:
Subject: RE: Partial aggregates pushdown