Re: Pgoutput not capturing the generated columns - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Pgoutput not capturing the generated columns |
Date | |
Msg-id | CAHut+Pv50FoqfM7f-3SJ441UDzLfG9LqgggLm1Cm0DcGG6VCnQ@mail.gmail.com Whole thread Raw |
In response to | Re: Pgoutput not capturing the generated columns (Shubham Khanna <khannashubham1197@gmail.com>) |
Responses |
Re: Pgoutput not capturing the generated columns
|
List | pgsql-hackers |
Here are review comments for v15-0001 ====== doc/src/sgml/ddl.sgml nitpick - there was a comma (,) which should be a period (.) ====== .../libpqwalreceiver/libpqwalreceiver.c 1. + if (options->proto.logical.include_generated_columns && + PQserverVersion(conn->streamConn) >= 170000) + appendStringInfoString(&cmd, ", include_generated_columns 'true'"); + Should now say >= 180000 ====== src/backend/replication/pgoutput/pgoutput.c nitpick - comment wording for RelationSyncEntry.collist. ~~ 2. pgoutput_column_list_init: I found the current logic to be quite confusing. I assume the code is working OK, because AFAIK there are plenty of tests and they are all passing, but the logic seems somewhat repetitive and there are also no comments to explain it adding to my confusion. IIUC, PRIOR TO THIS PATCH: BMS field 'columns' represented the "columns of the column list" or it was NULL if there was no publication column list (and it was also NULL if the column list contained every column). IIUC NOW, WITH THIS PATCH: The BMS field 'columns' meaning is changed slightly to be something like "columns to be replicated" or NULL if all columns are to be replicated. This is almost the same thing except we are now handing the generated columns up-front, so generated columns will or won't appear in the BMS according to the "include_generated_columns" parameter. See how this is all a bit subtle which is why copious new comments are required to explain it... So, although the test result evidence suggests this is working OK, I have many questions/issues about it. Here are some to start with: 2a. It needs a lot more (summary and detailed) comments explaining the logic now that the meaning is slightly different. 2b. What is the story with the FOR ALL TABLES case now? Previously, there would always be NULL 'columns' for "FOR ALL TABLES" case -- the comment still says so. But now you've tacked on a 2nd pass of iterations to build the BMS outside of the "if (!pub->alltables)" check. Is that OK? 2c. The following logic seemed unexpected: - if (bms_num_members(cols) == nliveatts) + if (bms_num_members(cols) == nliveatts && + data->include_generated_columns) { bms_free(cols); cols = NULL; ` I had thought the above code would look different -- more like: if (att->attgenerated && !data->include_generated_columns) continue; nliveatts++; ... 2d. Was so much duplicated code necessary? It feels like the whole "Get the number of live attributes." and assignment of cols to NULL might be made common to both code paths. 2e. I'm beginning to question the pros/cons of the new BMS logic; I had suggested trying this way (processing the generated columns up-front in the BMS 'columns' list) to reduce patch code and simplify all the subsequent API delegation of "include_generated_cloumns" everywhere like it was in v14-0001. Indeed, that part was a success and the patch is now smaller. But I don't like much that we've traded reduced code overall for increased confusing code in that BMS function. If all this BMS code can be refactored and commented to be easier to understand then maybe all will be well, but if it can't then maybe this BMS change was a bridge too far. I haven't given up on it just yet, but I wonder what was your opinion about it, and do other people have thoughts about whether this was the good direction to take? ====== src/bin/pg_dump/pg_dump.c 3. + if (fout->remoteVersion >= 170000) + appendPQExpBufferStr(query, + " s.subincludegencols\n"); + else + appendPQExpBufferStr(query, + " false AS subincludegencols\n"); Should now say >= 180000 ====== src/bin/psql/describe.c 4. + /* include_generated_columns is only supported in v18 and higher */ + if (pset.sversion >= 170000) + appendPQExpBuffer(&buf, + ", subincludegencols AS \"%s\"\n", + gettext_noop("Include generated columns")); + Should now say >= 180000 ====== src/include/catalog/pg_subscription.h nitpick - let's make the comment the same as in WalRcvStreamOptions ====== src/include/replication/logicalproto.h nitpick - extern for logicalrep_write_update should be unchanged by this patch ====== src/test/regress/sql/subscription.sql nitpick = the comment "include_generated_columns and copy_data = true are mutually exclusive" is not necessary because this all falls under the existing comment "fail - invalid option combinations" nitpick - let's explicitly put "copy_data = true" in the CREATE SUBSCRIPTION to make it more obvious ====== 99. Please also refer to the attached 'diffs' patch which implements all of my nitpicks issues mentioned above. ====== Kind Regards, Peter Smith. Fujitsu Australia
Attachment
pgsql-hackers by date: