Re: Pgoutput not capturing the generated columns - Mailing list pgsql-hackers
From | Shubham Khanna |
---|---|
Subject | Re: Pgoutput not capturing the generated columns |
Date | |
Msg-id | CAHv8RjKOvqs8Ouodcd+jBpgGinVffqqZxAVzzbZBrt4GkDcR7w@mail.gmail.com Whole thread Raw |
In response to | Re: Pgoutput not capturing the generated columns (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
On Thu, Oct 10, 2024 at 10:53 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Here are some comments for TAP test patch v37-0003. > > I’m not in favour of the removal of such a large number of > 'combination' and other 'misc' tests. In the commit message, please > delete me as a "co-author" of this patch. > > ====== > > 1. > Any description or comment that still mentions "all combinations" is > no longer valid: > > (e.g. in the comment message) > Add tests for all combinations of generated column replication. > > (e.g. in the test file) > # The following test cases exercise logical replication for all combinations > # where there is a generated column on one or both sides of pub/sub: > > and > > # Furthermore, all combinations are tested using: > > ====== > 2. > +# -------------------------------------------------- > +# Testcase: generated -> normal > +# Publisher table has generated column 'b'. > +# Subscriber table has normal column 'b'. > +# -------------------------------------------------- > + > > Now that COPY for generated columns is already implemented in patch > 0001, shouldn't this test be using 'copy_data' enabled, so it can test > replication both for initial tablesync as well as normal replication? > > That was the whole point of having the "# XXX copy_data=false for now. > This will be changed later." reminder comment in this file. > > ====== > > 3. > Previously there were some misc tests to ensure that a generated > column which was then altered using DROP EXPRESSION would work as > expected. The test scenario was commented like: > > +# ============================================================================= > +# Misc test. > +# > +# A "normal -> generated" replication fails, reporting an error that the > +# subscriber side column is missing. > +# > +# In this test case we use DROP EXPRESSION to change the subscriber generated > +# column into a normal column, then verify replication works ok. > +# ============================================================================= > > Now in patch v37 this test no longer exists. Why? > > ====== > 4. > +# ============================================================================= > +# The following test cases demonstrate behavior of generated column replication > +# when publish_generated_colums=false/true: > +# > +# Test: column list includes gencols, when publish_generated_columns=false > +# Test: column list does not include gencols, when > publish_generated_columns=false > +# > +# Test: column list includes gencols, when publish_generated_columns=true > +# Test: column list does not include gencols, when > publish_generated_columns=true > +# Test: no column list, when publish_generated_columns=true > +# ============================================================================= > > These tests are currently only testing the initial tablesync > replication. Since the COPY logic is different from the normal > replication logic, I think it would be better to test some normal > replication records as well, to make sure both parts work > consistently. This comment applies to all of the following test cases. > > ~~~ > > 5. > +# Create table and publications. > +$node_publisher->safe_psql( > + 'postgres', qq( > + CREATE TABLE nogen_to_gen3 (a int, b int, gen1 int GENERATED ALWAYS > AS (a * 2) STORED, gen2 int GENERATED ALWAYS AS (a * 2) STORED); > + CREATE TABLE nogen_to_gen4 (c int, d int, gen1 int GENERATED ALWAYS > AS (c * 2) STORED, gen2 int GENERATED ALWAYS AS (c * 2) STORED); > + INSERT INTO nogen_to_gen3 VALUES (1, 1); > + INSERT INTO nogen_to_gen4 VALUES (1, 1); > + CREATE PUBLICATION pub1 FOR table nogen_to_gen3, nogen_to_gen4(gen1) > WITH (publish_generated_columns=true); > +)); > + > > 5a. > The code should do only what the comments say it does. So, the INSERTS > should be done separately after the CREATE PUBLICATION, but before the > CREATE SUBSCRIPTION. A similar change should be made for all of these > test cases. > > # Insert some initial data > INSERT INTO nogen_to_gen3 VALUES (1, 1); > INSERT INTO nogen_to_gen4 VALUES (1, 1); > > ~ > > 5b. > The tables are badly named. Why are they 'nogen_to_gen', when the > publisher side has generated cols and the subscriber side does not? > This problem seems repeated in multiple subsequent test cases. > > ~ > > 6. > +$result = $node_subscriber->safe_psql('postgres', > + "SELECT * FROM gen_to_nogen ORDER BY a"); > +is($result, qq(1|1||2), > + 'gen_to_nogen initial sync, when publish_generated_columns=false'); > + > +$result = $node_subscriber->safe_psql('postgres', > + "SELECT * FROM gen_to_nogen2 ORDER BY c"); > +is($result, qq(1|1||), > + 'gen_to_nogen2 initial sync, when publish_generated_columns=false'); > > IMO all the "result" queries like these ones ought to have to have a > comment which explains the reason for the expected results. This > review comment applies to multiple places. Please add comments to all > of them. > > ~~~ > > 7. > +# -------------------------------------------------- > +# Testcase: Publisher replicates the column list data excluding generated > +# columns even though publish_generated_columns option is false. > +# -------------------------------------------------- > + > > 7a. > This is the 2nd test case, but AFAICT it would be far easier to test > this scenario just by making another table (with an appropriate column > list) for the 1st test case. > > ~ > > 7b. > BTW, I don't understand this test at all. I thought according to the > comment that it intended to use a publication column list with only > normal columns in it. But that is not what the publication looks like > here: > + CREATE PUBLICATION pub1 FOR table nogen_to_gen, nogen_to_gen2(gen1) > WITH (publish_generated_columns=false); > > Indeed, the way it is currently written I didn't see what this test is > doing that is any different from the prior test (???) > > ~~~ > > 8. > +# -------------------------------------------------- > +# Testcase: Although publish_generated_columns is true, publisher publishes > +# only the data of the columns specified in column list, skipping other > +# generated/non-generated columns. > +# -------------------------------------------------- > > versus > > +# -------------------------------------------------- > +# Testcase: Publisher publishes only the data of the columns specified in > +# column list skipping other generated/non-generated columns. > +# -------------------------------------------------- > > Again, I did not understand how these test cases differ from each > other. Surely, those can be combined easily enough just by adding > another table with a different kind of column list. > > ~~~ > > 9. > +# -------------------------------------------------- > +# Testcase: Publisher replicates all columns if publish_generated_columns is > +# enabled and there is no column list > +# -------------------------------------------------- > + > > Here is yet another test case that AFAICT can just be combined with > other test cases that were using publish_generated_columns=true. It > seems all you need is one extra table with no column list. You don't > need all the extra create/drop pub/sub overheads to test this. > > ====== I have fixed all the comments and posted the v39 patches for them. Please refer to the updated v39 Patches here in [1]. See [1] for the changes added. [1] https://www.postgresql.org/message-id/CAHv8RjLjb%2B98i5ZQUphivxdOZ3hSGLfq2SiWQetUvk8zGyAQwQ%40mail.gmail.com Thanks and Regards, Shubham Khanna.
pgsql-hackers by date: