Re: pg_upgrade and logical replication - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: pg_upgrade and logical replication |
Date | |
Msg-id | CAHut+PtWdShmS9iF6OWM_O6Z_7yijzHiHdwP58cZVuEEs2Y9Qg@mail.gmail.com Whole thread Raw |
In response to | Re: pg_upgrade and logical replication (vignesh C <vignesh21@gmail.com>) |
Responses |
Re: pg_upgrade and logical replication
|
List | pgsql-hackers |
Here are some review comments for patch v11-0001 ====== Commit message 1. The subscription's replication origin are needed to ensure that we don't replicate anything twice. ~ /are needed/is needed/ ~~~ 2. Author: Julien Rouhaud Reviewed-by: FIXME Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud ~ Include Vignesh as another author. ====== doc/src/sgml/ref/pgupgrade.sgml 3. + <application>pg_upgrade</application> attempts to migrate subscription + dependencies which includes the subscription tables information present in + <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link> + system table and the subscription replication origin which + will help in continuing logical replication from where the old subscriber + was replicating. This helps in avoiding the need for setting up the I became a bit lost reading paragraph due to the multiple 'which'... SUGGESTION pg_upgrade attempts to migrate subscription dependencies which includes the subscription table information present in pg_subscription_rel system catalog and also the subscription replication origin. This allows logical replication on the new subscriber to continue from where the old subscriber was up to. ~~~ 4. + was replicating. This helps in avoiding the need for setting up the + subscription objects manually which requires truncating all the + subscription tables and setting the logical replication slots. Migration SUGGESTION Having the ability to migrate subscription objects avoids the need to set them up manually, which would require truncating all the subscription tables and setting the logical replication slots. ~ TBH, I am wondering what is the purpose of this sentence. It seems more like a justification for the patch, but does the user need to know all this? ~~~ 5. + <para> + All the subscription tables in the old subscriber should be in + <literal>i</literal> (initialize), <literal>r</literal> (ready) or + <literal>s</literal> (synchronized). This can be verified by checking + <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>. + </para> /should be in/should be in state/ ~~~ 6. + <para> + The replication origin entry corresponding to each of the subscriptions + should exist in the old cluster. This can be checking + <link linkend="catalog-pg-subscription">pg_subscription</link> and + <link linkend="catalog-pg-replication-origin">pg_replication_origin</link> + system tables. + </para> missing words? /This can be checking/This can be found by checking/ ~~~ 7. + <para> + The subscriptions will be migrated to new cluster in disabled state, they + can be enabled after upgrade by following the steps: + </para> The first bullet also says "Enable the subscription..." so I think this paragraph should be worded like the below. SUGGESTION The subscriptions will be migrated to the new cluster in a disabled state. After migration, do this: ====== src/backend/catalog/pg_subscription.c 8. #include "nodes/makefuncs.h" +#include "replication/origin.h" +#include "replication/worker_internal.h" #include "storage/lmgr.h" Why does this change need to be in the patch when there are no other code changes in this file? ====== src/backend/utils/adt/pg_upgrade_support.c 9. binary_upgrade_create_sub_rel_state IMO a better name for this function would be 'binary_upgrade_add_sub_rel_state' (because it delegates to AddSubscriptionRelState). Then it would obey the same name pattern as the other function 'binary_upgrade_replorigin_advance' (which delegates to replorigin_advance). ~~~ 10. +/* + * binary_upgrade_create_sub_rel_state + * + * Add the relation with the specified relation state to pg_subscription_rel + * table. + */ +Datum +binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS) +{ + Relation rel; + HeapTuple tup; + Oid subid; + Form_pg_subscription form; + char *subname; + Oid relid; + char relstate; + XLogRecPtr sublsn; 10a. /to pg_subscription_rel table./to pg_subscription_rel catalog./ ~ 10b. Maybe it would be helpful if the function argument were documented up-front in the function-comment, or in the variable declarations. SUGGESTION char *subname; /* ARG0 = subscription name */ Oid relid; /* ARG1 = relation Oid */ char relstate; /* ARG2 = subrel state */ XLogRecPtr sublsn; /* ARG3 (optional) = subscription lsn */ ~~~ 11. if (PG_ARGISNULL(3)) sublsn = InvalidXLogRecPtr; else sublsn = PG_GETARG_LSN(3); FWIW, I'd write that as a one-line ternary assignment allowing all the args to be grouped nicely together. SUGGESTION sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3); ~~~ 12. binary_upgrade_replorigin_advance /* * binary_upgrade_replorigin_advance * * Update the remote_lsn for the subscriber's replication origin. */ Datum binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS) { Relation rel; HeapTuple tup; Oid subid; Form_pg_subscription form; char *subname; XLogRecPtr sublsn; char originname[NAMEDATALEN]; RepOriginId originid; ~ Similar to previous comment #10b. Maybe it would be helpful if the function argument were documented up-front in the function-comment, or in the variable declarations. SUGGESTION char originname[NAMEDATALEN]; RepOriginId originid; char *subname; /* ARG0 = subscription name */ XLogRecPtr sublsn; /* ARG1 = subscription lsn */ ~~~ 13. + subname = text_to_cstring(PG_GETARG_TEXT_PP(0)); + + if (PG_ARGISNULL(1)) + sublsn = InvalidXLogRecPtr; + else + sublsn = PG_GETARG_LSN(1); Similar to previous comment #11. FWIW, I'd write that as a one-line ternary assignment allowing all the args to be grouped nicely together. SUGGESTION subname = text_to_cstring(PG_GETARG_TEXT_PP(0)); sublsn = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1); ====== src/bin/pg_dump/pg_dump.c 14. getSubscriptionTables +/* + * getSubscriptionTables + * get information about subscription membership for dumpable tables, this + * will be used only in binary-upgrade mode. + */ Should use multiple sentences. SUGGESTION Get information about subscription membership for dumpable tables. This will be used only in binary-upgrade mode. ~~~ 15. + /* Get subscription relation fields */ + i_srsubid = PQfnumber(res, "srsubid"); + i_srrelid = PQfnumber(res, "srrelid"); + i_srsubstate = PQfnumber(res, "srsubstate"); + i_srsublsn = PQfnumber(res, "srsublsn"); Might it be better to say "Get pg_subscription_rel attributes"? ~~~ 16. getSubscriptions + appendPQExpBufferStr(query, "o.remote_lsn\n"); appendPQExpBufferStr(query, "FROM pg_subscription s\n" + "LEFT JOIN pg_replication_origin_status o \n" + " ON o.external_id = 'pg_' || s.oid::text \n" "WHERE s.subdbid = (SELECT oid FROM pg_database\n" " WHERE datname = current_database())"); ~ 16a. Should that "remote_lsn" have an alias like "suboriginremotelsn" so that it matches the later field assignment better? ~ 16b. Probably these catalogs should be qualified using "pg_catalog.". ~~~ 17. dumpSubscriptionTable +/* + * dumpSubscriptionTable + * dump the definition of the given subscription table mapping, this will be + * used only for upgrade operation. + */ Make this comment consistent with the other one for getSubscriptionTables: - split into multiple sentences - use the same terminology "binary-upgrade mode" versus "upgrade operation'. ~~~ 18. + /* + * binary_upgrade_create_sub_rel_state will add the subscription + * relation to pg_subscripion_rel table, this is supported only for + * upgrade operation. + */ Split into multiple sentences. ====== src/bin/pg_dump/pg_dump_sort.c 19. + case DO_SUBSCRIPTION_REL: + snprintf(buf, bufsize, + "SUBSCRIPTION TABLE (ID %d)", + obj->dumpId); + return; Should it include the OID (like for DO PUBLICATION_TABLE)? ====== src/bin/pg_upgrade/check.c 20. check_for_reg_data_type_usage(&old_cluster); check_for_isn_and_int8_passing_mismatch(&old_cluster); + check_for_subscription_state(&old_cluster); + There seems no reason anymore for this check to be separated from all the other checks. Just remove the blank line. ~~~ 21. check_for_subscription_state +/* + * check_for_subscription_state() + * + * Verify that each of the subscriptions have all their corresponding tables in + * ready state. + */ +static void +check_for_subscription_state(ClusterInfo *cluster) /have/has/ This comment only refers to 'ready' state, but perhaps it is misleading (or not entirely correct) because later the SQL is testing for more than just the READY state: + "WHERE srsubstate NOT IN ('i', 's', 'r') " ~~~ 22. + res = executeQueryOrDie(conn, + "SELECT s.subname, c.relname, n.nspname " + "FROM pg_catalog.pg_subscription_rel r " + "LEFT JOIN pg_catalog.pg_subscription s" + " ON r.srsubid = s.oid " + "LEFT JOIN pg_catalog.pg_class c" + " ON r.srrelid = c.oid " + "LEFT JOIN pg_catalog.pg_namespace n" + " ON c.relnamespace = n.oid " + "WHERE srsubstate NOT IN ('i', 's', 'r') " + "ORDER BY s.subname"); If you are going to check 'i', 's', and 'r' then I thought this statement should maybe have some comment about why those states. ~~~ 23. + pg_fatal("Your installation contains subscription(s) with\n" + "Subscription not having origin and/or subscription relation(s) not in ready state.\n" + "A list of subscription not having origin and/or\n" + "subscription relation(s) not in ready state is in the file: %s", + output_path); 23a. This message seems to just be saying the same thing 2 times. Is also should use newlines and spaces more like the other similar pg_patals in this file (e.g. the %s is on next line etc). SUGGESTION Your installation contains subscriptions without origin or having relations not in a ready state.\n A list of the problem subscriptions is in the file:\n %s ~ 23b. Same question about 'not in ready state'. Is that entirely correct? ====== src/bin/pg_upgrade/t/004_subscription.pl 24. +sub insert_line +{ + my $payload = shift; + + foreach ("t1", "t2") + { + $publisher->safe_psql('postgres', + "INSERT INTO " . $_ . " (val) VALUES('$payload')"); + } +} For clarity, maybe call this function 'insert_line_at_pub' ~~~ 25. +# ------------------------------------------------------ +# Check that pg_upgrade is succesful when all tables are in ready state. +# ------------------------------------------------------ /succesful/successful/ ~~~ 26. +command_ok( + [ + 'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir, + '-D', $new_sub->data_dir, '-b', $bindir, + '-B', $bindir, '-s', $new_sub->host, + '-p', $old_sub->port, '-P', $new_sub->port, + $mode, '--check', + ], + 'run of pg_upgrade --check for old instance with invalid remote_lsn'); This is the command for the "success" case. Why is the message part referring to "invalid remote_lsn"? ~~~ 27. +$publisher->safe_psql('postgres', + "CREATE TABLE tab_primary_key(id serial, val text);"); +$old_sub->safe_psql('postgres', + "CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);"); +$publisher->safe_psql('postgres', Maybe it is not necessary, but won't it be better if the publisher table also has a primary key (so DDL matches its table name)? ~~~ 28. +# Add a row in subscriber so that the table sync will fail. +$old_sub->safe_psql('postgres', + "INSERT INTO tab_primary_key values(1, 'before initial sync')"); The comment should be slightly more descriptive by saying the reason it will fail is that you deliberately inserted the same PK value again. ~~~ 29. +my $started_query = + "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';"; +$old_sub->poll_query_until('postgres', $started_query) + or die "Timed out while waiting for subscriber to synchronize data"; Since this cannot synchronize the table data, maybe the message should be more like "Timed out while waiting for the table state to become 'd' (datasync)" ~~~ 30. +command_fails( + [ + 'pg_upgrade', '--no-sync', '-d', $old_sub->data_dir, + '-D', $new_sub->data_dir, '-b', $bindir, + '-B', $bindir, '-s', $new_sub->host, + '-p', $old_sub->port, '-P', $new_sub->port, + $mode, '--check', + ], + 'run of pg_upgrade --check for old instance with incorrect sub rel'); /with incorrect sub rel/with incorrect sub rel state/ (??) ~~~ 31. +# ------------------------------------------------------ +# Check that pg_upgrade doesn't detect any problem once all the subscription's +# relation are in 'r' (ready) state. +# ------------------------------------------------------ 31a. /relation/relations/ ~ 31b. Do you think that comment is correct? All you are doing here is allowing the old_sub to proceed because there is no longer any conflict -- but isn't that just normal pub/sub behaviour that has nothing to do with pg_upgrade? ~~~ 32. +# Stop the old subscriber, insert a row in each table while it's down and add +# t2 to the publication /in each table/in each publisher table/ Also, it is not each table -- it's only t1 and t2; not tab_primary_key. ~~~ 33. + $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel"); +is($result, qq(2), "There should be 2 rows in pg_subscription_rel"); /2 rows in pg_subscription_rel/2 rows in pg_subscription_rel (representing t1 and tab_primary_key)/ ====== 34. binary_upgrade_create_sub_rel_state +{ oid => '8404', descr => 'for use by pg_upgrade (relation for pg_subscription_rel)', + proname => 'binary_upgrade_create_sub_rel_state', proisstrict => 'f', + provolatile => 'v', proparallel => 'u', prorettype => 'void', + proargtypes => 'text oid char pg_lsn', + prosrc => 'binary_upgrade_create_sub_rel_state' }, As mentioned in a previous review comment #9, I felt this function should have a different name: binary_upgrade_add_sub_rel_state. ====== Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: