RE: speed up a logical replica setup - Mailing list pgsql-hackers
From | Hayato Kuroda (Fujitsu) |
---|---|
Subject | RE: speed up a logical replica setup |
Date | |
Msg-id | TYCPR01MB120776F5C9824042DD8BB1307F54F2@TYCPR01MB12077.jpnprd01.prod.outlook.com |
In response to | RE: speed up a logical replica setup ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
List | pgsql-hackers |
Dear hackers,

I've replied for trackability.

> Further comments for v17.
>
> 01.
> This program assumes that the target server has the same major version as this.
> Because the target server would be restarted by the same version's pg_ctl command.
> I felt it should be ensured by reading the PG_VERSION.

Still investigating.

> 02.
> pg_upgrade checks the version of the executables it uses, like pg_ctl, postgres, and
> pg_resetwal. I felt this program should do so as well.

Still investigating.

> 03. get_bin_directory
> ```
> if (find_my_exec(path, full_path) < 0)
> {
>     pg_log_error("The program \"%s\" is needed by %s but was not found in the\n"
>                  "same directory as \"%s\".\n",
>                  "pg_ctl", progname, full_path);
> ```
>
> s/"pg_ctl"/progname

The message was updated.

> 04.
> Missing canonicalize_path()?

I found find_my_exec() calls canonicalize_path(). No need to do.

> 05.
> Assuming that the target server is a cascade standby, i.e., it has a role as
> another primary. In this case, I thought the child node would not work. Because
> pg_createsubscriber runs pg_resetwal and all WAL files would be discarded at that
> time. I have not tested, but should the program detect it and exit earlier?

Still investigating.

> 06.
> wait_for_end_recovery() waits forever even if the standby has been disconnected
> from the primary, right? Should we check the status of the replication via
> pg_stat_wal_receiver?

Still investigating.

> 07.
> The cleanup function has a couple of bugs.
>
> * If subscriptions have been created on the database, the function also tries to
>   drop a publication. But it leads to an ERROR because it has been already dropped.
>   See setup_subscriber().
> * If the subscription has been created, drop_replication_slot() leads to an ERROR.
>   Because the subscriber tried to drop the subscription while executing DROP
>   SUBSCRIPTION.

Only drop_publication() was removed.

> 08.
> I found that all messages (ERROR, WARNING, INFO, etc...) would output to stderr,
> but I felt it should be on stdout. Is there a reason? pg_dump outputs messages to
> stderr, but the motivation might be to avoid confusion with dumps.

Still investigating.

> 09.
> I'm not sure the cleanup for subscriber is really needed. Assuming that there
> are two databases, e.g., pg1 pg2, and we fail to create a subscription on pg2.
> This can happen when the subscription which has the same name has been already
> created on the primary server.
> In this case a subscription on pg1 would be removed. But what is a next step?
> Since a timelineID on the standby server is larger than the primary (note that
> the standby has been promoted once), we cannot resume the physical replication
> as-is. IIUC the easiest method to retry is removing a cluster once and restarting
> from pg_basebackup. If so, no need to cleanup the standby because it is
> corrupted.
> We just say "Please remove the cluster and recreate again".

I still think it should be, but not done yet.

New patch can be available in [1].

[1]: https://www.postgresql.org/message-id/TYCPR01MB12077A6BB424A025F04A8243DF54F2%40TYCPR01MB12077.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/
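
For reference on comment 01, the PG_VERSION check could look roughly like the sketch below. This is only an illustration and is not taken from the patch: the function names `parse_major_version` and `read_datadir_major_version` are hypothetical, and it assumes PG_VERSION holds a single integer major version (as it does since PostgreSQL 10). A real implementation would compare the result with the tool's own major version and report errors through the usual logging functions.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the contents of a PG_VERSION file.
 * Assumes the file starts with an integer major version (e.g. "17\n").
 * Returns the major version, or -1 if no positive integer is found.
 */
int
parse_major_version(const char *text)
{
	char	   *end;
	long		v = strtol(text, &end, 10);

	if (end == text || v <= 0)
		return -1;
	return (int) v;
}

/*
 * Hypothetical helper: read <datadir>/PG_VERSION and return its major
 * version, or -1 if the file cannot be opened, read, or parsed.
 */
int
read_datadir_major_version(const char *datadir)
{
	char		path[1024];
	char		buf[64];
	FILE	   *fp;
	int			len;

	len = snprintf(path, sizeof(path), "%s/PG_VERSION", datadir);
	if (len < 0 || len >= (int) sizeof(path))
		return -1;				/* path was truncated */

	fp = fopen(path, "r");
	if (fp == NULL)
		return -1;

	if (fgets(buf, sizeof(buf), fp) == NULL)
	{
		fclose(fp);
		return -1;
	}
	fclose(fp);

	return parse_major_version(buf);
}
```

The caller would then exit early when `read_datadir_major_version()` returns -1 or a value different from the major version the tool was built for, before pg_ctl is ever run against the target.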