RE: speed up a logical replica setup - Mailing list pgsql-hackers
From | Hayato Kuroda (Fujitsu) |
---|---|
Subject | RE: speed up a logical replica setup |
Date | |
Msg-id | TYCPR01MB12077E2ACB82680CD7BAA22F9F52D2@TYCPR01MB12077.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: speed up a logical replica setup ("Euler Taveira" <euler@eulerto.com>) |
Responses |
Re: speed up a logical replica setup
Re: speed up a logical replica setup |
List | pgsql-hackers |
Dear Euler, Thanks for updating the patch. Here are my comments. I used Grammarly to proofread sentences. (The tool strongly recommends to use active voice, but I can ignore for now) 01. "After a successful run, the state of the target server is analagous to a fresh logical replication setup." a/analagous/analogous 02. "The main difference between the logical replication setup and pg_createsubscriber is the initial data copy." Grammarly suggests: "The initial data copy is the main difference between the logical replication setup and pg_createsubscriber." 03. "Only the synchronization phase is done, which ensures each table is brought up to a synchronized state." This sentence is not very clear to me. How about: "pg_createsubscriber does only the synchronization phase, ensuring each table's replication state is ready." 04. "The pg_createsubscriber targets large database systems because most of the execution time is spent making the initial data copy." Hmm, but initial data sync by logical replication also spends most of time to make the initial data copy. IIUC bottlenecks are a) this application must stop and start server several times, and b) only the serial copy works. Can you clarify them? 05. It is better to say the internal difference between pg_createsubscriber and the initial sync by logical replication. For example: pg_createsubscriber uses a physical replication mechanism to ensure the standby catches up until a certain point. Then, it converts to the standby to the subscriber by promoting and creating subscriptions. 06. "If these are not met an error will be reported." Grammarly suggests: "If these are not met, an error will be reported." 07. "The given target data directory must have the same system identifier than the source data directory." Grammarly suggests: "The given target data directory must have the same system identifier as the source data directory." 08. "If a standby server is running on the target data directory or it is a base backup from the source data directory, system identifiers are the same." This line is not needed if bullet-style is not used. The line is just a supplement, not prerequisite. 09. "The source server must accept connections from the target server. The source server must not be in recovery." Grammarly suggests: "The source server must accept connections from the target server and not be in recovery." 10. "Publications cannot be created in a read-only cluster." Same as 08, this line is not needed if bullet-style is not used. 11. "pg_createsubscriber usually starts the target server with different connection settings during the transformation steps. Hence, connections to target server might fail." Grammarly suggests: "pg_createsubscriber usually starts the target server with different connection settings during transformation. Hence, connections to the target server might fail." 12. "During the recovery process," Grammarly suggests: "During recovery," 13. "replicated so an error would occur." Grammarly suggests: "replicated, so an error would occur." 14. "It would avoid situations in which WAL files from the source server might be used by the target server." Grammarly suggests: "It would avoid situations in which the target server might use WAL files from the source server." 15. "a LSN" s/a/an 16. "of write-ahead" s/of/of the/ 17. "specifies promote" We can do double-quote for the word promote. 18. "are also added so it avoids" Grammarly suggests: "are added to avoid" 19. "is accepting read-write transactions" Grammarly suggests: "accepts read-write transactions" 20. New options must be also documented as well. This helps not only users but also reviewers. (Sometimes we cannot identify that the implementation is intentinal or not.) 21. Also, not sure the specification is good. I preferred to specify them by format string. Because it can reduce the number of arguments and I cannot find use cases which user want to control the name of objects. However, your approach has a benefit which users can easily identify the generated objects by pg_createsubscriber. How do other think? 22. ``` #define BASE_OUTPUT_DIR "pg_createsubscriber_output.d" ``` No one refers the define. 23. ``` } CreateSubscriberOptions; ... } LogicalRepInfo; ``` Declarations after the "{" are not needed, because we do not do typedef. 22. While seeing definitions of functions, I found that some pointers are declared as const, but others are not. E.g., "char *lsn" in setup_recovery() won' be changed but not the constant. Is it just missing or is there another rule? 23. ``` static int num_dbs = 0; static int num_pubs = 0; static int num_subs = 0; static int num_replslots = 0; ``` I think the name is bit confusing. The number of generating publications/subscriptions/replication slots are always same as the number of databases. They just indicate the number of specified. My idea is num_custom_pubs or something. Thought? 24. ``` /* standby / subscriber data directory */ static char *subscriber_dir = NULL; ``` It is bit strange that only subscriber_dir is a global variable. Caller requires the CreateSubscriberOptions as an argument, except cleanup_objects_atexit() and main. So, how about makeing `CreateSubscriberOptions opt` to global one? 25. ``` * Replication slots, publications and subscriptions are created. Depending on * the step it failed, it should remove the already created objects if it is * possible (sometimes it won't work due to a connection issue). ``` I think it should be specified here that subscriptions won't be removed with the reason. 26. ``` /* * If the server is promoted, there is no way to use the current setup * again. Warn the user that a new replication setup should be done before * trying again. */ ``` Per comment 25, we can add a reference like "See comments atop the function" 27. usage() was not updated based on recent changes. 28. ``` if (strcmp(conn_opt->keyword, "dbname") == 0 && conn_opt->val != NULL) { if (dbname) *dbname = pg_strdup(conn_opt->val); continue; } ``` There is a memory-leak if multiple dbname are specified in the conninfo. 29. ``` pg_prng_seed(&prng_state, (uint64) (getpid() ^ time(NULL))); ``` No need to initialize the seed every time. Can you reuse pg_prng_state? 30. ``` if (num_replslots == 0) dbinfo[i].replslotname = pg_strdup(genname); ``` I think the straightforward way is to use the name of subscription if no name is specified. This follows the rule for CREATE SUBSCRIPTION. 31. ``` /* Create replication slot on publisher */ if (lsn) pg_free(lsn); ``` I think allocating/freeing memory is not so efficient. Can we add a flag to create_logical_replication_slot() for controlling the returning value (NULL or duplicated string)? We can use the condition (i == num_dbs-1) as flag. 32. ``` /* * Close the connection. If exit_on_error is true, it has an undesired * condition and it should exit immediately. */ static void disconnect_database(PGconn *conn, bool exit_on_error) ``` In case of disconnect_database(), the second argument should have different name. If it is true, the process exits unconditionally. Also, comments atop the function must be fixed. 33. ``` wal_level = strdup(PQgetvalue(res, 0, 0)); ``` pg_strdup should be used here. 34. ``` {"config-file", required_argument, NULL, 1}, {"publication", required_argument, NULL, 2}, {"replication-slot", required_argument, NULL, 3}, {"subscription", required_argument, NULL, 4}, ``` The ordering looks strange for me. According to pg_upgarade and pg_basebackup, options which do not have short notation are listed behind. 35. ``` opt.sub_port = palloc(16); ``` Per other lines, pg_alloc() should be used. 36. ``` pg_free(opt.sub_port); ``` You said that the leak won't be concerned here. If so, why only 'p' has pg_free()? 37. ``` /* Register a function to clean up objects in case of failure */ atexit(cleanup_objects_atexit); ``` Sorry if we have already discussed. I think the registration can be moved just before the boot of the standby. Before that, the callback will be no-op. 38. ``` /* Subscriber PID file */ snprintf(pidfile, MAXPGPATH, "%s/postmaster.pid", subscriber_dir); /* * If the standby server is running, stop it. Some parameters (that can * only be set at server start) are informed by command-line options. */ if (stat(pidfile, &statbuf) == 0) ``` Hmm. pidfile is used only here, but it is declared in main(). Can it be separated into another funtion like is_standby_started()? 39. Or, we may able to introcue "restart_standby_if_needed" or something. 40. ``` * XXX this code was extracted from BootStrapXLOG(). ``` So, can we extract the common part to somewhere? Since system identifier is related with the controldata file, I think it can be located in controldata_util.c. 41. You said like below in [1], but I could not find the related fix. Can you clarify? > That's a good point. We should state in the documentation that GUCs specified in > the command-line options are ignored during the execution. [1]: https://www.postgresql.org/message-id/40595e73-c7e1-463a-b8be-49792e870007%40app.fastmail.com Best Regards, Hayato Kuroda FUJITSU LIMITED https://www.fujitsu.com/global/
pgsql-hackers by date: