Re: speed up a logical replica setup - Mailing list pgsql-hackers
From | Shubham Khanna |
---|---|
Subject | Re: speed up a logical replica setup |
Date | |
Msg-id | CAHv8Rj+AvtnxNyyMAFRVuuzP3Cy5j1zq99wkMUXOUnskPZg5nw@mail.gmail.com Whole thread Raw |
In response to | RE: speed up a logical replica setup ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
Responses |
RE: speed up a logical replica setup
|
List | pgsql-hackers |
On Fri, Feb 2, 2024 at 3:11 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Euler, > > Thanks for updating the patch! > > > > I'm still working on the data structures to group options. I don't like the way > it was grouped in v13-0005. There is too many levels to reach database name. > The setup_subscriber() function requires the 3 data structures. > > > > Right, your refactoring looks fewer stack. So I pause to revise my refactoring > patch. > > > > The documentation update is almost there. I will include the modifications in > the next patch. > > > > OK. I think it should be modified before native speakers will attend to the > thread. > > > > Regarding v13-0004, it seems a good UI that's why I wrote a comment about it. > However, it comes with a restriction that requires a similar HBA rule for both > regular and replication connections. Is it an acceptable restriction? We might > paint ourselves into the corner. A reasonable proposal is not to remove this > option. Instead, it should be optional. If it is not provided, primary_conninfo > is used. > > > > I didn't have such a point of view. However, it is not related whether -P exists > or not. Even v14-0001 requires primary to accept both normal/replication connections. > If we want to avoid it, the connection from pg_createsubscriber can be restored > to replication-connection. > (I felt we do not have to use replication protocol even if we change the connection mode) > > The motivation why -P is not needed is to ensure the consistency of nodes. > pg_createsubscriber assumes that the -P option can connect to the upstream node, > but no one checks it. Parsing two connection strings may be a solution but be > confusing. E.g., what if some options are different? > I think using a same parameter is a simplest solution. > > And below part contains my comments for v14. > > 01. > ``` > char temp_replslot[NAMEDATALEN] = {0}; > ``` > > I found that no one refers the name of temporary slot. Can we remove the variable? > > 02. > ``` > CreateSubscriberOptions opt; > ... > memset(&opt, 0, sizeof(CreateSubscriberOptions)); > > /* Default settings */ > opt.subscriber_dir = NULL; > opt.pub_conninfo_str = NULL; > opt.sub_conninfo_str = NULL; > opt.database_names = (SimpleStringList) > { > NULL, NULL > }; > opt.retain = false; > opt.recovery_timeout = 0; > ``` > > Initialization by `CreateSubscriberOptions opt = {0};` seems enough. > All values are set to 0x0. > > 03. > ``` > /* > * Is the standby server ready for logical replication? > */ > static bool > check_subscriber(LogicalRepInfo *dbinfo) > ``` > > You said "target server must be a standby" in [1], but I cannot find checks for it. > IIUC, there are two approaches: > a) check the existence "standby.signal" in the data directory > b) call an SQL function "pg_is_in_recovery" > > 04. > ``` > static char *pg_ctl_path = NULL; > static char *pg_resetwal_path = NULL; > ``` > > I still think they can be combined as "bindir". > > 05. > > ``` > /* > * Write recovery parameters. > ... > WriteRecoveryConfig(conn, opt.subscriber_dir, recoveryconfcontents); > ``` > > WriteRecoveryConfig() writes GUC parameters to postgresql.auto.conf, but not > sure it is good. These settings would remain on new subscriber even after the > pg_createsubscriber. Can we avoid it? I come up with passing these parameters > via pg_ctl -o option, but it requires parsing output from GenerateRecoveryConfig() > (all GUCs must be allign like "-c XXX -c XXX -c XXX..."). > > 06. > ``` > static LogicalRepInfo *store_pub_sub_info(SimpleStringList dbnames, const char *pub_base_conninfo, const char *sub_base_conninfo); > ... > static void modify_subscriber_sysid(const char *pg_resetwal_path, CreateSubscriberOptions opt); > ... > static void wait_for_end_recovery(const char *conninfo, CreateSubscriberOptions opt); > ``` > > Functions arguments should not be struct because they are passing by value. > They should be a pointer. Or, for modify_subscriber_sysid and wait_for_end_recovery, > we can pass a value which would be really used. > > 07. > ``` > static char *get_base_conninfo(char *conninfo, char *dbname, > const char *noderole); > ``` > > Not sure noderole should be passed here. It is used only for the logging. > Can we output string before calling the function? > (The parameter is not needed anymore if -P is removed) > > 08. > The terminology is still not consistent. Some functions call the target as standby, > but others call it as subscriber. > > 09. > v14 does not work if the standby server has already been set recovery_target* > options. PSA the reproducer. I considered two approaches: > > a) raise an ERROR when these parameter were set. check_subscriber() can do it > b) overwrite these GUCs as empty strings. > > 10. > The execution always fails if users execute --dry-run just before. Because > pg_createsubscriber stops the standby anyway. Doing dry run first is quite normal > use-case, so current implementation seems not user-friendly. How should we fix? > Below bullets are my idea: > > a) avoid stopping the standby in case of dry_run: seems possible. > b) accept even if the standby is stopped: seems possible. > c) start the standby at the end of run: how arguments like pg_ctl -l should be specified? > > My top-up patches fixes some issues. > > v15-0001: same as v14-0001 > === experimental patches === > v15-0002: Use replication connections when we connects to the primary. > Connections to standby is not changed because the standby/subscriber > does not require such type of connection, in principle. > If we can accept connecting to subscriber with replication mode, > this can be simplified. > v15-0003: Remove -P and use primary_conninfo instead. Same as v13-0004 > v15-0004: Check whether the target is really standby. This is done by pg_is_in_recovery() > v15-0005: Avoid stopping/starting standby server in dry_run mode. > I.e., approach a). in #10 is used. > v15-0006: Overwrite recovery parameters. I.e., aproach b). in #9 is used. > > [1]: https://www.postgresql.org/message-id/b315c7da-7ab1-4014-a2a9-8ab6ae26017c%40app.fastmail.com While reviewing the v15 patches I discovered that subscription connection string has added a lot of options which are not required now: v15-0001 postgres=# select subname, subconninfo from pg_subscription; subname | subconninfo -------------------------------+------------------------------------------ pg_createsubscriber_5_1867633 | host=localhost port=5432 dbname=postgres (1 row) v15-0001+0002+0003 postgres=# select subname, subconninfo from pg_subscription; subname | subconninfo -------------------------------+------------------------------------------------------------------------ -------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------- ------------------------------ pg_createsubscriber_5_1895366 | user=shubham passfile='/home/shubham/.pgpass' channel_binding=prefer ho st=127.0.0.1 port=5432 sslmode=prefer sslcompression=0 sslcertmode=allow sslsni=1 ssl_min_protocol_versi on=TLSv1.2 gssencmode=disable krbsrvname=postgres gssdelegation=0 target_session_attrs=any load_balance_ hosts=disable dbname=postgres (1 row) Here, we can see that channel_binding, sslmode, sslcertmode, sslsni, gssencmode, krbsrvname, etc are getting included. This does not look intentional, we should keep the subscription connection same as in v15-0001. Thanks and Regards, Shubham Khanna. Thanks and Regards, Shubham Khanna.
pgsql-hackers by date: