Re: speed up a logical replica setup - Mailing list pgsql-hackers

From Euler Taveira
Subject Re: speed up a logical replica setup
Date
Msg-id d546c4bb-92d1-4e2d-898f-48234b12ed25@app.fastmail.com
Whole thread Raw
In response to Re: speed up a logical replica setup  (Peter Eisentraut <peter@eisentraut.org>)
Responses Re: speed up a logical replica setup
Re: speed up a logical replica setup
List pgsql-hackers
On Mon, Mar 18, 2024, at 10:52 AM, Peter Eisentraut wrote:
On 16.03.24 16:42, Euler Taveira wrote:
> I'm attaching a new version (v30) that adds:

I have some review comments and attached a patch with some smaller 
fixups (mainly message wording and avoid fixed-size string buffers).

Thanks for your review. I'm attaching a new version (v32) that includes your
fixups, merges the v30-0002 into the main patch [1], addresses Hayato's review[2],
your reviews [3][4], and fixes the query for set_replication_progress() [5].

* doc/src/sgml/ref/pg_createsubscriber.sgml

I would remove the "How It Works" section.  This is not relevant to
users, and it is very detailed and will require updating whenever the
implementation changes.  It could be a source code comment instead.

It uses the same structure as pg_rewind that also describes how it works
internally. I included a separate patch that completely removes the section.

* src/bin/pg_basebackup/pg_createsubscriber.c

I think the connection string handling is not robust against funny
characters, like spaces, in database names etc.

get_base_conninfo() uses PQconninfoParse to parse the connection string. I
expect PQconnectdb to provide a suitable error message in this case. Even if it
builds keywords and values arrays, it is also susceptible to the same issue, no?

Most SQL commands need to be amended for proper identifier or string
literal quoting and/or escaping.

I completely forgot about this detail when I added the new options in v30. It is
fixed now. I also changed the tests to exercise it.

In check_subscriber(): All these permissions checks seem problematic
to me.  We shouldn't reimplement our own copy of the server's
permission checks.  The server can check the permissions.  And if the
permission checking in the server ever changes, then we have
inconsistencies to take care of.  Also, the error messages "permission
denied" are inappropriate, because we are not doing the actual thing.
Maybe we want to do a dry-run for the benefit of the user, but then we
should do the actual thing, like try to create a replication slot, or
whatever.  But I would rather just remove all this, it seems too
problematic.

The main goal of the check_* functions are to minimize error during execution.
I removed the permission checks. The GUC checks were kept.

In main(): The first check if the standby is running is problematic.
I think it would be better to require that the standby is initially
shut down.  Consider, the standby might be running under systemd.
This tool will try to stop it, systemd will try to restart it.  Let's
avoid these kinds of battles.  It's also safer if we don't try to
touch running servers.

That's a good point. I hadn't found an excuse to simplify this but you provided
one. :) There was a worry about ignoring some command-line options that changes
GUCs if the server was started. There was also an ugly case for dry run mode
that has to start the server (if it was running) at the end. Both cases are no
longer issues. The current code provides a suitable error if the target server
is running.

The -p option (--subscriber-port) doesn't seem to do anything.  In my
testing, it always uses the compiled-in default port.

It works for me. See this snippet from the regression tests. The port (50945) is
used by pg_ctl.

# Running: pg_createsubscriber --verbose --verbose --pgdata /c/pg_createsubscriber/src/bin/pg_basebackup/tmp_check/t_040_pg_createsubscriber_node_s_data/pgdata --publisher-server port=50943 host=/tmp/qpngb0bPKo dbname='pg1' --socket-directory /tmp/qpngb0bPKo --subscriber-port 50945 --database pg1 --database pg2
pg_createsubscriber: validating connection string on publisher
.
.
pg_createsubscriber: pg_ctl command is: "/c/pg_createsubscriber/tmp_install/c/pg_createsubscriber/bin/pg_ctl" start -D "/c/pg_createsubscriber/src/bin/pg_basebackup/tmp_check/t_040_pg_createsubscriber_node_s_data/pgdata" -s -o "-p 50945" -o "-c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/tmp/qpngb0bPKo'"
2024-03-20 18:15:24.517 -03 [105195] LOG:  starting PostgreSQL 17devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2024-03-20 18:15:24.517 -03 [105195] LOG:  listening on Unix socket "/tmp/qpngb0bPKo/.s.PGSQL.50945"

Printing all the server log lines to the terminal doesn't seem very
user-friendly.  Not sure what to do about that, short of keeping a 
pg_upgrade-style directory of log files.  But it's ugly.

I removed the previous implementation that creates a new directory and stores
the log file there. I don't like the pg_upgrade-style directory because (a) it
stores part of the server log files in another place and (b) it is another
directory to ignore if your tool handles the data directory (like a backup
tool). My last test said it prints 35 server log lines. I expect that the user
redirects the output to a file so he/she can inspect it later if required.



--
Euler Taveira

Attachment

pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: New Table Access Methods for Multi and Single Inserts
Next
From: John Naylor
Date:
Subject: Re: add AVX2 support to simd.h