Home > mailing lists

Re: speed up a logical replica setup - Mailing list pgsql-hackers

From	Euler Taveira
Subject	Re: speed up a logical replica setup
Date	March 21, 2024 04:19:10
Msg-id	d546c4bb-92d1-4e2d-898f-48234b12ed25@app.fastmail.com Whole thread Raw
In response to	Re: speed up a logical replica setup (Peter Eisentraut <peter@eisentraut.org>)
Responses	Re: speed up a logical replica setup Re: speed up a logical replica setup
List	pgsql-hackers

Tree view

On Mon, Mar 18, 2024, at 10:52 AM, Peter Eisentraut wrote:

On 16.03.24 16:42, Euler Taveira wrote:
> I'm attaching a new version (v30) that adds:

I have some review comments and attached a patch with some smaller
fixups (mainly message wording and avoid fixed-size string buffers).

Thanks for your review. I'm attaching a new version (v32) that includes your

fixups, merges the v30-0002 into the main patch [1], addresses Hayato's review[2],

your reviews [3][4], and fixes the query for set_replication_progress() [5].

* doc/src/sgml/ref/pg_createsubscriber.sgml

I would remove the "How It Works" section. This is not relevant to
users, and it is very detailed and will require updating whenever the
implementation changes. It could be a source code comment instead.

It uses the same structure as pg_rewind that also describes how it works

internally. I included a separate patch that completely removes the section.

* src/bin/pg_basebackup/pg_createsubscriber.c

I think the connection string handling is not robust against funny
characters, like spaces, in database names etc.

get_base_conninfo() uses PQconninfoParse to parse the connection string. I

expect PQconnectdb to provide a suitable error message in this case. Even if it

builds keywords and values arrays, it is also susceptible to the same issue, no?

Most SQL commands need to be amended for proper identifier or string
literal quoting and/or escaping.

I completely forgot about this detail when I added the new options in v30. It is

fixed now. I also changed the tests to exercise it.

In check_subscriber(): All these permissions checks seem problematic
to me. We shouldn't reimplement our own copy of the server's
permission checks. The server can check the permissions. And if the
permission checking in the server ever changes, then we have
inconsistencies to take care of. Also, the error messages "permission
denied" are inappropriate, because we are not doing the actual thing.
Maybe we want to do a dry-run for the benefit of the user, but then we
should do the actual thing, like try to create a replication slot, or
whatever. But I would rather just remove all this, it seems too
problematic.

The main goal of the check_* functions are to minimize error during execution.

I removed the permission checks. The GUC checks were kept.

In main(): The first check if the standby is running is problematic.
I think it would be better to require that the standby is initially
shut down. Consider, the standby might be running under systemd.
This tool will try to stop it, systemd will try to restart it. Let's
avoid these kinds of battles. It's also safer if we don't try to
touch running servers.

That's a good point. I hadn't found an excuse to simplify this but you provided

one. :) There was a worry about ignoring some command-line options that changes

GUCs if the server was started. There was also an ugly case for dry run mode

that has to start the server (if it was running) at the end. Both cases are no

longer issues. The current code provides a suitable error if the target server

is running.

The -p option (--subscriber-port) doesn't seem to do anything. In my
testing, it always uses the compiled-in default port.

It works for me. See this snippet from the regression tests. The port (50945) is

used by pg_ctl.

# Running: pg_createsubscriber --verbose --verbose --pgdata /c/pg_createsubscriber/src/bin/pg_basebackup/tmp_check/t_040_pg_createsubscriber_node_s_data/pgdata --publisher-server port=50943 host=/tmp/qpngb0bPKo dbname='pg1' --socket-directory /tmp/qpngb0bPKo --subscriber-port 50945 --database pg1 --database pg2

pg_createsubscriber: validating connection string on publisher

pg_createsubscriber: pg_ctl command is: "/c/pg_createsubscriber/tmp_install/c/pg_createsubscriber/bin/pg_ctl" start -D "/c/pg_createsubscriber/src/bin/pg_basebackup/tmp_check/t_040_pg_createsubscriber_node_s_data/pgdata" -s -o "-p 50945" -o "-c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/tmp/qpngb0bPKo'"

2024-03-20 18:15:24.517 -03 [105195] LOG: starting PostgreSQL 17devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit

2024-03-20 18:15:24.517 -03 [105195] LOG: listening on Unix socket "/tmp/qpngb0bPKo/.s.PGSQL.50945"

Printing all the server log lines to the terminal doesn't seem very
user-friendly. Not sure what to do about that, short of keeping a
pg_upgrade-style directory of log files. But it's ugly.

I removed the previous implementation that creates a new directory and stores

the log file there. I don't like the pg_upgrade-style directory because (a) it

stores part of the server log files in another place and (b) it is another

directory to ignore if your tool handles the data directory (like a backup

tool). My last test said it prints 35 server log lines. I expect that the user

redirects the output to a file so he/she can inspect it later if required.

[1] https://www.postgresql.org/message-id/34637e7f-0330-420d-8f45-1d022962d2fe%40app.fastmail.com

[2] https://www.postgresql.org/message-id/TYCPR01MB12077E2ACB82680CD7BAA22F9F52D2%40TYCPR01MB12077.jpnprd01.prod.outlook.com

[3] https://www.postgresql.org/message-id/085203c4-580d-4c50-90e3-e47249e14585%40eisentraut.org

[4] https://www.postgresql.org/message-id/7a970912-0b77-4942-84f7-2c9ca0bc05a5%40eisentraut.org

[5] https://www.postgresql.org/message-id/CALDaNm053p5Drh9%2B%3DVvkH0Zf_3xoLB%2Bxw8RsNZjEwhc5xc_W%3DQ%40mail.gmail.com

Euler Taveira