Thread: Enhance pg_createsubscriber to create required standby.

Enhance pg_createsubscriber to create required standby.

From
Shubham Khanna
Date:
Hi All,

Currently, pg_createsubscriber is designed to convert an existing
physical replica into a logical subscriber. To use it, the user must
manually set up a standby node beforehand, ensure that physical
replication is active, and only then run pg_createsubscriber to
perform the switchover to logical replication.
To simplify this workflow, I propose enhancing the pg_createsubscriber
utility to handle the creation of the standby node as part of the
pg_createsubscriber itself. This idea was also suggested earlier by
Andres Freund in [1].
To support this, I have introduced a new option: --create-standby
This allows the user to specify the parameters needed for
pg_basebackup to automatically create the standby node, establish
physical replication, and then proceed with the existing steps to
convert it into a logical subscriber. Following is the output that
creates the standby node:

./pg_createsubscriber -D standby/ -P "host=localhost port=5432
dbname=postgres" --create-standby --verbose
pg_createsubscriber: creating the standby server:
"/home/shubham/Project/Git/postgres/inst/bin/pg_basebackup"
-d'host=localhost port=5432 dbname=postgres' -D standby -P -X stream
-R
22881/22881 kB (100%), 1/1 tablespace
pg_createsubscriber: set standby port in standby/postgresql.auto.conf:
port = 50432
pg_createsubscriber: starting the standby server:
"/home/shubham/Project/Git/postgres/inst/bin/pg_ctl" -D "standby" -l
"standby/standby.log" -o "-p 50432" start
waiting for server to start.... done
server started
pg_createsubscriber: configuring the standby server:
"/home/shubham/Project/Git/postgres/inst/bin/psql" -d postgres -p
50432 -c "ALTER SYSTEM SET wal_level = 'logical';"
ALTER SYSTEM
pg_createsubscriber: configuring the standby server:
"/home/shubham/Project/Git/postgres/inst/bin/psql" -d postgres -p
50432 -c "ALTER SYSTEM SET listen_addresses = '*';"
ALTER SYSTEM
pg_createsubscriber: stopping the standby server:
"/home/shubham/Project/Git/postgres/inst/bin/pg_ctl" -D "standby" -l
"standby/standby.log" -o "-p 50432" stop
waiting for server to shut down.... done
server stopped
pg_createsubscriber: validating publisher connection string

Currently, pg_basebackup is executed with the following default parameters:
pg_basebackup <primary_node_connection_info> -D <sub_dir> -P -U
replication -X stream -R
If users wish to customize these parameters, a new option
pg_basebackup_parameters can be supported in the future.
Similarly, the standby node is created using the default
configuration, except for the port and wal_level settings. To
customize other aspects of the standby setup, a new option
standby_config can be supported in the future if required.

[1] - https://www.postgresql.org/message-id/20220221232849.x6s24ete4eyg6jol%40alap3.anarazel.de

Thanks and regards,
Shubham Khanna.

Attachment

Re: Enhance pg_createsubscriber to create required standby.

From
Amit Kapila
Date:
On Wed, Jun 4, 2025 at 2:39 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 04.06.25 05:56, Shubham Khanna wrote:
> > Currently, pg_createsubscriber is designed to convert an existing
> > physical replica into a logical subscriber. To use it, the user must
> > manually set up a standby node beforehand, ensure that physical
> > replication is active, and only then run pg_createsubscriber to
> > perform the switchover to logical replication.
> > To simplify this workflow, I propose enhancing the pg_createsubscriber
> > utility to handle the creation of the standby node as part of the
> > pg_createsubscriber itself.
>
> Yes, this was contemplated when pg_createsubscriber was first proposed.
> We opted against it, mainly for simplicity initially.
>
> It's not clear to me how this change now would substantially improve the
> user experience.  The number of characters you type is approximately the
> same.  You still need to support the old mode because the backup might
> not come from pg_basebackup.

In the current functionality, the user must first manually create a
standby or use an existing standby to make it a subscriber. I thought
saving this step for users would be quite helpful. It also helps
streamline the process into a single, cohesive workflow.

>
>  And then you'd have the maintenance
> overhead that every new feature in pg_basebackup would potentially have
> to be passed through or somehow be integrated into pg_createsubscriber.
>

I am not so sure about this because we use other utilities like pg_ctl
in this tool, so same argument could be build for it.

--
With Regards,
Amit Kapila.



Re: Enhance pg_createsubscriber to create required standby.

From
Peter Eisentraut
Date:
On 04.06.25 11:56, Amit Kapila wrote:
>> It's not clear to me how this change now would substantially improve the
>> user experience.  The number of characters you type is approximately the
>> same.  You still need to support the old mode because the backup might
>> not come from pg_basebackup.
> 
> In the current functionality, the user must first manually create a
> standby or use an existing standby to make it a subscriber. I thought
> saving this step for users would be quite helpful. It also helps
> streamline the process into a single, cohesive workflow.

Unless I'm missing something, doesn't this merely replace

pg_basebackup && pg_createsubscriber

with

pg_createsubscriber --create-standby

I mean, as I'm typing this out, this is literally the same number of 
characters.  Is the second one easier somehow?  It's not clear.

>>   And then you'd have the maintenance
>> overhead that every new feature in pg_basebackup would potentially have
>> to be passed through or somehow be integrated into pg_createsubscriber.
> 
> I am not so sure about this because we use other utilities like pg_ctl
> in this tool, so same argument could be build for it.

Yes, and we tried really hard to avoid the dependency on pg_ctl, but it 
was too hard.  (I would entertain patches to get rid of it.)




Re: Enhance pg_createsubscriber to create required standby.

From
vignesh C
Date:
On Thu, 5 Jun 2025 at 01:50, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 04.06.25 11:56, Amit Kapila wrote:
> >> It's not clear to me how this change now would substantially improve the
> >> user experience.  The number of characters you type is approximately the
> >> same.  You still need to support the old mode because the backup might
> >> not come from pg_basebackup.
> >
> > In the current functionality, the user must first manually create a
> > standby or use an existing standby to make it a subscriber. I thought
> > saving this step for users would be quite helpful. It also helps
> > streamline the process into a single, cohesive workflow.
>
> Unless I'm missing something, doesn't this merely replace
>
> pg_basebackup && pg_createsubscriber
>
> with
>
> pg_createsubscriber --create-standby
>
> I mean, as I'm typing this out, this is literally the same number of
> characters.  Is the second one easier somehow?  It's not clear.

I believe adding support for standby creation directly within
pg_createsubscriber offers several key advantages over the manual
two-step approach of running pg_basebackup followed by
pg_createsubscriber. The main benefits include:
1) Improved Progress Reporting: pg_createsubscriber can be enhanced to
provide a clear, step-by-step progress indicator that shows both the
total number of steps and the current step being executed. For
example:
[Step 1/5] Executing pg_basebackup...
[Step 2/5] Validating prerequisites...
[Step 3/5] Creating publication and replication slot...
[Step 4/5] Waiting for standby to reach consistent state...
[Step 5/5] Creating subscription...
2) Graceful Handling of Interruptions: pg_createsubscriber can be
enhanced to handle user interruptions (e.g., Ctrl+C) more gracefully
by cleaning up any partially created directories, such as the base
backup target directory. This can be achieved by registering an atexit
handler for safe cleanup during early termination.
3) Cleanup on Partial Failure: If pg_basebackup completes but a
subsequent step (e.g., subscription or replication slot creation)
fails, pg_createsubscriber could offer the option to automatically
clean up the standby contents. In contrast, with the manual
pg_basebackup && pg_createsubscriber approach, failures after the
backup step leave residual data on disk, as there’s no built-in
cleanup mechanism.
4) Simplified Option Handling: The --create-standby option streamlines
the workflow by removing the need to manually pass options like -D and
-R. With this option, the user can simply run:
pg_createsubscriber -D <target_dir> -P <primary_connstr>  -d
<target_database> --create-standby
In the combined commands approach:
#primary is on the different server
pg_basebackup -D <target_dir> -d <primary_connstr> -R &&
pg_createsubscriber -d <target_database> -D <target_dir> -P
<primary_connstr>
#primary is on the same server
pg_basebackup -D <target_dir> -R && pg_createsubscriber -d
<target_database> -D <target_dir> -P <primary_connstr>
The integrated --create-standby option reduces the risk of user error,
misconfiguration, and is shorter to write for users.

Based on the above points I felt the create-standby option through
pg_createsubscriber will be helpful to users. Your thoughts?

Regards,
Vignesh



Re: Enhance pg_createsubscriber to create required standby.

From
Amit Kapila
Date:
On Thu, Jun 5, 2025 at 1:50 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 04.06.25 11:56, Amit Kapila wrote:
> >> It's not clear to me how this change now would substantially improve the
> >> user experience.  The number of characters you type is approximately the
> >> same.  You still need to support the old mode because the backup might
> >> not come from pg_basebackup.
> >
> > In the current functionality, the user must first manually create a
> > standby or use an existing standby to make it a subscriber. I thought
> > saving this step for users would be quite helpful. It also helps
> > streamline the process into a single, cohesive workflow.
>
> Unless I'm missing something, doesn't this merely replace
>
> pg_basebackup && pg_createsubscriber
>
> with
>
> pg_createsubscriber --create-standby
>
> I mean, as I'm typing this out, this is literally the same number of
> characters.  Is the second one easier somehow?  It's not clear.
>

As shown in Vignesh's email [1] (point 4), there could be multiple
additional parameters required for the first option suggested by you,
which will make it longer. Additionally, there are some other benefits
of having the second option (pg_createsubscriber --create-standby),
like better cleanup of contents during failures and better progress
reporting. Are you still against adding such an option?

[1]: https://www.postgresql.org/message-id/CALDaNm1biZBMOzFMfHYzqrAeosJSD5YRG%3D82-pp6%2BJhALsfe6w%40mail.gmail.com

--
With Regards,
Amit Kapila.



Re: Enhance pg_createsubscriber to create required standby.

From
"David G. Johnston"
Date:
On Tue, Jun 17, 2025 at 9:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

As shown in Vignesh's email [1] (point 4), there could be multiple
additional parameters required for the first option suggested by you,
which will make it longer. Additionally, there are some other benefits
of having the second option (pg_createsubscriber --create-standby),
like better cleanup of contents during failures and better progress
reporting. Are you still against adding such an option?

[1]: https://www.postgresql.org/message-id/CALDaNm1biZBMOzFMfHYzqrAeosJSD5YRG%3D82-pp6%2BJhALsfe6w%40mail.gmail.com


None of those benefits convince me that "let's write a shell script in C and put it under an annual feature release policy" is the way to go here.

Let's make something like this available to the community, say on the Wiki, and make it work in PostgreSQL 18 so they can benefit from it today, and readily use it as a basis to tweak things for their own unique circumstances.

David J.