Thread: Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.

On Fri, Jan 24, 2025 at 7:28 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Fri, Jan 24, 2025 at 8:14 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > On Thu, Jan 23, 2025 at 10:33 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Wed, Jan 22, 2025 at 7:29 PM Shubham Khanna
> > > <khannashubham1197@gmail.com> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I am writing to propose an enhancement to the pg_createsubscriber
> > > > utility that enables it to automatically fetch all non-template
> > > > databases from the publisher when no specific databases are specified
> > > > by the user. This was an open item from [1] that was planned for
> > > > future implementation. The attached patch has the changes for the
> > > > same.
> > >
> > > I think the feature will be useful, but UI might cause some unwanted
> > > results. If a user forgets to specify -d option, the utility will
> > > create subscriptions to all the databases, some of which may or may
> > > not have the publications. I think it's better to provide an option to
> > > specify all databases explicitly (e.g. --all-databases).
> > >
> >
> >
> > +1 better to be safe.
> >
> > Instead of a new switch, how about changing the --database switch to
> > accept a pattern (like pg_dump --schema does [1])
> >
> > Then "all databases" would be specified something like --database = *
> >
>
> WFM but that will be more work than what's in the patch.
>

OK, what if, instead of full pattern matching it could recognise just
one special dbname value of '*' (meaning "all")

So, "all databases" could still be specified as --database = *

The implementation would be almost no more work than the current
patch, while at the same time leaving it open to be extended as a
pattern if needed in the future. Or, is it too hacky?

======
Kind Regards,
Peter Smith.
Fujitsu Australia



On Tue, Jan 28, 2025 at 3:58 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Fri, Jan 24, 2025 at 7:28 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Fri, Jan 24, 2025 at 8:14 AM Peter Smith <smithpb2250@gmail.com> wrote:
> > >
> > > On Thu, Jan 23, 2025 at 10:33 PM Ashutosh Bapat
> > > <ashutosh.bapat.oss@gmail.com> wrote:
> > > >
> > > > On Wed, Jan 22, 2025 at 7:29 PM Shubham Khanna
> > > > <khannashubham1197@gmail.com> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > I am writing to propose an enhancement to the pg_createsubscriber
> > > > > utility that enables it to automatically fetch all non-template
> > > > > databases from the publisher when no specific databases are specified
> > > > > by the user. This was an open item from [1] that was planned for
> > > > > future implementation. The attached patch has the changes for the
> > > > > same.
> > > >
> > > > I think the feature will be useful, but UI might cause some unwanted
> > > > results. If a user forgets to specify -d option, the utility will
> > > > create subscriptions to all the databases, some of which may or may
> > > > not have the publications. I think it's better to provide an option to
> > > > specify all databases explicitly (e.g. --all-databases).
> > > >
> > >
> > >
> > > +1 better to be safe.
> > >
> > > Instead of a new switch, how about changing the --database switch to
> > > accept a pattern (like pg_dump --schema does [1])
> > >
> > > Then "all databases" would be specified something like --database = *
> > >
> >
> > WFM but that will be more work than what's in the patch.
> >
>
> OK, what if, instead of full pattern matching it could recognise just
> one special dbname value of '*' (meaning "all")
>
> So, "all databases" could still be specified as --database = *
>
> The implementation would be almost no more work than the current
> patch, while at the same time leaving it open to be extended as a
> pattern if needed in the future. Or, is it too hacky?

I don't remember any precedence here. pg_dump has pg_dumpall which
dumps all the databases, so they chose to create a separate binary. If
we go that route, I think we will be able to produce a more flexible
utility, like replication slot names per database, or per database
subscription settings etc. Maybe we should consider that option.

If we want to stick to --database= supporting a pattern looks better
than just a single special pattern *.

--
Best Wishes,
Ashutosh Bapat



On Tue, Jan 28, 2025 at 12:01 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> If we want to stick to --database= supporting a pattern looks better
> than just a single special pattern *.
>

This sounds reasonable to me as well. Note that the interaction of
other parameters like --replication-slot is not yet discussed. I think
if the number of slots given matches with the number of databases
fetched based on pattern matches then we can use them otherwise,
return the ERROR. The other option could be that we don't allow
options like --replication-slot along with pattern matching option.

--
With Regards,
Amit Kapila.



On Wed, Jan 29, 2025 at 4:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 28, 2025 at 12:01 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > If we want to stick to --database= supporting a pattern looks better
> > than just a single special pattern *.
> >
>
> This sounds reasonable to me as well. Note that the interaction of
> other parameters like --replication-slot is not yet discussed. I think
> if the number of slots given matches with the number of databases
> fetched based on pattern matches then we can use them otherwise,
> return the ERROR. The other option could be that we don't allow
> options like --replication-slot along with pattern matching option.
>

I have had second thoughts about my pattern idea. Now, I favour just
adding another --all-databases switch like Ashutosh had suggested [1]
in the first place.

I had overlooked the rules saying that the user is allowed to specify
*multiple* --publication or --replication-slot or --subscription name
switches, but when doing so they have to match the same number of
--database switches. Using a --dbname=pattern would be fraught with
complications. e.g. How can we know up-front how many databases the
dbname pattern will resolve to, and even in what order they get
resolved?

In hindsight, it would be much easier to just have one extra switch,
so the rules then are simple:

--all-databases  (here you CANNOT specify multiple
publication/replication-slot/subscription switches)
-database=dbname (here you CAN specify multiple
publication/replication-slot/subscription switches using the same
order and the same number as databases)

Also, --all-databases and --database switches cannot be specified at
the same time.

======
[1] https://www.postgresql.org/message-id/CAExHW5sQGie7bvS-q7YUYDM2BqYZ%3D%2BxqeqFUS%3DcZGjK_9pnVzQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia