Thread: Switch pg_basebackup to use -X stream instead of -X fetch by default?

Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Andres Freund
Date:
Hi,

currently pg_basebackup uses fetch mode when only -x is specified -
which imo isn't a very good thing to use due to the increased risk of
not fetching everything.
How about switching to stream mode for 9.5+?

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Oskari Saarenmaa
Date:
On 25/08/14 14:35, Andres Freund wrote:
> currently pg_basebackup uses fetch mode when only -x is specified -
> which imo isn't a very good thing to use due to the increased risk of
> not fetching everything.
> How about switching to stream mode for 9.5+?

+1.  I was just wondering why it's not the default a few days ago.

/ Oskari




Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Magnus Hagander
Date:
On Mon, Aug 25, 2014 at 1:35 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
>
> currently pg_basebackup uses fetch mode when only -x is specified -
> which imo isn't a very good thing to use due to the increased risk of
> not fetching everything.
> How about switching to stream mode for 9.5+?

I think the original reasons were to not change the default behaviour
with a new feature, and secondly because defaulting to -X requires two
replication connections rather than one.

I think the first reason is gone now, and the risk/damage of the two
connections is probably smaller than running out of WAL. -x is a good
default for smaller systems, but -X is a safer one for bigger ones. So
I agree that changing the default mode would make sense.
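
A rough sketch of the two invocations being compared, assuming the 9.4-era
option spelling (-x being shorthand for -X fetch). The helper name
basebackup_cmd and the target directory are made up for illustration, and
the helper only prints the command rather than running it:

```shell
# Dry run only: prints the command instead of executing it.
# basebackup_cmd is a hypothetical helper, not something shipped with PostgreSQL.
basebackup_cmd() {
    mode=$1      # fetch or stream
    target=$2    # backup target directory
    case "$mode" in
        fetch)
            # One replication connection; WAL is copied at the end of the
            # backup, so the server has to keep every needed segment until then.
            echo "pg_basebackup -D $target -X fetch"
            ;;
        stream)
            # Two replication connections; WAL is streamed while the base
            # backup is still running.
            echo "pg_basebackup -D $target -X stream"
            ;;
        *)
            echo "unknown WAL mode: $mode" >&2
            return 1
            ;;
    esac
}

basebackup_cmd fetch /tmp/backup
basebackup_cmd stream /tmp/backup
```

The comments restate the trade-off from this thread: fetch is cheaper in
connections, stream is less exposed to WAL being recycled mid-backup.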


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Andres Freund
Date:
On 2014-08-26 18:40:27 +0200, Magnus Hagander wrote:
> On Mon, Aug 25, 2014 at 1:35 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Hi,
> >
> > currently pg_basebackup uses fetch mode when only -x is specified -
> > which imo isn't a very good thing to use due to the increased risk of
> > not fetching everything.
> > How about switching to stream mode for 9.5+?
> 
> I think the original reasons were to not change the default behaviour
> with a new feature, and secondly because defaulting to -X requires two
> replication connections rather than one.

Right.

> I think the first reason is gone now, and the risk/damage of the two
> connections is probably smaller than running out of WAL.

Especially as that will fail pretty nearly immediately instead of at the
end of the basebackup...

> -x is a good
> default for smaller systems, but -X is a safer one for bigger ones. So
> I agree that changing the default mode would make sense.

Cool.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Magnus Hagander
Date:
On Tue, Aug 26, 2014 at 6:51 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-08-26 18:40:27 +0200, Magnus Hagander wrote:
>> On Mon, Aug 25, 2014 at 1:35 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > Hi,
>> >
>> > currently pg_basebackup uses fetch mode when only -x is specified -
>> > which imo isn't a very good thing to use due to the increased risk of
>> > not fetching everything.
>> > How about switching to stream mode for 9.5+?
>>
>> I think the original reasons were to not change the default behaviour
>> with a new feature, and secondly because defaulting to -X requires two
>> replication connections rather than one.
>
> Right.
>
>> I think the first reason is gone now, and the risk/damage of the two
>> connections is probably smaller than running out of WAL.
>
> Especially as that will fail pretty nearly immediately instead of at the
> end of the basebackup...

Yeah, I don't think the problem was actually pg_basebackup failing as
much as pg_basebackup getting in the way of regular replication
standbys. Which I think is also a smaller problem now, given that it's
a more "common thing" to do backups through replication protocol.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Peter Eisentraut
Date:
On 8/26/14 12:40 PM, Magnus Hagander wrote:
> I think the first reason is gone now, and the risk/damage of the two
> connections is probably smaller than running out of WAL. -x is a good
> default for smaller systems, but -X is a safer one for bigger ones. So
> I agree that changing the default mode would make sense.

I would seriously consider just removing one of the modes.  Having two
modes is complex enough, and then having different defaults in different
versions, and fuzzy recommendations like, it's better for "smaller
systems", it's quite confusing.

I don't think it's a fundamental problem to say, you need 2 connections
to use this feature.  (For example, you need a second connection to
issue a cancel request.  Nobody has ever complained about that.)




Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Andres Freund
Date:
On 2014-08-26 16:41:44 -0400, Peter Eisentraut wrote:
> On 8/26/14 12:40 PM, Magnus Hagander wrote:
> > I think the first reason is gone now, and the risk/damage of the two
> > connections is probably smaller than running out of WAL. -x is a good
> > default for smaller systems, but -X is a safer one for bigger ones. So
> > I agree that changing the default mode would make sense.
> 
> I would seriously consider just removing one of the modes.  Having two
> modes is complex enough, and then having different defaults in different
> versions, and fuzzy recommendations like, it's better for "smaller
> systems", it's quite confusing.

Happy with removing the option and just accepting -X for backward
compat.

> I don't think it's a fundamental problem to say, you need 2 connections
> to use this feature.  (For example, you need a second connection to
> issue a cancel request.  Nobody has ever complained about that.)

Well, replication connections are more limited in number than normal
connections... And cancel requests are very short lived.
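
To make the connection budget concrete, here is a small sketch that checks
whether enough WAL sender slots are free before starting a backup. The
function name can_start_backup is hypothetical, and the numbers are passed
in by hand; on a live server one possible way to obtain them would be
psql -Atc "SHOW max_wal_senders" and
psql -Atc "SELECT count(*) FROM pg_stat_replication".

```shell
# can_start_backup MAX_WAL_SENDERS IN_USE NEEDED
# Hypothetical helper: succeeds if NEEDED replication connections still fit.
# NEEDED would be 1 for fetch mode and 2 for stream mode, per this thread.
can_start_backup() {
    max=$1
    in_use=$2
    needed=$3
    free=$((max - in_use))
    if [ "$free" -ge "$needed" ]; then
        echo "ok: $free free, $needed needed"
    else
        echo "refuse: $free free, $needed needed"
        return 1
    fi
}

# Example: 3 slots configured, 2 already taken by standbys.
can_start_backup 3 2 1          # fetch mode fits
can_start_backup 3 2 2 || true  # stream mode does not
```

With a long-running backup the slots stay occupied for its whole duration,
which is the difference from a short-lived cancel request.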

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Magnus Hagander
Date:
On Tue, Aug 26, 2014 at 10:46 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-08-26 16:41:44 -0400, Peter Eisentraut wrote:
>> On 8/26/14 12:40 PM, Magnus Hagander wrote:
>> > I think the first reason is gone now, and the risk/damage of the two
>> > connections is probably smaller than running out of WAL. -x is a good
>> > default for smaller systems, but -X is a safer one for bigger ones. So
>> > I agree that changing the default mode would make sense.
>>
>> I would seriously consider just removing one of the modes.  Having two
>> modes is complex enough, and then having different defaults in different
>> versions, and fuzzy recommendations like, it's better for "smaller
>> systems", it's quite confusing.
>
> Happy with removing the option and just accepting -X for backward
> compat.

Works for me - this is really the cleaner way of doing it...

If we do that, perhaps we should backpatch a deprecation notice into
the 9.4 docs?


>> I don't think it's a fundamental problem to say, you need 2 connections
>> to use this feature.  (For example, you need a second connection to
>> issue a cancel request.  Nobody has ever complained about that.)
>
> Well, replication connections are more limited in number than normal
> connections... And cancel requests are very short lived.

Yeah. But as long as we document it clearly, we should be OK I think.
And it's fairly clearly documented now - just need to be sure not to
remove that when changing the -x stuff.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Fujii Masao
Date:
On Wed, Aug 27, 2014 at 6:16 AM, Magnus Hagander <magnus@hagander.net> wrote:
> On Tue, Aug 26, 2014 at 10:46 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> On 2014-08-26 16:41:44 -0400, Peter Eisentraut wrote:
>>> On 8/26/14 12:40 PM, Magnus Hagander wrote:
>>> > I think the first reason is gone now, and the risk/damage of the two
>>> > connections is probably smaller than running out of WAL. -x is a good
>>> > default for smaller systems, but -X is a safer one for bigger ones. So
>>> > I agree that changing the default mode would make sense.
>>>
>>> I would seriously consider just removing one of the modes.  Having two
>>> modes is complex enough, and then having different defaults in different
>>> versions, and fuzzy recommendations like, it's better for "smaller
>>> systems", it's quite confusing.
>>
>> Happy with removing the option and just accepting -X for backward
>> compat.
>
> Works for me - this is really the cleaner way of doing it...

We cannot use -X stream with tar output format mode. So I'm afraid that
removing -X fetch would make people using tar output format feel disappointed.
Or we should make -X stream work with tar mode.
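
The restriction can be sketched as a pre-flight check in a wrapper script.
This assumes the behaviour described above for the releases under
discussion; check_backup_opts and the /tmp/backup target are hypothetical,
and the helper only prints the command it would run:

```shell
# check_backup_opts FORMAT METHOD
# Hypothetical wrapper check: rejects the one combination that this thread
# says is unsupported, otherwise prints the pg_basebackup command line.
check_backup_opts() {
    format=$1   # plain or tar
    method=$2   # fetch or stream
    if [ "$format" = "tar" ] && [ "$method" = "stream" ]; then
        echo "unsupported: -X stream with tar format" >&2
        return 1
    fi
    echo "pg_basebackup -D /tmp/backup -F $format -X $method"
}

check_backup_opts plain stream
check_backup_opts tar fetch
check_backup_opts tar stream || true   # the case that blocks removing fetch
```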

Regards,

-- 
Fujii Masao



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Magnus Hagander
Date:
On Wed, Aug 27, 2014 at 5:16 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Aug 27, 2014 at 6:16 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> On Tue, Aug 26, 2014 at 10:46 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>>> On 2014-08-26 16:41:44 -0400, Peter Eisentraut wrote:
>>>> On 8/26/14 12:40 PM, Magnus Hagander wrote:
>>>> > I think the first reason is gone now, and the risk/damage of the two
>>>> > connections is probably smaller than running out of WAL. -x is a good
>>>> > default for smaller systems, but -X is a safer one for bigger ones. So
>>>> > I agree that changing the default mode would make sense.
>>>>
>>>> I would seriously consider just removing one of the modes.  Having two
>>>> modes is complex enough, and then having different defaults in different
>>>> versions, and fuzzy recommendations like, it's better for "smaller
>>>> systems", it's quite confusing.
>>>
>>> Happy with removing the option and just accepting -X for backward
>>> compat.
>>
>> Works for me - this is really the cleaner way of doing it...
>
> We cannot use -X stream with tar output format mode. So I'm afraid that
> removing -X fetch would make people using tar output format feel disappointed.
> Or we should make -X stream work with tar mode.

Ah, yes, I've actually had that on my TODO for some time.

I think the easy way of doing that is to just create an xlog.tar file.
Since we already create "base.tar" and possibly n*<tablespace.tar>,
adding one more file shouldn't be a big problem, and would make such
an implementation much easier. Would be trivial to do .tar.gz for it
as well, just like for the others.
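
For illustration, the resulting output directory can be mimicked with
ordinary tar. This is only a stand-in for the proposed layout, not how
pg_basebackup would produce it; the function name, the file names inside
the archives, and the reduced contents of base.tar are all assumptions:

```shell
# mimic_tar_layout: builds a throwaway directory containing base.tar and
# xlog.tar from dummy files, then prints the path of that directory.
mimic_tar_layout() {
    work=$(mktemp -d)
    mkdir -p "$work/src/base" "$work/src/pg_xlog" "$work/out"
    echo "relation data" > "$work/src/base/16384"
    echo "wal data" > "$work/src/pg_xlog/000000010000000000000001"
    # Stand-in for the data files (a real base.tar holds the whole data directory).
    tar -cf "$work/out/base.tar" -C "$work/src" base
    # The additional archive proposed here, holding only the streamed WAL.
    tar -cf "$work/out/xlog.tar" -C "$work/src" pg_xlog
    echo "$work/out"
}

out=$(mimic_tar_layout)
ls "$out"
```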

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Robert Haas
Date:
On Wed, Aug 27, 2014 at 2:55 AM, Magnus Hagander <magnus@hagander.net> wrote:
> On Wed, Aug 27, 2014 at 5:16 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Wed, Aug 27, 2014 at 6:16 AM, Magnus Hagander <magnus@hagander.net> wrote:
>>> On Tue, Aug 26, 2014 at 10:46 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>>>> On 2014-08-26 16:41:44 -0400, Peter Eisentraut wrote:
>>>>> On 8/26/14 12:40 PM, Magnus Hagander wrote:
>>>>> > I think the first reason is gone now, and the risk/damage of the two
>>>>> > connections is probably smaller than running out of WAL. -x is a good
>>>>> > default for smaller systems, but -X is a safer one for bigger ones. So
>>>>> > I agree that changing the default mode would make sense.
>>>>>
>>>>> I would seriously consider just removing one of the modes.  Having two
>>>>> modes is complex enough, and then having different defaults in different
>>>>> versions, and fuzzy recommendations like, it's better for "smaller
>>>>> systems", it's quite confusing.
>>>>
>>>> Happy with removing the option and just accepting -X for backward
>>>> compat.
>>>
>>> Works for me - this is really the cleaner way of doing it...
>>
>> We cannot use -X stream with tar output format mode. So I'm afraid that
>> removing -X fetch would make people using tar output format feel disappointed.
>> Or we should make -X stream work with tar mode.
>
> Ah, yes, I've actually had that on my TODO for some time.
>
> I think the easy way of doing that is to just create an xlog.tar file.
> Since we already create "base.tar" and possibly n*<tablespace.tar>,
> adding one more file shouldn't be a big problem, and would make such
> an implementation much easier. Would be trivial to do .tar.gz for it
> as well, just like for the others.

Still, that seems like a pretty good reason not to rip the old mode
out completely.  Actually, I'd favor that anyway: changing the default
in one release and removing the deprecated feature in a later release
usually provides a smoother upgrade path.  But if there are features
that aren't even present in the newer mode yet, that's an even better
reason.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Peter Eisentraut
Date:
On 8/27/14 2:55 AM, Magnus Hagander wrote:
> I think the easy way of doing that is to just create an xlog.tar file.
> Since we already create "base.tar" and possibly n*<tablespace.tar>,
> adding one more file shouldn't be a big problem, and would make such
> an implementation much easier. Would be trivial to do .tar.gz for it
> as well, just like for the others.

That might be a way forward, but for someone who doesn't use tablespaces
and just wants an all-in-one backup, this change would make that more
cumbersome, because now you'd always have more than one file to deal with.

It might be worth considering a mode that combines all those tar files
into a super-tar.  I'm personally not a user of the tar mode, so I don't
know what a typical use would be, though.





Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Magnus Hagander
Date:
On Fri, Aug 29, 2014 at 10:34 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On 8/27/14 2:55 AM, Magnus Hagander wrote:
>> I think the easy way of doing that is to just create an xlog.tar file.
>> Since we already create "base.tar" and possibly n*<tablespace.tar>,
>> adding one more file shouldn't be a big problem, and would make such
>> an implementation much easier. Would be trivial to do .tar.gz for it
>> as well, just like for the others.
>
> That might be a way forward, but for someone who doesn't use tablespaces
> and just wants an all-in-one backup, this change would make that more
> cumbersome, because now you'd always have more than one file to deal with.

It would in stream mode, which doesn't work at all with tar format today.

I do agree with Robert's suggestion that we shouldn't remove fetch mode
right away - but we should change the default.


> It might be worth considering a mode that combines all those tar files
> into a super-tar.  I'm personally not a user of the tar mode, so I don't
> know what a typical use would be, though.

That would probably be useful, though a lot more difficult when you
consider two separate processes writing into the same tarfile. But I
agree that the format for "single tablespace just gimme a bloody
tarfile" is quite inconvenient today, in that you need a directory and
we drop a "base.tar" in there. We should perhaps try to find a more
convenient way for that specific usecase, since it probably represents
the majority of users.
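
One way to sidestep the two-writers problem is to let each process finish
its own tar file and wrap them afterwards. The sketch below does that with
plain tar; a tar-of-tars is only one possible reading of "super-tar", and
wrap_tars plus the file names are hypothetical:

```shell
# wrap_tars SRC_DIR OUT_FILE
# Hypothetical helper: packs every *.tar in SRC_DIR into OUT_FILE.
# OUT_FILE must be an absolute path outside SRC_DIR.
wrap_tars() {
    src=$1
    out=$2
    # Run only after all writers are done, so no file is still growing.
    ( cd "$src" && tar -cf "$out" *.tar )
}

d=$(mktemp -d)
mkdir "$d/parts"
: > "$d/parts/base.tar"   # empty stand-ins for the finished archives
: > "$d/parts/xlog.tar"
wrap_tars "$d/parts" "$d/backup.tar"
tar -tf "$d/backup.tar"
```

This gives a single file to move around, at the cost of an extra copy of
the data once the backup itself has finished.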


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Bruce Momjian
Date:
On Fri, Aug 29, 2014 at 10:37:15PM +0200, Magnus Hagander wrote:
> On Fri, Aug 29, 2014 at 10:34 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> > On 8/27/14 2:55 AM, Magnus Hagander wrote:
> >> I think the easy way of doing that is to just create an xlog.tar file.
> >> Since we already create "base.tar" and possibly n*<tablespace.tar>,
> >> adding one more file shouldn't be a big problem, and would make such
> >> an implementation much easier. Would be trivial to do .tar.gz for it
> >> as well, just like for the others.
> >
> > That might be a way forward, but for someone who doesn't use tablespaces
> > and just wants an all-in-one backup, this change would make that more
> > cumbersome, because now you'd always have more than one file to deal with.
> 
> It would in stream mode, which doesn't work at all with tar format today.
> 
> I do agree with Robert's suggestion that we shouldn't remove fetch mode
> right away - but we should change the default.
> 
> 
> > It might be worth considering a mode that combines all those tar files
> > into a super-tar.  I'm personally not a user of the tar mode, so I don't
> > know what a typical use would be, though.
> 
> That would probably be useful, though a lot more difficult when you
> consider two separate processes writing into the same tarfile. But I
> agree that the format for "single tablespace just gimme a bloody
> tarfile" is quite inconvenient today, in that you need a directory and
> we drop a "base.tar" in there. We should perhaps try to find a more
> convenient way for that specific usecase, since it probably represents
> the majority of users.

Where are we on this?

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +



Re: Switch pg_basebackup to use -X stream instead of -X fetch by default?

From
Peter Eisentraut
Date:
On 10/13/14 3:16 PM, Bruce Momjian wrote:
> Where are we on this?

I think we discovered that the pros and cons of the different options
are not as clear-cut, and it's better to leave the default as is until
additional features make one option a clear winner.