Thread: WAL "low watermark" during base backup

WAL "low watermark" during base backup

From
Magnus Hagander
Date:
Attached patch implements a "low watermark wal location" in the
walsender shmem array. Setting this value in a walsender prevents
transaction log removal prior to this point - similar to how
wal_keep_segments works, except with an absolute number rather than
relative. For now, this is set when running a base backup with WAL
included - to prevent the required WAL from being recycled away while the
backup is running, without having to guesstimate the value for
wal_keep_segments. (There could be other ways added to set it in the
future, but that's the only one I've done for now)
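
To make the mechanism concrete, here is a minimal sketch in C of how a per-sender low watermark could bound WAL removal alongside a relative wal_keep_segments-style setting. It is not taken from the attached patch: the names SenderSlot, WalPos, oldest_low_watermark and removal_cutoff_segment are hypothetical, and the 16 MB segment size is assumed to be the default.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the structures from the patch. */
typedef uint64_t WalPos;                 /* absolute WAL byte position */
#define INVALID_WAL_POS ((WalPos) 0)     /* "no watermark set" */
#define WAL_SEGMENT_SIZE ((WalPos) 16 * 1024 * 1024)  /* assumed default */

typedef struct SenderSlot
{
    bool   in_use;          /* slot is occupied by an active sender */
    WalPos low_watermark;   /* INVALID_WAL_POS when not set */
} SenderSlot;

/* Oldest position any active sender still needs, or INVALID_WAL_POS. */
static WalPos
oldest_low_watermark(const SenderSlot *slots, size_t n)
{
    WalPos oldest = INVALID_WAL_POS;

    for (size_t i = 0; i < n; i++)
    {
        if (!slots[i].in_use || slots[i].low_watermark == INVALID_WAL_POS)
            continue;
        if (oldest == INVALID_WAL_POS || slots[i].low_watermark < oldest)
            oldest = slots[i].low_watermark;
    }
    return oldest;
}

/*
 * First segment number that must be kept; segments below it may be removed.
 * The relative rule keeps keep_segments segments behind the current position,
 * the absolute rule additionally keeps everything from the oldest watermark.
 */
static uint64_t
removal_cutoff_segment(WalPos current, uint64_t keep_segments,
                       const SenderSlot *slots, size_t n)
{
    uint64_t cur_seg = current / WAL_SEGMENT_SIZE;
    uint64_t cutoff = cur_seg > keep_segments ? cur_seg - keep_segments : 0;
    WalPos   wm = oldest_low_watermark(slots, n);

    if (wm != INVALID_WAL_POS && wm / WAL_SEGMENT_SIZE < cutoff)
        cutoff = wm / WAL_SEGMENT_SIZE;
    return cutoff;
}
```

In this sketch, clearing a slot's low_watermark back to INVALID_WAL_POS when the backup ends lets the cutoff fall back to the relative rule.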

It obviously needs some documentation updates as well, but I wanted to
get some comments on the way it's done before I work on those.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: WAL "low watermark" during base backup

From
Jaime Casanova
Date:
On Fri, Sep 2, 2011 at 12:52 PM, Magnus Hagander <magnus@hagander.net> wrote:
> Attached patch implements a "low watermark wal location" in the
> walsender shmem array. Setting this value in a walsender prevents
> transaction log removal prior to this point - similar to how
> wal_keep_segments works, except with an absolute number rather than
> relative.

Cool! Just a question: shouldn't we clear the value after the base
backup has finished?

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


Re: WAL "low watermark" during base backup

From
Magnus Hagander
Date:
On Fri, Sep 2, 2011 at 20:12, Jaime Casanova <jaime@2ndquadrant.com> wrote:
> On Fri, Sep 2, 2011 at 12:52 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> Attached patch implements a "low watermark wal location" in the
>> walsender shmem array. Setting this value in a walsender prevents
>> transaction log removal prior to this point - similar to how
>> wal_keep_segments works, except with an absolute number rather than
>> relative.
>
> Cool! Just a question: shouldn't we clear the value after the base
> backup has finished?

We should. Thanks, will fix!

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: WAL "low watermark" during base backup

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Attached patch implements a "low watermark wal location" in the
> walsender shmem array. Setting this value in a walsender prevents
> transaction log removal prior to this point - similar to how
> wal_keep_segments works, except with an absolute number rather than
> relative. For now, this is set when running a base backup with WAL
> included - to prevent the required WAL from being recycled away while the
> backup is running, without having to guesstimate the value for
> wal_keep_segments. (There could be other ways added to set it in the
> future, but that's the only one I've done for now)

I agree with that parenthetical remark, i.e. that we'll probably consider
other uses for this in future, so I'd suggest changing this one comment:

> +  * Also check if there any in-progress base backup that has set
> +  * a low watermark preventing us from removing it.

Just say "if any WAL sender has a low watermark that prevents us from
removing it".

Looks reasonably sane otherwise, modulo Jaime's comment about the
missing reset step.
        regards, tom lane


Re: WAL "low watermark" during base backup

From
Dimitri Fontaine
Date:
Magnus Hagander <magnus@hagander.net> writes:

> Attached patch implements a "low watermark wal location" in the
> walsender shmem array. Setting this value in a walsender prevents
> transaction log removal prior to this point - similar to how
> wal_keep_segments works, except with an absolute number rather than

Cool.  The first use case that comes to my mind is when to clean old WAL
files when using multiple standby servers.  Will it help here?

> relative. For now, this is set when running a base backup with WAL
> included - to prevent the required WAL from being recycled away while the
> backup is running, without having to guesstimate the value for
> wal_keep_segments.

I would have guessed that if you stream WAL in parallel with the backup,
and begin streaming before you call pg_start_backup(), you don't need
anything more.  Is that wrong?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: WAL "low watermark" during base backup

From
Simon Riggs
Date:
On Fri, Sep 2, 2011 at 6:52 PM, Magnus Hagander <magnus@hagander.net> wrote:

> Attached patch implements a "low watermark wal location" in the
> walsender shmem array. Setting this value in a walsender prevents
> transaction log removal prior to this point - similar to how
> wal_keep_segments works, except with an absolute number rather than
> relative. For now, this is set when running a base backup with WAL
> included - to prevent the required WAL from being recycled away while the
> backup is running, without having to guesstimate the value for
> wal_keep_segments. (There could be other ways added to set it in the
> future, but that's the only one I've done for now)
>
> It obviously needs some documentation updates as well, but I wanted to
> get some comments on the way it's done before I work on those.

I'm not yet fully available for a discussion on this, but not sure I like this.

You don't have to guess the setting of wal_keep_segments, you
calculate it exactly from the size of your WAL disk. No other
calculation is easy or accurate.
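
As a rough illustration of the calculation described here, the sketch below derives a wal_keep_segments value from a disk budget. It assumes the default 16 MB WAL segment size; the function name keep_segments_for_budget and the headroom_bytes parameter (space set aside for everything else that lives on the same disk) are hypothetical.

```c
#include <stdint.h>

/* Assumed default WAL segment size of 16 MB. */
#define SEGMENT_BYTES (UINT64_C(16) * 1024 * 1024)

/*
 * Number of whole segments that fit into the WAL disk once some headroom
 * has been reserved. Returns 0 when the headroom uses up the whole disk.
 */
static uint64_t
keep_segments_for_budget(uint64_t wal_disk_bytes, uint64_t headroom_bytes)
{
    if (wal_disk_bytes <= headroom_bytes)
        return 0;
    return (wal_disk_bytes - headroom_bytes) / SEGMENT_BYTES;
}
```

For example, a 10 GiB WAL disk with 2 GiB of headroom leaves room for 512 segments under these assumptions.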

This patch implements "fill disk until primary croaks" behaviour which
means you are making a wild and risky guess as to whether it will
work. If it does not, you are hosed.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: WAL "low watermark" during base backup

From
Magnus Hagander
Date:
On Sun, Sep 4, 2011 at 19:02, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Fri, Sep 2, 2011 at 6:52 PM, Magnus Hagander <magnus@hagander.net> wrote:
>
>> Attached patch implements a "low watermark wal location" in the
>> walsender shmem array. Setting this value in a walsender prevents
>> transaction log removal prior to this point - similar to how
>> wal_keep_segments works, except with an absolute number rather than
>> relative. For now, this is set when running a base backup with WAL
>> included - to prevent the required WAL from being recycled away while the
>> backup is running, without having to guesstimate the value for
>> wal_keep_segments. (There could be other ways added to set it in the
>> future, but that's the only one I've done for now)
>>
>> It obviously needs some documentation updates as well, but I wanted to
>> get some comments on the way it's done before I work on those.
>
> I'm not yet fully available for a discussion on this, but not sure I like this.
>
> You don't have to guess the setting of wal_keep_segments, you
> calculate it exactly from the size of your WAL disk. No other
> calculation is easy or accurate.

Uh, no. What about the (very large number of) cases where pg is just
sitting on one partition, possibly shared with a whole lot of other
services? You'd need to set it to all-of-your-disk, which is something
that will change over time.

Maybe I wasn't entirely clear in the submission, but if it wasn't
obvious: the use-case for this is the small and simple installations
that need a simple way of doing a reliable online backup. This is the
"pg_basebackup -x" use case altogether - for example, anybody "bigger"
likely has archive logging set up already, in which case this
functionality is not interesting at all.

> This patch implements "fill disk until primary croaks" behaviour which
> means you are making a wild and risky guess as to whether it will
> work. If it does not, you are hosed.

Replace "primary" with "server" - remember that this is about backups
and not replication primarily.

That said, you are correct, it does implement that. But then again,
logging into the database and opening a transaction and just leaving
it around for $forever will have similar problems - yet, we allow
users to do that.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: WAL "low watermark" during base backup

From
Simon Riggs
Date:
On Mon, Sep 5, 2011 at 11:38 AM, Magnus Hagander <magnus@hagander.net> wrote:
> On Sun, Sep 4, 2011 at 19:02, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Fri, Sep 2, 2011 at 6:52 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>
>>> Attached patch implements a "low watermark wal location" in the
>>> walsender shmem array. Setting this value in a walsender prevents
>>> transaction log removal prior to this point - similar to how
>>> wal_keep_segments works, except with an absolute number rather than
>>> relative. For now, this is set when running a base backup with WAL
>>> included - to prevent the required WAL from being recycled away while the
>>> backup is running, without having to guesstimate the value for
>>> wal_keep_segments. (There could be other ways added to set it in the
>>> future, but that's the only one I've done for now)
>>>
>>> It obviously needs some documentation updates as well, but I wanted to
>>> get some comments on the way it's done before I work on those.
>>
>> I'm not yet fully available for a discussion on this, but not sure I like this.
>>
>> You don't have to guess the setting of wal_keep_segments, you
>> calculate it exactly from the size of your WAL disk. No other
>> calculation is easy or accurate.
>
> Uh, no. What about the (very large number of) cases where pg is just
> sitting on one partition, possibly shared with a whole lot of other
> services? You'd need to set it to all-of-your-disk, which is something
> that will change over time.
>
> Maybe I wasn't entirely clear in the submission, but if it wasn't
> obvious: the use-case for this is the small and simple installations
> that need a simple way of doing a reliable online backup. This is the
> "pg_basebackup -x" use case altogether - for example, anybody "bigger"
> likely has archive logging set up already, in which case this
> functionality is not interesting at all.

I understand the need for a reliable backup; the problem is they won't get
one like this.

If your disk fills, the backup cannot end correctly, so you must
somehow avoid the disk filling while the backup is taken.

Removing the safety that prevents the disk from filling doesn't
actually prevent it filling.

If you must have this then make pg_basebackup copy xlog files
regularly during the backup. That way your backup can take forever and
your primary disk won't fill up. In many cases it actually will take
forever, but at least we don't take down the primary.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: WAL "low watermark" during base backup

From
Magnus Hagander
Date:
On Tue, Sep 6, 2011 at 22:35, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, Sep 5, 2011 at 11:38 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> On Sun, Sep 4, 2011 at 19:02, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> On Fri, Sep 2, 2011 at 6:52 PM, Magnus Hagander <magnus@hagander.net> wrote:
>>>
>>>> Attached patch implements a "low watermark wal location" in the
>>>> walsender shmem array. Setting this value in a walsender prevents
>>>> transaction log removal prior to this point - similar to how
>>>> wal_keep_segments works, except with an absolute number rather than
>>>> relative. For now, this is set when running a base backup with WAL
>>>> included - to prevent the required WAL from being recycled away while the
>>>> backup is running, without having to guesstimate the value for
>>>> wal_keep_segments. (There could be other ways added to set it in the
>>>> future, but that's the only one I've done for now)
>>>>
>>>> It obviously needs some documentation updates as well, but I wanted to
>>>> get some comments on the way it's done before I work on those.
>>>
>>> I'm not yet fully available for a discussion on this, but not sure I like this.
>>>
>>> You don't have to guess the setting of wal_keep_segments, you
>>> calculate it exactly from the size of your WAL disk. No other
>>> calculation is easy or accurate.
>>
>> Uh, no. What about the (very large number of) cases where pg is just
>> sitting on one partition, possibly shared with a whole lot of other
>> services? You'd need to set it to all-of-your-disk, which is something
>> that will change over time.
>>
>> Maybe I wasn't entirely clear in the submission, but if it wasn't
>> obvious: the use-case for this is the small and simple installations
>> that need a simple way of doing a reliable online backup. This is the
>> "pg_basebackup -x" use case altogether - for example, anybody "bigger"
>> likely has archive logging set up already, in which case this
>> functionality is not interesting at all.
>
> I understand the need for a reliable backup; the problem is they won't get
> one like this.
>
> If your disk fills, the backup cannot end correctly, so you must
> somehow avoid the disk filling while the backup is taken.

The same thing will happen if your archive_command stops working - the
disk fills up. There are plenty of scenarios whereby the disk can fill
up.

There are a lot of cases where this really isn't a risk, and I believe
this would be a helpful feature in many of those *simple* cases.


> Removing the safety that prevents the disk from filling doesn't
> actually prevent it filling.
>
> If you must have this then make pg_basebackup copy xlog files
> regularly during the backup. That way your backup can take forever and
> your primary disk won't fill up. In many cases it actually will take
> forever, but at least we don't take down the primary.

There is a patch to do something like that as well sitting on the CF
page. I don't believe one necessarily excludes the other.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: WAL "low watermark" during base backup

From
Dimitri Fontaine
Date:
Magnus Hagander <magnus@hagander.net> writes:
>> If you must have this then make pg_basebackup copy xlog files
>> regularly during the backup. That way your backup can take forever and
>> your primary disk won't fill up. In many cases it actually will take
>> forever, but at least we don't take down the primary.
>
> There is a patch to do something like that as well sitting on the CF
> page. I don't believe one necessarily excludes the other.

I'm not getting why we need the later one when we have this older one?

-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: WAL "low watermark" during base backup

From
Magnus Hagander
Date:
On Fri, Sep 9, 2011 at 13:40, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>>> If you must have this then make pg_basebackup copy xlog files
>>> regularly during the backup. That way your backup can take forever and
>>> your primary disk won't fill up. In many cases it actually will take
>>> forever, but at least we don't take down the primary.
>>
>> There is a patch to do something like that as well sitting on the CF
>> page. I don't believe one necessarily excludes the other.
>
> I'm not getting why we need the later one when we have this older one?

One of them is for the simple case. It requires a single connection to
the server, and it supports things like writing to tarfiles and
compression.

The other one is more complex. It uses multiple connections (one for
the base, one for the xlog), and as such doesn't support writing to
files, only directories.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: WAL "low watermark" during base backup

From
Florian Pflug
Date:
On Sep 9, 2011, at 13:48, Magnus Hagander wrote:
> On Fri, Sep 9, 2011 at 13:40, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
>> Magnus Hagander <magnus@hagander.net> writes:
>>>> If you must have this then make pg_basebackup copy xlog files
>>>> regularly during the backup. That way your backup can take forever and
>>>> your primary disk won't fill up. In many cases it actually will take
>>>> forever, but at least we don't take down the primary.
>>>
>>> There is a patch to do something like that as well sitting on the CF
>>> page. I don't believe one necessarily excludes the other.
>>
>> I'm not getting why we need the later one when we have this older one?
>
> One of them is for the simple case. It requires a single connection to
> the server, and it supports things like writing to tarfiles and
> compression.
>
> The other one is more complex. It uses multiple connections (one for
> the base, one for the xlog), and as such doesn't support writing to
> files, only directories.

I guess the real question is: why can't we stream the WAL as it is
generated, instead of at the end, even over a single connection and when
writing tarfiles?

Couldn't we send all available WAL after each single data-file instead
of waiting for all data files to be transferred before sending WAL?
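
A simplified model of why that would matter for WAL retention, sketched in C. The input array wal_per_file (segments of WAL generated while each data file is being transferred) and both function names are hypothetical, and the model assumes the server can recycle WAL as soon as it has been sent.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * WAL is sent only after all data files: everything generated during the
 * backup must be retained, so the peak is the sum over all files.
 */
static uint64_t
peak_retained_at_end(const uint64_t *wal_per_file, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += wal_per_file[i];
    return total;
}

/*
 * WAL is sent after each data file: only the WAL generated during a single
 * file transfer is retained at once, so the peak is the largest entry.
 */
static uint64_t
peak_retained_interleaved(const uint64_t *wal_per_file, size_t n)
{
    uint64_t peak = 0;

    for (size_t i = 0; i < n; i++)
        if (wal_per_file[i] > peak)
            peak = wal_per_file[i];
    return peak;
}
```

Under this model the interleaved peak never exceeds the at-end peak, and the gap grows with the number of data files.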

best regards,
Florian Pflug



Re: WAL "low watermark" during base backup

From
Dimitri Fontaine
Date:
Florian Pflug <fgp@phlo.org> writes:
> Couldn't we send all available WAL after each single data-file instead
> of waiting for all data files to be transferred before sending WAL?

+1 (or maybe not at the file boundary but rather driven by archive
command with some internal hooking, as the backend needs some new
provisions here anyway).

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: WAL "low watermark" during base backup

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Fri, Sep 9, 2011 at 13:40, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
>> I'm not getting why we need the later one when we have this older one?

> One of them is for the simple case. It requires a single connection to
> the server, and it supports things like writing to tarfiles and
> compression.

> The other one is more complex. It uses multiple connections (one for
> the base, one for the xlog), and as such doesn't support writing to
> files, only directories.

I'm with Dimitri on this one: let's not invent two different ways to do
the same thing.  Let's pick the better one, or meld them somehow, so
we only have one implementation to support going forward.
        regards, tom lane