Re: block-level incremental backup - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: block-level incremental backup
Msg-id C3B78817-C247-44DB-AC56-ACDEF5F800BD@yandex-team.ru
In response to Re: block-level incremental backup  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: block-level incremental backup
List pgsql-hackers
Hi!

Sorry for the delay.

> On 18 Apr 2019, at 21:56, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Apr 17, 2019 at 5:20 PM Stephen Frost <sfrost@snowman.net> wrote:
>> As I understand it, the problem is not with backing up an individual
>> database or cluster, but rather dealing with backing up thousands of
>> individual clusters with thousands of tables in each, leading to an
>> awful lot of tables with lots of FSMs/VMs, all of which end up having to
>> get copied and stored wholesale.  I'll point this thread out to him and
>> hopefully he'll have a chance to share more specific information.
>
> Sounds good.

While introducing WAL-delta backups, we ran into two things:
1. A heavy spike in network load. We shift the start of each backup randomly, but the variation is not very large: the night is short and we want to take big backups during the low-RPS period. With so little variation in start times, the small backups start close together and create a big network spike.
2. Incremental backups became very cheap when measured in the resources used by a single cluster.

The first is not a big problem, actually, but we realized that we can take incremental backups not just at night but, for example, 4 times a day. Or every hour. Or every minute. Why not, if they are cheap enough?

An incremental backup of a 1 TB DB taken a few minutes after the previous one (small change set) is a few GB. All of that size is made up of FSM (no LSN) and VM (hard to use LSN).
Sure, this overhead is fine if we take a daily backup. But at some backup frequency it becomes too much.

I think the problem of incremental backup of FSM and VM is too distant for now.
But if I had to implement it right now, I'd choose the following approach: do not back up FSM and VM, and recreate them during restore. It looks possible, but too AM-specific.
It is hard when you write a backup tool in Go and cannot simply link with PG.
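
The backup-side half of that idea (leaving FSM and VM out) can be sketched in Go by file name alone, since these forks are stored as files with the `_fsm` and `_vm` suffixes next to the main relation file. The helper names below are hypothetical, temporary relations are not handled, and the harder part, recreating the maps at restore time, is not shown.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isDigits reports whether s is a non-empty string of ASCII digits.
func isDigits(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// isFSMOrVM reports whether a path names a free space map or visibility
// map fork, for example "base/16384/16385_fsm" or "base/16384/16385_vm.1".
func isFSMOrVM(path string) bool {
	name := filepath.Base(path)
	// Drop a segment suffix such as ".1" if present.
	if i := strings.IndexByte(name, '.'); i >= 0 {
		name = name[:i]
	}
	for _, suffix := range []string{"_fsm", "_vm"} {
		if strings.HasSuffix(name, suffix) && isDigits(strings.TrimSuffix(name, suffix)) {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{
		"base/16384/16385",
		"base/16384/16385_fsm",
		"base/16384/16385_vm",
	} {
		fmt.Printf("%-24s skip=%v\n", p, isFSMOrVM(p))
	}
}
```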

> On 15 Apr 2019, at 18:01, Stephen Frost <sfrost@snowman.net> wrote:
> ...the goal here
> isn't actually to make pg_basebackup into an enterprise backup tool,
> ...

BTW, I'm all for extensibility and "hackability". But, personally, I'd be happy if pg_basebackup were ubiquitous and sufficient, and tools like WAL-G and others became part of history. There is no fundamental reason why an external backup tool can be better than a backup tool in core. (Unlike many PLs, data types, hooks, tuners etc.)


This thread has 53 mentions of "parallel backup". I want to note that there may be parallel read from disk and parallel network transmission. Things between these two are negligible and can be single-threaded. From my POV, it's not about threads, it's about saturated IO controllers.
Also, I think parallel restore matters more than parallel backup. Backups themselves can be slow; on many clusters we even throttle disk IO. But users may want parallel backup to catch up a standby.

Thanks.

Best regards, Andrey Borodin.

