Re: pgBackRest for a 50 TB database - Mailing list pgsql-general

From Abhishek Bhola
Subject Re: pgBackRest for a 50 TB database
Msg-id CAEDsCzgt13nxgCBZihBcJr2NJzfULOMqGET-uFPuz76-FeM8=g@mail.gmail.com
In response to Re: pgBackRest for a 50 TB database  (Stephen Frost <sfrost@snowman.net>)
List pgsql-general
Hello Stephen

Just an update on this. After we deployed it on our PROD system, the results were far better than in testing.
A full backup now takes only around 4-5 hours, and that has been the case for the last 3 months or so.
full backup: 20231209-150002F
            timestamp start/stop: 2023-12-09 15:00:02+09 / 2023-12-09 19:33:56+09
            wal start/stop: 000000010001DCC30000008E / 000000010001DCC3000000A6
            database size: 32834.8GB, database backup size: 32834.8GB
            repo1: backup size: 5096.4GB

Now a question. I restored this big DB and it all works fine. However, is there a way to disable a subscription in Postgres while restoring the data with pgBackRest?
For example, the DB I have been backing up has an active subscription.
When I restore the DB for test purposes, I don't want the subscription to be there. Is there any option to ignore the subscription?

Thanks

On Thu, Oct 5, 2023 at 10:19 PM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

On Thu, Oct 5, 2023 at 03:10 Abhishek Bhola <abhishek.bhola@japannext.co.jp> wrote:
Here is the update with compress-type=zst in the config file.
process-max is still 30, but the backup took longer than before, around 26 hours 50 minutes.

full backup: 20231004-130621F
            timestamp start/stop: 2023-10-04 13:06:21+09 / 2023-10-05 15:56:03+09
            wal start/stop: 000000010001AC0E00000054 / 000000010001AC0E00000054
            database size: 38249.0GB, database backup size: 38249.0GB
            repo1: backup size: 5799.8GB

Do you think I could be missing something?

Sounds like something else becomes the bottleneck once you have process-max at 30. I suspect you could reduce process-max and still get about the same time with zstd. Ultimately, if you want it to be faster, you'll need to figure out what the bottleneck is (seemingly not CPU, unlikely to be memory, so that leaves network or storage) and address that.

We’ve seen numbers approaching 10TB/hr with lots of processes and zstd and fast storage on high end physical hardware. 

Thanks,

Stephen
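
For readers comparing the two backup reports in this thread with the 10 TB/hr figure Stephen mentions, here is a minimal Python sketch that derives elapsed time, throughput and compression ratio from the reported numbers. The function name backup_stats is hypothetical, the +09 offset is dropped because both timestamps in each report share it, and treating 1 TB as 1000 GB is an assumption about how the sizes are reported.

```python
from datetime import datetime

def backup_stats(start, stop, db_size_gb, repo_size_gb):
    """Derive rough metrics from one backup report (hypothetical helper).

    start/stop are the report timestamps without the UTC offset.
    Assumes 1 TB = 1000 GB, which may not match the tool's units exactly.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    elapsed = datetime.strptime(stop, fmt) - datetime.strptime(start, fmt)
    hours = elapsed.total_seconds() / 3600
    return {
        "hours": hours,
        # Source data read per hour of wall-clock time
        "tb_per_hour": db_size_gb / 1000 / hours,
        # How much smaller the repository copy is than the database
        "compression_ratio": db_size_gb / repo_size_gb,
    }

# Figures copied from the two reports quoted in this thread
october = backup_stats("2023-10-04 13:06:21", "2023-10-05 15:56:03", 38249.0, 5799.8)
december = backup_stats("2023-12-09 15:00:02", "2023-12-09 19:33:56", 32834.8, 5096.4)

for label, stats in (("20231004-130621F", october), ("20231209-150002F", december)):
    print(f"{label}: {stats['hours']:.1f} h, "
          f"{stats['tb_per_hour']:.2f} TB/hr, "
          f"ratio {stats['compression_ratio']:.1f}:1")
```

Under those assumptions the October test run works out to roughly 1.4 TB/hr and the December PROD run to roughly 7.2 TB/hr, with a compression ratio of about 6.5:1 in both cases.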
