Thread: WAL restore is very slow

WAL restore is very slow

From
Madhu Sudan
Date:
Hi

We are running PG-14 with a 14 TB data set on an r5b.2xlarge. We have set up WAL archiving and restore the archived WAL files on a replica server. WAL restore on the replica is very slow, and we cannot maintain the intended 4-hour delayed replica: with the heavy WAL generation, it is always 30 hours behind.

I have checked the following and they look fine
1. Bottlenecks on the replica server
2. Memory consumption and swap
3. EFS IO throughput
4. checkpoint_completion_target = 0.9
5. wal_buffers = 16MB
6. wal_log_hints = on
7. Verified logs and didn't find anything useful related to the issue

Can you please suggest how to improve the WAL restore performance?

Thank you 
Madhu Sudan



Re: WAL restore is very slow

From
John Scalia
Date:
First off, why are you still running on a rather small r5 instance? AWS has had r6 instances available for some time,
and they’re faster and more efficient. Of the 30 or so instances I take care of, I haven’t had any r5s in quite
some time.

And it sounds here like you’re running on an EC2 instance as well. Is there some reason you haven’t gone to RDS?
Cluster configuration in RDS uses an internal, RDS-specific replication method that does not involve WAL files.
—
John


> On Aug 29, 2022, at 6:10 AM, Madhu Sudan <madhusudan0429@gmail.com> wrote:
>
> 
> Hi
>
> We have PG-14 with a huge data set of 14 TB running on r5b.2xlarge. We have set up WAL archiving and restoring them
> onto a replica server. The WAL restore on the replica is very slow and we are not able to achieve the 4 hour delayed
> replica. It is always behind 30 hrs with the huge WAL generation.
>
> I have checked the following and they look fine
> 1. Bottlenecks on the replica server
> 2. Memory consumption and swap
> 3. EFS IO throughput
> 4. checkpoint_completion_target = 0.9
> 5. wal_buffers = 16MB
> 6. wal_log_hints = on
> 7. Verified logs and didn't find anything useful related to the issue
>
> Can you please suggest how to improve the WAL restore performance
>
> Thank you
> Madhu Sudan
>
>
>



Re: WAL restore is very slow

From
Jeff Janes
Date:
On Mon, Aug 29, 2022 at 6:10 AM Madhu Sudan <madhusudan0429@gmail.com> wrote:
> Hi
>
> We have PG-14 with a huge data set of 14 TB running on r5b.2xlarge. We have set up WAL archiving and restoring them onto a replica server. The WAL restore on the replica is very slow and we are not able to achieve the 4 hour delayed replica. It is always behind 30 hrs with the huge WAL generation.
>
> I have checked the following and they look fine
> 1. Bottlenecks on the replica server

How did you check for bottlenecks, and what did you see to conclude it looked fine?  Clearly there is a bottleneck somewhere.
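
One way to make that concrete (an illustrative sketch, not a measurement from this system): a lag that stays fixed at 30 hours suggests the replica applies WAL at roughly the rate the primary generates it, so it keeps pace but never catches up. The hypothetical Python helper below estimates how long it would take to get from the current lag down to the 4-hour target. The function name and the `replay_speed` parameter (hours of primary WAL applied per wall-clock hour) are assumptions made for illustration; the thread gives no measured replay rate.

```python
from typing import Optional


def hours_to_catch_up(current_lag_h: float,
                      target_lag_h: float,
                      replay_speed: float) -> Optional[float]:
    """Estimate wall-clock hours until the replica lag drops to the target.

    replay_speed is a hypothetical measure: hours' worth of primary WAL
    that the replica applies per wall-clock hour. The primary keeps
    generating WAL meanwhile, so the lag shrinks by (replay_speed - 1)
    hours per hour. Returns None if the lag can never shrink.
    """
    if current_lag_h <= target_lag_h:
        return 0.0  # already at or below the target delay
    if replay_speed <= 1.0:
        return None  # replica only keeps pace (or falls further behind)
    return (current_lag_h - target_lag_h) / (replay_speed - 1.0)


# Figures from the thread: 30 hours behind, 4-hour target.
# The replay speeds are assumed values, not measurements.
print(hours_to_catch_up(30, 4, 1.0))  # None: lag stays at 30 hours
print(hours_to_catch_up(30, 4, 2.0))  # 26.0 wall-clock hours
```

On a running standby, the current lag can be read with `SELECT now() - pg_last_xact_replay_timestamp();`, and sampling it over time would give a rough measured value for `replay_speed`.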

Cheers,

Jeff