Re: checkpoints taking much longer than expected - Mailing list pgsql-general

From Tiemen Ruiten
Subject Re: checkpoints taking much longer than expected
Date
Msg-id CAEkBuzeJjzZiSyr0HOcC3Qa+ttFxbVO1YDfL-v45cBVQhcgB+g@mail.gmail.com
In response to Re: checkpoints taking much longer than expected  (Stephen Frost <sfrost@snowman.net>)
Responses Re: checkpoints taking much longer than expected
List pgsql-general


On Sun, Jun 16, 2019 at 7:30 PM Stephen Frost <sfrost@snowman.net> wrote:
Ok, so you want fewer checkpoints because you expect to failover to a
replica rather than recover the primary on a failure.  If you're doing
synchronous replication, then that certainly makes sense.  If you
aren't, then you're deciding that you're alright with losing some number
of writes by failing over rather than recovering the primary, which can
also be acceptable but it's certainly much more questionable.
 
Yes, in our setup that's the case: a few lost transactions will have a negligible impact on the business.


I'm getting the feeling that your replicas are async, but it sounds like
you'd be better off with having at least one sync replica, so that you
can flip to it quickly. 

They are indeed async; we traded durability for performance here because we can accept some lost transactions.
 
Alternatively, having a way to more easily make
the primary stop accepting new writes, flush everything to the replicas,
report that it's completed doing so, to allow you to promote a replica
without losing anything, and *then* go through the process on the
primary of doing a checkpoint, would be kind of nice.

I suppose that would require being able to demote a master to a slave at runtime.
That would definitely be nice to have.
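
As a rough illustration of the "flush everything to the replicas, report that it's completed" step, here is a minimal sketch of the check one might run before promoting a replica. It assumes writes to the primary have already been stopped, that the primary's position was read with pg_current_wal_lsn(), and that each replica's position was read from the replay_lsn column of pg_stat_replication. The function names and the replica names in the usage note are hypothetical; this is not an existing PostgreSQL feature.

```python
from typing import Dict


def parse_lsn(lsn: str) -> int:
    """Convert a pg_lsn string such as '16/B374D848' to a byte position.

    The text form is two hexadecimal numbers: the high 32 bits, a slash,
    then the low 32 bits.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lagging_replicas(primary_lsn: str, replay_lsns: Dict[str, str]) -> Dict[str, int]:
    """Return {replica name: bytes of WAL not yet replayed} for replicas behind the primary."""
    target = parse_lsn(primary_lsn)
    behind = {}
    for name, lsn in replay_lsns.items():
        missing = target - parse_lsn(lsn)
        if missing > 0:
            behind[name] = missing
    return behind


def safe_to_promote(primary_lsn: str, replay_lsns: Dict[str, str]) -> bool:
    """True only if at least one replica is known and none is behind the primary."""
    return bool(replay_lsns) and not lagging_replicas(primary_lsn, replay_lsns)
```

For example, safe_to_promote("1/10", {"replica1": "1/10", "replica2": "0/FFFFFFF0"}) is False because replica2 is still 32 bytes behind; once both report "1/10", promotion would lose nothing and the checkpoint on the old primary can happen afterwards.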
 


Thanks,

Stephen
