Re: Excessive PostmasterIsAlive calls slow down WAL redo - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Excessive PostmasterIsAlive calls slow down WAL redo
Date
Msg-id 29ebd68e-a6cf-3dac-8954-16b22d6b11da@iki.fi
In response to Re: Excessive PostmasterIsAlive calls slow down WAL redo  (Andres Freund <andres@anarazel.de>)
Responses Re: Excessive PostmasterIsAlive calls slow down WAL redo  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On 06/04/18 19:39, Andres Freund wrote:
> On 2018-04-06 07:39:28 -0400, Stephen Frost wrote:
>> While I tend to agree that it'd be nice to just make it cheaper, that
>> doesn't seem like something that we'd be likely to back-patch and I tend
>> to share Heikki's feelings that this is a performance regression we
>> should be considering fixing in released versions.

To be clear, this isn't a performance *regression*. It's always been bad.

I'm not sure if I'd backpatch this. Maybe after it's been in 'master' 
for a while and we've gotten some field testing of it.

> I'm doubtful about fairly characterizing this as a performance bug. It's
> not like we've O(n^2) behaviour on our hand, and if your replay isn't of
> a toy workload normally that one syscall isn't going to make a huge
> difference because you've actual IO and such going on.

If all the data fits in the buffer cache, then there would be no I/O. 
Think of a smallish database that's heavily updated. There are a lot of 
real applications like that.

> I'm also doubtful that it's sane to just check every 32 records. There's
> records that can take a good chunk of time, and just continuing for
> another 31 records seems like a bad idea.

It's pretty arbitrary, I admit. It's the best I could come up with, though.
If we could get a signal on postmaster death, that'd be best, but that's 
a much bigger patch, and I'm worried that it would bring new portability 
and reliability issues.

I'm not too worried about 32 records being too long an interval. True, 
replaying 32 CREATE DATABASE records would take a long time. But pretty 
much all other WAL records are fast enough to apply. We could make it 
every 8 records rather than 32, if that makes you feel better. Or add 
some extra conditions, like always check it when stepping to a new WAL 
segment. In any case, the fundamental difference would be to not
check it between every record.
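
A minimal sketch of the throttling idea described above, not the actual patch: the liveness probe runs only every N records, or when replay steps into a new WAL segment. All names here (RedoLivenessState, redo_parent_still_alive, LIVENESS_CHECK_INTERVAL, always_alive) are hypothetical, and the probe is a stand-in for PostmasterIsAlive().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interval; the thread discusses 32 and 8 as candidates. */
#define LIVENESS_CHECK_INTERVAL 32

/* Hypothetical bookkeeping kept by the redo loop. */
typedef struct RedoLivenessState
{
    uint32_t records_since_check;   /* records seen since the last probe */
    uint64_t last_segment;          /* WAL segment of the previous record */
    uint64_t checks_done;           /* how many probes have actually run */
} RedoLivenessState;

/* Stand-in for the expensive call (PostmasterIsAlive() in the thread). */
typedef bool (*liveness_probe) (void);

/* Trivial probe used for illustration only. */
static bool
always_alive(void)
{
    return true;
}

/*
 * Call once per WAL record, passing the segment number the record lives in.
 * The probe runs only when the record count reaches the interval or the
 * segment number changes; otherwise the parent is assumed to be alive.
 * Returns false only if a probe ran and reported that the parent is gone.
 */
static bool
redo_parent_still_alive(RedoLivenessState *st, uint64_t segno,
                        liveness_probe probe)
{
    bool due;

    assert(st != NULL && probe != NULL);

    st->records_since_check++;
    due = st->records_since_check >= LIVENESS_CHECK_INTERVAL ||
          segno != st->last_segment;
    st->last_segment = segno;

    if (!due)
        return true;            /* skip the syscall for this record */

    st->records_since_check = 0;
    st->checks_done++;
    return probe();
}
```

With this sketch, replaying 64 records that all sit in one segment runs the probe twice instead of 64 times, and crossing into a new segment forces an extra probe regardless of the counter. It does not address the slow-record case raised above; a time-based condition would be a separate addition.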

- Heikki