Re: WIP: WAL prefetch (another approach) - Mailing list pgsql-hackers
From | Stephen Frost |
---|---|
Subject | Re: WIP: WAL prefetch (another approach) |
Date | |
Msg-id | 20210317214331.GF20766@tamriel.snowman.net |
In response to | Re: WIP: WAL prefetch (another approach) (Thomas Munro <thomas.munro@gmail.com>) |
Responses | Re: WIP: WAL prefetch (another approach) |
List | pgsql-hackers |
Greetings,

* Tomas Vondra (tomas.vondra@enterprisedb.com) wrote:
> Right, I was just going to point out the FPIs are not necessary - what
> matters is the presence of long streaks of WAL records touching the same
> set of blocks. But people with workloads where this is common likely
> don't need the WAL prefetching at all - the replica can keep up just
> fine, because it doesn't need to do much I/O anyway (and if it can't
> then prefetching won't help much anyway). So just don't enable the
> prefetching, and there'll be no overhead.

Isn't this exactly the common case though..? Checkpoints happening every 5 minutes, the replay of the FPI happens first and then the record is updated and everything's in shared buffers for the later changes? You mentioned elsewhere that this would improve 80% of cases but that doesn't seem to be backed up by anything and certainly doesn't seem likely to be the case if we're talking about across all PG deployments.

I also disagree that asking the kernel to go do random I/O for us, even as a prefetch, is entirely free simply because we won't actually need those pages. At the least, it potentially pushes out pages that we might need shortly from the filesystem cache, no?

> If it was up to me, I'd just get the patch committed as is. Delaying the
> feature because of concerns that it might have some negative effect in
> some cases, when that can be simply mitigated by disabling the feature,
> is not really beneficial for our users.

I don't know that we actually know how many cases it might have a negative effect on or how large that negative effect might be. That's really why we should probably try to actually benchmark it and get real numbers behind it, particularly when a negative effect with the default configuration (that is, FPWs enabled) on the more typical platforms (as in, not ZFS) is more likely to occur in the field than the cases where FPWs are disabled and someone's running on ZFS.

Perhaps more to the point, it'd be nice to see how this change actually improves the cases where PG is running with more-or-less the defaults on the more commonly deployed filesystems. If it doesn't then maybe it shouldn't be the default..? Surely the folks running on ZFS and running with FPWs disabled would be able to manage to enable it if they wished to and we could avoid entirely the question of whether this has a negative impact on the more common cases.

Guess I'm just not a fan of pushing out a change that will impact everyone by default, in a possibly negative way (or positive, though that doesn't seem terribly likely, but who knows), without actually measuring what that impact will look like in those more common cases. Showing that it's a great win when you're on ZFS or running with FPWs disabled is good and the expected best case, but we should be considering the worst case too when it comes to performance improvements.

Anyhow, ultimately I don't know that there's much more to discuss on this thread with regard to this particular topic, at least. As I said before, if everyone else is on board and not worried about it then so be it; I feel that at least the concern that I raised has been heard.

Thanks,

Stephen
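
The FPI argument in this message can be made concrete with a toy model. The sketch below is not PostgreSQL code and not a benchmark. It assumes a simple LRU cache standing in for shared buffers, and it assumes that the first touch of a block after a checkpoint carries a full-page image (so replay needs no read) when full page writes are on. The function name `count_reads` and all of its parameters are hypothetical and exist only for illustration. It counts the block reads that replay would have to wait on, which are the only reads a prefetcher could help with.

```python
from collections import OrderedDict


def count_reads(touches, checkpoint_every, cache_size, full_page_writes):
    """Count block reads replay would need in this toy model.

    touches:          sequence of block ids, one per WAL record (assumed shape)
    checkpoint_every: number of records between checkpoints (assumption)
    cache_size:       capacity of the LRU stand-in for shared buffers
    full_page_writes: if True, the first touch after a checkpoint is an FPI
    """
    cache = OrderedDict()          # LRU stand-in for shared buffers
    seen_since_checkpoint = set()  # blocks already touched in this cycle
    reads = 0
    for i, block in enumerate(touches):
        if i % checkpoint_every == 0:
            seen_since_checkpoint.clear()  # new checkpoint cycle begins
        first_touch = block not in seen_since_checkpoint
        seen_since_checkpoint.add(block)
        if block in cache:
            cache.move_to_end(block)       # cache hit, no I/O
            continue
        # Cache miss: an FPI supplies the whole page, anything else reads it.
        if not (full_page_writes and first_touch):
            reads += 1
        cache[block] = True
        if len(cache) > cache_size:
            cache.popitem(last=False)      # evict least recently used block
    return reads


touches = [1, 1, 2, 1, 2, 3]
print(count_reads(touches, 100, 10, full_page_writes=True))   # 0
print(count_reads(touches, 100, 10, full_page_writes=False))  # 3
```

In this model, with full page writes on and a cache large enough to hold the working set, nothing is left for a prefetcher to fetch. Reads reappear only when full page writes are off or when blocks are evicted between touches. Whether real deployments look like either case is exactly what the message says should be measured.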