Re: finding changed blocks using WAL scanning - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: finding changed blocks using WAL scanning |
Date | |
Msg-id | 20190423180114.vslfhhbtyan5aqba@development |
In response to | Re: finding changed blocks using WAL scanning (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
On Tue, Apr 23, 2019 at 10:09:39AM -0700, Andres Freund wrote:
>Hi,
>
>On 2019-04-23 19:01:29 +0200, Tomas Vondra wrote:
>> On Tue, Apr 23, 2019 at 09:34:54AM -0700, Andres Freund wrote:
>> > Hi,
>> >
>> > On 2019-04-23 18:07:40 +0200, Tomas Vondra wrote:
>> > > Well, the thing is that for prefetching to be possible you actually have
>> > > to be a bit behind. Otherwise you can't really look forward which blocks
>> > > will be needed, right?
>> > >
>> > > IMHO the main use case for prefetching is when there's a spike of activity
>> > > on the primary, making the standby fall behind, and then it takes
>> > > hours to catch up. I don't think the cases with just a couple of MBs of
>> > > lag are the issue prefetching is meant to improve (if it does, great).
>> >
>> > I'd be surprised if a good implementation didn't. Even just some smarter
>> > IO scheduling in the startup process could help a good bit. E.g. no need
>> > to sequentially read the first and then the second block for an update
>> > record, if you can issue both at the same time - just about every
>> > storage system these days can do a number of IO requests in parallel,
>> > and it nearly halves latency effects. And reading a few records (as in a
>> > few hundred bytes commonly) ahead, allows to do much more than that.
>> >
>>
>> I don't disagree with that - prefetching certainly can improve utilization
>> of the storage system. The question is whether it can meaningfully improve
>> performance of the recovery process in cases when it does not lag. And I
>> think it can't (perhaps with remote_apply being an exception).
>
>Well. I think a few dozen records behind doesn't really count as "lag",
>and I think that's where it'd start to help (and for some record types
>like updates it'd start to help even for single records). It'd convert
>scenarios where we'd currently fall behind slowly into scenarios where
>we can keep up - but where there's no meaningful lag while we keep up.
>What's your argument for me being wrong?
>

I was not saying you are wrong. I think we actually agree on the main
points. My point is that prefetching is most valuable for cases when the
standby can't keep up and falls behind significantly - at which point we
have a sufficient queue of blocks to prefetch. I don't care about the case
when the standby can keep up even without prefetching, because the metric
we need to optimize (i.e. lag) is close to 0 even without prefetching.

>And even if we'd keep up without any prefetching, issuing requests in a
>more efficient manner allows for more efficient concurrent use of the
>storage system. It'll often effectively reduce the amount of random
>iops.

Maybe, although the metric we (and users) care about the most is the
amount of lag. If the system keeps up even without prefetching, no one
will complain about I/O utilization.

When the lag is close to 0, the average throughput/IOPS/... is bound to be
the same in both cases, because it does not affect how fast the standby
receives WAL from the primary. Except that it's somewhat "spikier" with
prefetching, because we issue requests in bursts. Which may actually be a
bad thing.

Of course, maybe prefetching will make it much more efficient even in the
"no lag" case, and while it won't improve the recovery, it'll leave more
I/O bandwidth for the other processes (say, queries on hot standby).

So to be clear, I'm not against prefetching even in this case, but it's
not the primary reason why I think we need to do that.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
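
The lag argument above can be illustrated with a toy queueing sketch in
Python. This is not PostgreSQL code, and every name and number in it is a
made-up assumption: the standby is modelled as a queue that receives some
number of WAL records per tick and replays at most `replay_capacity` of
them, and prefetching is modelled simply as a higher replay capacity.

```python
def simulate_backlog(arrivals, replay_capacity):
    """Return the replay backlog (records received but not yet replayed)
    at the end of each tick.

    arrivals        -- WAL records received from the primary in each tick
    replay_capacity -- most records the standby can replay in one tick
    """
    backlog = 0
    history = []
    for received in arrivals:
        backlog += received
        # The standby cannot replay more than it has received so far.
        backlog -= min(backlog, replay_capacity)
        history.append(backlog)
    return history


# Hypothetical workloads (records per tick).
STEADY = [100] * 10                          # primary never exceeds replay speed
SPIKE = [100] * 3 + [500] * 3 + [100] * 10   # burst of activity on the primary

# Hypothetical replay speeds; the prefetch figure is an assumption, not a measurement.
CAPACITY_PLAIN = 150
CAPACITY_PREFETCH = 300

steady_plain = simulate_backlog(STEADY, CAPACITY_PLAIN)
steady_prefetch = simulate_backlog(STEADY, CAPACITY_PREFETCH)
spike_plain = simulate_backlog(SPIKE, CAPACITY_PLAIN)
spike_prefetch = simulate_backlog(SPIKE, CAPACITY_PREFETCH)

if __name__ == "__main__":
    print("steady, no prefetch:", steady_plain)
    print("steady, prefetch:   ", steady_prefetch)
    print("spike, no prefetch: ", spike_plain)
    print("spike, prefetch:    ", spike_prefetch)
```

In the steady workload the backlog stays at zero with either capacity,
because replay is bounded by how fast WAL arrives. Only the spike workload
separates the two. The model deliberately ignores the per-record latency
effects Andres describes (e.g. issuing both block reads for an update at
once), so it says nothing about that part of the discussion.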