Re: Parallel heap vacuum - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: Parallel heap vacuum
Msg-id: c1337830-01d9-48a0-81f7-4b0d79d9333e@vondra.me
In response to: Re: Parallel heap vacuum (Masahiko Sawada <sawada.mshk@gmail.com>)
List: pgsql-hackers
On 12/19/24 23:05, Masahiko Sawada wrote:
> On Sat, Dec 14, 2024 at 1:24 PM Tomas Vondra <tomas@vondra.me> wrote:
>>
>> On 12/13/24 00:04, Tomas Vondra wrote:
>>> ...
>>>
>>> The main difference is here:
>>>
>>> master / no parallel workers:
>>>
>>> pages: 0 removed, 221239 remain, 221239 scanned (100.00% of total)
>>>
>>> 1 parallel worker:
>>>
>>> pages: 0 removed, 221239 remain, 10001 scanned (4.52% of total)
>>>
>>> Clearly, with parallel vacuum we scan only a tiny fraction of the
>>> pages - essentially just those with deleted tuples, which is ~1/20
>>> of the pages. That's close to the 15x speedup.
>>>
>>> This effect is clearest without indexes, but it affects runs with
>>> indexes too, though having to scan the indexes makes it much less
>>> pronounced. These indexes are pretty massive, though (each about the
>>> same size as the table, so multiple times larger in total). Chances
>>> are it'd be clearer on realistic data sets.
>>>
>>> So the question is - is this correct? And if yes, why doesn't the
>>> regular (serial) vacuum do that?
>>>
>>> There are some more strange things, though. For example, how come
>>> the avg read rate is 0.000 MB/s?
>>>
>>> avg read rate: 0.000 MB/s, avg write rate: 525.533 MB/s
>>>
>>> It scanned 10k pages, i.e. ~80MB of data, in 0.15 seconds. Surely
>>> that's not 0.000 MB/s? I guess it's calculated from buffer misses,
>>> and all the pages are in shared buffers (thanks to the DELETE
>>> earlier in that session).
>>
>> OK, after looking into this a bit more, I think the reason is rather
>> simple - SKIP_PAGES_THRESHOLD.
>>
>> With serial runs we end up scanning all pages, because even with an
>> update every 5000 tuples, that's still only ~25 pages apart, well
>> within the 32-page window. So we end up skipping no pages, and scan
>> and vacuum everything.
>>
>> But parallel runs have this skipping logic disabled - or rather the
>> logic that switches to sequential scans if the gap is less than 32
>> pages.
>>
>> IMHO this raises two questions:
>>
>> 1) Shouldn't parallel runs use SKIP_PAGES_THRESHOLD too, i.e. switch
>> to sequential scans if the pages are close enough? Maybe there is a
>> reason for this difference? Workers can reduce the difference between
>> random and sequential I/O, similarly to prefetching. But that just
>> means the workers should use a lower threshold, e.g.
>>
>> SKIP_PAGES_THRESHOLD / nworkers
>>
>> or something like that? I don't see this discussed in this thread.
>
> Each parallel heap scan worker allocates a chunk of blocks, which is
> 8192 blocks at maximum, so we would need to use the
> SKIP_PAGES_THRESHOLD optimization within the chunk. I agree that we
> need to evaluate the differences anyway. Will do the benchmark test
> and share the results.
>

Right. I don't think this really matters for small tables, and for
large tables the chunks should be fairly large (possibly up to 8192
blocks), in which case we could apply SKIP_PAGES_THRESHOLD just like in
the serial case. There might be differences at boundaries between
chunks, but that seems like a minor / expected detail. I haven't
checked if / how much the code would need to change.

>> 2) It seems the current SKIP_PAGES_THRESHOLD is awfully high for good
>> storage. If I can get an order of magnitude improvement (or more than
>> that) by disabling the threshold and just doing random I/O, maybe
>> it's time to adjust it a bit.
>
> Yeah, you've started a thread for this, so let's discuss it there.
>

OK.
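To make the idea in (1) a bit more concrete, here's a rough standalone
sketch of the per-chunk skip decision I have in mind. The function name
and signature are made up - this is not the vacuumlazy.c code, just an
illustration of scaling the threshold by the worker count:

#include <stdint.h>

typedef uint32_t BlockNumber;       /* same as PostgreSQL's BlockNumber */

#define SKIP_PAGES_THRESHOLD 32     /* current value in vacuumlazy.c */

/*
 * Decide whether a run of all-visible pages [run_start, run_end)
 * within a worker's chunk is long enough to be worth skipping.  With
 * nworkers workers issuing I/O concurrently, the penalty for breaking
 * sequential access is (presumably) smaller, so scale the threshold
 * down accordingly.
 */
static int
skip_run_is_long_enough(BlockNumber run_start, BlockNumber run_end,
                        int nworkers)
{
    BlockNumber run_len = run_end - run_start;
    int         threshold = SKIP_PAGES_THRESHOLD;

    if (nworkers > 1)
        threshold = SKIP_PAGES_THRESHOLD / nworkers;
    if (threshold < 1)
        threshold = 1;              /* never go below a single page */

    return run_len >= (BlockNumber) threshold;
}

Whether scaling linearly with nworkers is the right formula is exactly
the kind of thing the benchmarks should tell us.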
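And regarding the 0.000 MB/s puzzle - if the rate really is derived
from buffer misses rather than pages scanned, the numbers from the log
above are consistent with a fully cached scan. A trivial
back-of-the-envelope check (the formula is my guess at what the
reporting does, not the actual code):

#include <stdio.h>

#define BLCKSZ 8192                 /* default PostgreSQL block size */

int
main(void)
{
    long   pages_scanned = 10001;   /* from the vacuum output above */
    long   buffer_misses = 0;       /* everything cached by the DELETE */
    double elapsed_secs  = 0.15;

    /* ~78 MB actually scanned ... */
    double scanned_mb = pages_scanned * (double) BLCKSZ
                        / (1024.0 * 1024.0);

    /* ... but the reported rate only counts pages read from disk */
    double read_mb_per_s = buffer_misses * (double) BLCKSZ
                           / (1024.0 * 1024.0) / elapsed_secs;

    printf("scanned ~%.0f MB, avg read rate: %.3f MB/s\n",
           scanned_mb, read_mb_per_s);
    return 0;
}

With zero misses that prints an avg read rate of 0.000 MB/s despite
~78 MB scanned, matching the output.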
FWIW as suggested in the other thread, it doesn't seem to be merely a
question of VACUUM performance, as not skipping pages gives vacuum the
opportunity to do cleanup that would otherwise need to happen later. If
only for this reason, I think it would be good to keep the serial and
parallel vacuum consistent.

regards

--
Tomas Vondra