Re: Parallel heap vacuum - Mailing list pgsql-hackers
| From | Masahiko Sawada |
|---|---|
| Subject | Re: Parallel heap vacuum |
| Date | |
| Msg-id | CAD21AoCdxc6jLfk5fc1a5-2DgxFikrjFPa6-A5b8pn27i4yKRg@mail.gmail.com |
| In response to | Re: Parallel heap vacuum (Masahiko Sawada <sawada.mshk@gmail.com>) |
| List | pgsql-hackers |
On Mon, Mar 31, 2025 at 2:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Mar 27, 2025 at 11:40 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Mar 27, 2025 at 8:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Mar 26, 2025 at 1:00 PM Melanie Plageman
> > > <melanieplageman@gmail.com> wrote:
> > > >
> > > > On Mon, Mar 24, 2025 at 7:58 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > You're right. I've studied the read stream code and figured out how
> > > > > to use it. In the attached patch, we end the read stream at the end
> > > > > of phase 1 and start a new read stream, as you suggested.
> > > >
> > > > I've started looking at this patch set some more.
> > >
> > > Thank you for reviewing the patch!
> > >
> > > > In heap_vac_scan_next_block(), if we are in the SKIP_PAGES_THRESHOLD
> > > > codepath and run out of unskippable blocks in the current chunk and
> > > > then go back to get another chunk (goto retry), but we are near the
> > > > memory limit so we can't get another block
> > > > (!dead_items_check_memory_limit()), could we get an infinite loop?
> > > >
> > > > Or even incorrectly encroach on another worker's block? Asking
> > > > because of this math:
> > > >
> > > >     end_block = next_block +
> > > >         vacrel->plvstate->scanworker->pbscanwork.phsw_chunk_remaining + 1;
> > >
> > > You're right. We should make sure that next_block is reset to
> > > InvalidBlockNumber at the beginning of the retry loop.
> > >
> > > > If vacrel->plvstate->scanworker->pbscanwork.phsw_chunk_remaining is 0
> > > > and we are in the goto retry loop, it seems like we could keep
> > > > incrementing next_block even when we shouldn't be.
> > >
> > > Right. Will fix.
> > >
> > > > I just want to make sure that the skip-pages optimization works with
> > > > the parallel block assignment and the low-memory read stream
> > > > wind-down.
> > > >
> > > > I also think you do not need to expose
> > > > table_block_parallelscan_skip_pages_in_chunk() in the table AM. It is
> > > > only called in heap-specific code and the logic seems very
> > > > heap-related. If table AMs want to skip some blocks, they could
> > > > easily implement that themselves.
> > >
> > > Agreed. Will remove it.
> > >
> > > > On another topic, I think it would be good to have a comment above
> > > > this code in parallel_lazy_scan_gather_scan_results(), stating why we
> > > > are very sure it is correct:
> > > >
> > > >     Assert(TransactionIdIsValid(data->NewRelfrozenXid));
> > > >     Assert(MultiXactIdIsValid(data->NewRelminMxid));
> > > >
> > > >     if (TransactionIdPrecedes(data->NewRelfrozenXid,
> > > >                               vacrel->scan_data->NewRelfrozenXid))
> > > >         vacrel->scan_data->NewRelfrozenXid = data->NewRelfrozenXid;
> > > >
> > > >     if (MultiXactIdPrecedesOrEquals(data->NewRelminMxid,
> > > >                                     vacrel->scan_data->NewRelminMxid))
> > > >         vacrel->scan_data->NewRelminMxid = data->NewRelminMxid;
> > > >
> > > >     if (data->nonempty_pages < vacrel->scan_data->nonempty_pages)
> > > >         vacrel->scan_data->nonempty_pages = data->nonempty_pages;
> > > >
> > > >     vacrel->scan_data->skippedallvis |= data->skippedallvis;
> > > >
> > > > Parallel relfrozenxid advancement sounds scary, and scary things are
> > > > best with comments. Even though the way this works is intuitive, I
> > > > think it is worth pointing out that this part is important to get
> > > > right, so future programmers know how important it is.
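As an aside on the gather step quoted above, a comment of the kind Melanie asks for might read roughly as follows. This is only a sketch of the reasoning, reusing the variable names from the quoted snippet; the wording is not from the patch:

    /*
     * Each worker computed NewRelfrozenXid/NewRelminMxid only for the
     * blocks it scanned.  The values ultimately written to pg_class must
     * bound every tuple left anywhere in the table, so keep the oldest
     * value reported by any worker.  Advancing relfrozenxid past a single
     * worker's cutoff could leave behind XIDs older than the new
     * relfrozenxid, risking wraparound corruption.
     */
    if (TransactionIdPrecedes(data->NewRelfrozenXid,
                              vacrel->scan_data->NewRelfrozenXid))
        vacrel->scan_data->NewRelfrozenXid = data->NewRelfrozenXid;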
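Similarly, a minimal sketch of the heap_vac_scan_next_block() fix agreed to above (resetting next_block at the top of the retry loop) could look like this. The control flow is simplified, and memory_limit_reached() and claim_next_chunk() are hypothetical stand-ins for the patch's dead_items_check_memory_limit() check and its parallel chunk-claiming logic; the point is only the reset:

    retry:
        /*
         * Reset next_block first, so that a pass that fails to claim a
         * new chunk cannot reuse a stale value and, with
         * phsw_chunk_remaining at 0, compute an end_block that reaches
         * into another worker's block range.
         */
        next_block = InvalidBlockNumber;

        if (memory_limit_reached(vacrel))   /* hypothetical helper */
            return InvalidBlockNumber;      /* wind the read stream down */

        next_block = claim_next_chunk(vacrel);  /* simplified stand-in */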
> > > >
> > > > One thing I was wondering about is if there are any implications of
> > > > different workers having different values in their GlobalVisState.
> > > > GlobalVisState can be updated during vacuum, so even if they start
> > > > out with the same values, those could diverge. It is probably okay,
> > > > since it just controls which tuples are removable. Some workers may
> > > > remove fewer tuples than they absolutely could, and this is probably
> > > > okay.
> > >
> > > Good point.
> > >
> > > > And if it is okay for each worker to have a different GlobalVisState,
> > > > then maybe you shouldn't have a GlobalVisState in shared memory. If
> > > > you look at GlobalVisTestFor(), it just returns the memory address of
> > > > that global variable in the backend. So it seems like it might be
> > > > better to just let each parallel worker use its own backend-local
> > > > GlobalVisState, and not try to put it in shared memory and copy it
> > > > from one worker to the other workers when initializing them. I'm not
> > > > sure. At the very least, there should be a comment explaining why
> > > > you've done it the way you have.
> > >
> > > Agreed. IIUC it's not a problem even if parallel workers use their own
> > > GlobalVisState. I'll make that change and remove the 0004 patch, which
> > > exposes GlobalVisState.
> > >
> > > I'll send the updated patch soon.
> >
> > I've attached the updated patches. This version addresses Melanie's
> > comments and includes some bug fixes and comment updates.
>
> Rebased the patches as they conflicted with recent commits.

I've attached the new version of the patch. There are no major changes; I
fixed some typos, improved the comments, and removed duplicated code. Also,
I've updated the commit messages.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
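For reference, the backend-local approach suggested above would amount to something like the following at worker initialization time. This is a sketch under the assumption that each worker carries an LVRelState with a vistest field, as the serial vacuum code does; GlobalVisTestFor() is the existing function discussed in the thread:

    /*
     * Each parallel worker obtains its own visibility state rather than
     * copying one through shared memory.  GlobalVisTestFor() returns the
     * address of a backend-local variable, so a pointer copied from
     * another process would be meaningless anyway.  Divergence between
     * workers is safe: a less up-to-date state only makes a worker treat
     * fewer tuples as removable.
     */
    vacrel->vistest = GlobalVisTestFor(rel);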