Re: Parallel heap vacuum - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel heap vacuum
Date
Msg-id CAA4eK1J8x9Jzk1RkLQkD+_iKHsADBkvGZaD34HDdmEPdKZsQ6A@mail.gmail.com
In response to Re: Parallel heap vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Parallel heap vacuum
List pgsql-hackers
On Wed, Mar 5, 2025 at 6:25 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Mar 3, 2025 at 3:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> >
> > Another performance regression I can see in the results is that the heap
> > vacuum phase (phase III) got slower with the patch. It's weird to me
> > since I don't touch the code of the heap vacuum phase. I'm still
> > investigating the cause.
>
> I have investigated this regression. I've confirmed that in both
> scenarios (patched and unpatched), the entire table and its associated
> indexes were loaded into the shared buffer before the vacuum. Then,
> the 'perf record' analysis, focused specifically on the heap vacuum
> phase of the patched code, revealed numerous soft page faults
> occurring:
>
>     62.37%    13.90%  postgres  postgres            [.] lazy_vacuum_heap_rel
>             |
>             |--52.44%--lazy_vacuum_heap_rel
>             |          |
>             |          |--46.33%--lazy_vacuum_heap_page (inlined)
>             |          |          |
>             |          |          |--32.42%--heap_page_is_all_visible (inlined)
>             |          |          |          |
>             |          |          |          |--26.46%--HeapTupleSatisfiesVacuum
>             |          |          |          |          HeapTupleSatisfiesVacuumHorizon
>             |          |          |          |          HeapTupleHeaderXminCommitted (inlined)
>             |          |          |          |          |
>             |          |          |          |           --18.52%--page_fault
>             |          |          |          |                     do_page_fault
>             |          |          |          |                     __do_page_fault
>             |          |          |          |                     handle_mm_fault
>             |          |          |          |                     __handle_mm_fault
>             |          |          |          |                     handle_pte_fault
>             |          |          |          |                     |
>             |          |          |          |                     |--16.53%--filemap_map_pages
>             |          |          |          |                     |          |
>             |          |          |          |                     |           --2.63%--alloc_set_pte
>             |          |          |          |                     |                     pfn_pte
>             |          |          |          |                     |
>             |          |          |          |                      --1.99%--pmd_page_vaddr
>             |          |          |          |
>             |          |          |           --1.99%--TransactionIdPrecedes
>
> I did not observe these page faults in the 'perf record' results for
> the HEAD version. Furthermore, when I disabled parallel heap vacuum
> while keeping parallel index vacuuming enabled, the regression
> disappeared. Based on these findings, the likely cause of the
> regression appears to be that during parallel heap vacuum operations,
> table blocks were loaded into the shared buffer by parallel vacuum
> workers.
>

In the previous paragraph, you mentioned that the entire table and its
associated indexes were loaded into the shared buffer before the
vacuum. If that is true, then why does the parallel vacuum need to
reload the table blocks into shared buffers?

> However, in the heap vacuum phase, the leader process needed
> to process all blocks, resulting in soft page faults while creating
> Page Table Entries (PTEs). Without the patch, the backend process had
> already created PTEs during the heap scan, thus preventing these
> faults from occurring during the heap vacuum phase.
>

This part is again not clear to me. I am assuming all the data exists
in shared buffers before the vacuum, so why would page faults occur in
the first place?
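
For reference, the mechanism described in the quoted text can be
checked in isolation. The following is a minimal sketch, assuming
Linux and an anonymous shared mapping standing in for shared buffers;
it is not PostgreSQL code, and the names measure_first_touch and
touch_pages are hypothetical. A child process (the "worker") populates
every page of the mapping, then the parent (the "leader") reads each
page twice and counts its own minor faults with getrusage(). The pages
are already resident when the parent first reads them, but the parent
has no page table entries for them yet.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1       /* for MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Minor (soft) page faults taken so far by the calling process. */
static long
minor_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return 0;
    return ru.ru_minflt;
}

/* Read one byte from each page; return the minor faults this caused. */
static long
touch_pages(volatile const unsigned char *buf, size_t npages, size_t pagesz)
{
    long        before = minor_faults();
    unsigned    sum = 0;

    for (size_t i = 0; i < npages; i++)
        sum += buf[i * pagesz];
    (void) sum;
    return minor_faults() - before;
}

/*
 * Hypothetical helper: a child populates a shared mapping, then the
 * parent touches it twice. Returns 0 on success, -1 on failure.
 */
int
measure_first_touch(size_t npages, long *first, long *second)
{
    size_t      pagesz = (size_t) sysconf(_SC_PAGESIZE);
    size_t      len = npages * pagesz;
    int         status;
    pid_t       pid;
    unsigned char *buf;

    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        munmap(buf, len);
        return -1;
    }
    if (pid == 0)
    {
        /* "Worker": make every page resident in the shared mapping. */
        memset(buf, 1, len);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
    {
        munmap(buf, len);
        return -1;
    }

    /* "Leader": first pass has to build PTEs, second pass reuses them. */
    *first = touch_pages(buf, npages, pagesz);
    *second = touch_pages(buf, npages, pagesz);

    munmap(buf, len);
    return 0;
}
```

Calling measure_first_touch(1024, &first, &second) would be expected
to report a nonzero count for the first pass and a count at or near
zero for the second, although the first count can be well below the
number of pages because the kernel may map several neighbouring pages
per fault (the filemap_map_pages frames in the profile above).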

--
With Regards,
Amit Kapila.


