Re: Parallel heap vacuum - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel heap vacuum
Msg-id CAA4eK1JfxZ90118Tm0a6QUPDBbAgyHAW1QqhhGEzhU+csK7QhA@mail.gmail.com
In response to Re: Parallel heap vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Parallel heap vacuum
List pgsql-hackers
On Mon, Mar 10, 2025 at 11:57 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sun, Mar 9, 2025 at 11:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > > However, in the heap vacuum phase, the leader process needed
> > > to process all blocks, resulting in soft page faults while creating
> > > Page Table Entries (PTEs). Without the patch, the backend process had
> > > already created PTEs during the heap scan, thus preventing these
> > > faults from occurring during the heap vacuum phase.
> > >
> >
> > This part is again not clear to me because I am assuming all the data
> > exists in shared buffers before the vacuum, so why would page faults
> > occur in the first place?
>
> IIUC PTEs are process-local data. So even if physical pages are loaded
> into PostgreSQL's shared buffers (and the page cache), soft page faults (or
> minor page faults)[1] can occur if these pages are not yet mapped in
> the process's page table.
>
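
To illustrate the point about PTEs being process-local, here is a minimal
sketch and not PostgreSQL code. It assumes a Unix-like system (the behavior
described is what Linux typically shows) and uses an anonymous shared mapping
as a stand-in for shared buffers. The names measure, touch, and minor_faults
are hypothetical helpers. The parent populates every page, then a forked child
reads the same pages and reports how many minor faults that caused. The exact
counts depend on the kernel (fault-around, huge pages), so none are asserted.

```python
import mmap
import os
import resource
import struct

PAGE = mmap.PAGESIZE


def minor_faults():
    """Minor (soft) page faults taken so far by the calling process."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def touch(buf, npages):
    """Read one byte from every page and return the sum of those bytes."""
    total = 0
    for i in range(npages):
        total += buf[i * PAGE]
    return total


def measure(npages=4096):
    """Return (parent_second_pass_faults, child_faults, child_sum)."""
    # Anonymous MAP_SHARED region: an assumed stand-in for shared buffers.
    buf = mmap.mmap(-1, npages * PAGE)
    for i in range(npages):
        buf[i * PAGE] = 1  # allocates the physical pages, maps them in the parent

    # Second pass in the parent: the pages are already mapped here.
    before = minor_faults()
    touch(buf, npages)
    parent_faults = minor_faults() - before

    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the data is already in memory, but this process may not
        # have page table entries for it yet.
        before = minor_faults()
        total = touch(buf, npages)
        delta = minor_faults() - before
        os.write(w, struct.pack("qq", delta, total))
        os._exit(0)

    os.close(w)
    child_faults, child_total = struct.unpack("qq", os.read(r, 16))
    os.close(r)
    os.waitpid(pid, 0)
    buf.close()
    return parent_faults, child_faults, child_total


if __name__ == "__main__":
    parent_faults, child_faults, _ = measure()
    print("parent second pass:", parent_faults, "child first pass:", child_faults)
```

If the child reports noticeably more faults than the parent's second pass,
that is the effect described above: no disk I/O happens, yet each process
pays to map the pages into its own page table on first access.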

Okay, I got your point. BTW, I noticed that even for the case where
all the data is in shared_buffers, the performance improvement
decreases marginally with more than two workers. Am I reading the
data correctly? If so, what is the theory, and do we have
recommendations for a parallel degree?

--
With Regards,
Amit Kapila.


