Re: Streamify more code paths - Mailing list pgsql-hackers

From Xuneng Zhou
Subject Re: Streamify more code paths
Msg-id CABPTF7XE1AE2B2Jf_jzRhzPZpU3DZ+ZGCdzx27U55HP=f0vY1w@mail.gmail.com
In response to Re: Streamify more code paths  (Xuneng Zhou <xunengzhou@gmail.com>)
Responses Re: Streamify more code paths
List pgsql-hackers
Hi,

On Wed, Mar 11, 2026 at 10:23 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> On Wed, Mar 11, 2026 at 8:16 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2026-03-10 19:27:59 -0400, Andres Freund wrote:
> > > > > pgstattuple_large          base= 12429.3ms  patch= 11916.8ms   1.04x
> > > > > (  4.1%)  (reads=206945->12983, io_time=6501.91->32.24ms)
> > > >
> > > > > pgstattuple_large          base= 12642.9ms  patch= 11873.5ms   1.06x
> > > > > (  6.1%)  (reads=206945->12983, io_time=6516.70->143.46ms)
> > > >
> > > > Yeah, this looks somewhat strange. The io_time has been reduced
> > > > significantly, which should also lead to a substantial reduction in
> > > > runtime.
> > >
> > > It's possible that the bottleneck just moved, e.g to the checksum computation,
> > > if you have data checksums enabled.
> > >
> > > It's also worth noting that likely each of the test reps measures
> > > something different, as likely
> > >   psql_run "$ROOT" "$PORT" -c "UPDATE heap_test SET data = data || '!' WHERE id % 5 = 0;"
> > >
> > > leads to some out-of-page updates.
> > >
> > > You're probably better off deleting some of the data in a transaction that is
> > > then rolled back. That will also unset all-visible, but won't otherwise change
> > > the layout, no matter how many test iterations you run.
> > >
> > >
> > > I'd also guess that you're seeing a relatively small win because you're
> > > updating every page. When reading every page from disk, the OS can do
> > > efficient readahead.  If there are only occasional misses, that does not work.
> >
> > I think that last one is a big part - if I use
> >   BEGIN; DELETE FROM heap_test WHERE id % 500 = 0; ROLLBACK;
> > (which leaves a lot of
> >
> > I see much bigger wins due to the pgstattuple changes.
> >
> >                        time buffered          time DIO
> > w/o read stream        2222.078 ms            2090.239 ms
> > w   read stream         299.455 ms             155.124 ms
> >
> > That's with local storage. io_uring, but numbers with worker are similar.
> >
>
> The results look great and interesting. This looks far better than
> what I observed in my earlier tests. I’ll run perf for pgstattuple
> without the switching to see what is keeping the CPU busy.
>
> --
> Best,
> Xuneng

With io_uring:
pgstattuple_large          base=  1090.6ms  patch=   143.3ms   7.61x
( 86.9%)  (reads=20049→20049, io_time=1040.80→46.91ms)
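
As a quick sanity check on the line above, the speedup and percentage
can be recomputed from the raw timings. This is only a small sketch:
the helper names speedup() and reduction_pct() are made up for
illustration and are not part of the benchmark script, and the numbers
are copied by hand from the result line.

```python
def speedup(base_ms: float, patch_ms: float) -> float:
    """Return how many times faster the patched run is than the base run."""
    return base_ms / patch_ms


def reduction_pct(base_ms: float, patch_ms: float) -> float:
    """Return the runtime reduction as a percentage of the base runtime."""
    return (base_ms - patch_ms) / base_ms * 100.0


# Timings copied from the pgstattuple_large result line (io_uring run).
base_ms, patch_ms = 1090.6, 143.3
base_io_ms, patch_io_ms = 1040.80, 46.91

print(f"runtime speedup:   {speedup(base_ms, patch_ms):.2f}x")
print(f"runtime reduction: {reduction_pct(base_ms, patch_ms):.1f}%")
print(f"io_time speedup:   {speedup(base_io_ms, patch_io_ms):.2f}x")
```

The first two values reproduce the 7.61x and 86.9% shown in the result
line; the third applies the same ratio to the reported io_time figures.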

I observed a runtime reduction of similar magnitude after switching to
pg_buffercache_evict_relation() and using
  BEGIN; DELETE FROM heap_test WHERE id % 500 = 0; ROLLBACK;
However, I lost the original flame graphs after running many
performance tests. I will regenerate them and post them later.

--
Best,
Xuneng
