On Mon, 27 Oct 2025 at 16:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm, I wasn't really expecting any direct time saving; the point
> was about cutting memory consumption. So Chao Li's nearby results
> are in line with mine.
It's for the same reason that Hash Join starts to run more slowly once
the hash table is larger than L3 cache. Because the memory access
pattern when probing the hash table can't be predicted by the CPU,
larger tables have to pull cachelines in from RAM more often. The same
happens at smaller sizes when spilling from L2 out to L3 (and even
from L1d out to L2). If you graphed a range of table sizes, you'd see
the per-lookup performance drop off each time the memory usage crosses
a cache-size boundary.
By using the bump allocator, you've made more tuples fit in the same
amount of memory, which increases the chances that the cachelines a
probe needs are already cached.
If you happened to always probe the hash table in hash key order, this
probably wouldn't happen (or would at least happen to a lesser
extent), as the hardware prefetcher would detect the forward access
pattern and prefetch the memory before it's needed.
David