Re: Fix overflow of nbatch - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Fix overflow of nbatch
Date
Msg-id CAAKRu_Zf8i-nKct0emXE0wqcUnjRtpB_dm3_2AeMBzaBvHBe0Q@mail.gmail.com
Whole thread Raw
In response to Re: Fix overflow of nbatch  (Tomas Vondra <tomas@vondra.me>)
Responses Re: Fix overflow of nbatch
List pgsql-hackers
On Wed, Sep 24, 2025 at 7:02 PM Tomas Vondra <tomas@vondra.me> wrote:
>
> Here's a couple draft patches fixing the bug:
>
> - 0001 adds the missing size_t cast, to fix the overflow
> - 0002 fixes the balancing, by adjusting the hash table size limit
> - 0003 adds the missing overflow protection for nbuckets and the hash
> table limit

I've attached an alternative patch idea. It's shorter because I
avoided multiplication using algebraic equivalence. There are two
overflow checks for nbuckets and that max_pointers will be able to be
allocated, but otherwise, it avoids most of the overflow risk by
switching to division.

On the topic of max_pointers, I think there is no need to handle >
MaxAllocSize. We were willing to cope with the original values of
nbatch and nbuckets, we are just trying to optimize that. If doing so
would make us run out of memory for the arrays of poitners to buckets,
just use the last hashtable size that didn't have this problem. That
means we don't have to do any shenanigans to have powers of two for
nbuckets because we don't do clamping.

Also, I think the outer loop needs the condition nbatch > 1 not nbatch
> 0 -- when nbatch is 1, once we divide it by 2, it would end up as 0.

I'm still waiting for the 4billion row table to be created on my
machine, so I haven't verified that I get the same results as you yet.

> - 0004 rewords the comment explaining how the balancing works. Reading
> it after a couple months, I found it overly verbose / not very clear.
> I'm sure it could be improved even further.

My attached patch does build on your revised wording.

- Melanie

Attachment

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Should we update the random_page_cost default value?
Next
From: "Matheus Alcantara"
Date:
Subject: Re: Optimize LISTEN/NOTIFY