Thread: pgsql: Improve accounting for memory used by shared hash tables
Improve accounting for memory used by shared hash tables pg_shmem_allocations tracks the memory allocated by ShmemInitStruct(), but for shared hash tables that covered only the header and hash directory. The remaining parts (segments and buckets) were allocated later using ShmemAlloc(), which does not update the shmem accounting. Thus, these allocations were not shown in pg_shmem_allocations. This commit improves the situation by allocating all the hash table parts at once, using a single ShmemInitStruct() call. This way the ShmemIndex entries (and thus pg_shmem_allocations) better reflect the proper size of the hash table. This affects allocations for private (non-shared) hash tables too, as the hash_create() code is shared. For non-shared tables this however makes no practical difference. This changes the alignment a bit. ShmemAlloc() aligns the chunks using CACHELINEALIGN(), which means some parts (header, directory, segments) were aligned this way. Allocating all parts as a single chunk removes this (implicit) alignment. We've considered adding explicit alignment, but we've decided not to - it seems to be merely a coincidence due to using the ShmemAlloc() API, not due to necessity. Author: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2L28vHzRankszhqz7deXURxKncxfirnuW68zD7+hVAqaS5GQ@mail.gmail.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/f5930f9a98ea65d659d41600a138e608988ad122 Modified Files -------------- src/backend/storage/ipc/shmem.c | 4 +- src/backend/utils/hash/dynahash.c | 281 +++++++++++++++++++++++++++++--------- src/include/utils/hsearch.h | 3 +- 3 files changed, 222 insertions(+), 66 deletions(-)
On Thu, 3 Apr 2025 at 04:16, Tomas Vondra <tomas.vondra@postgresql.org> wrote: > Improve accounting for memory used by shared hash tables I've not looked into why, but this is causing an issue in the join_rel_hash during add_join_rel(). See the attached script. ERROR: pfree called with invalid pointer 0x60a15edc44e0 (header 0x0000002000000008) bisecting found this commit. #0 BogusFree (pointer=0x5b36cd85a5e8) at ../src/backend/utils/mmgr/mcxt.c:288 __func__ = "BogusFree" #1 0x00005b36c3d70e74 in dir_realloc (hashp=0x5b36cd4c0aa0) at ../src/backend/utils/hash/dynahash.c:1787 new_dirsize = <optimized out> p = 0x5b36cd4c0b30 old_p = 0x5b36cd85a5e8 new_dsize = <optimized out> old_dirsize = <optimized out> p = <optimized out> old_p = <optimized out> new_dsize = <optimized out> old_dirsize = <optimized out> new_dirsize = <optimized out> _vstart = <optimized out> _val = <optimized out> _len = <optimized out> _start = <optimized out> _stop = <optimized out> #2 expand_table (hashp=0x5b36cd4c0aa0) at ../src/backend/utils/hash/dynahash.c:1691 hctl = 0x5b36cd85a298 old_bucket = <optimized out> newlink = <optimized out> nextElement = <optimized out> old_seg = <optimized out> new_bucket = 65536 new_segnum = 256 new_segndx = 0 currElement = <optimized out> new_seg = <optimized out> old_segnum = <optimized out> old_segndx = <optimized out> oldlink = <optimized out> hctl = <optimized out> old_seg = <optimized out> new_seg = <optimized out> old_bucket = <optimized out> new_bucket = <optimized out> new_segnum = <optimized out> new_segndx = <optimized out> old_segnum = <optimized out> old_segndx = <optimized out> oldlink = <optimized out> newlink = <optimized out> currElement = <optimized out> nextElement = <optimized out> #3 hash_search_with_hash_value (hashp=0x5b36cd4c0aa0, keyPtr=0x5b3716ccc6f0, hashvalue=2952019273, action=<optimized out>, foundPtr=<optimized out>) at ../src/backend/utils/hash/dynahash.c:1112 hctl = 0x5b36cd85a298 freelist_idx = 0 keysize = <optimized out> currBucket = <optimized out> prevBucketPtr = <optimized out> match = <optimized out> __func__ = "hash_search_with_hash_value" #4 0x00005b36c3d71031 in hash_search (hashp=<optimized out>, keyPtr=<optimized out>, action=<optimized out>, foundPtr=<optimized out>) at ../src/backend/utils/hash/dynahash.c:1069 No locals. #5 0x00005b36c3b195b3 in add_join_rel (root=0x5b36cd420d28, joinrel=0x5b3716ccc6e8) at ../src/backend/optimizer/util/relnode.c:638 hentry = <optimized out> found = false David
Attachment
On 4/4/25 00:57, David Rowley wrote: > On Thu, 3 Apr 2025 at 04:16, Tomas Vondra <tomas.vondra@postgresql.org> wrote: >> Improve accounting for memory used by shared hash tables > > I've not looked into why, but this is causing an issue in the > join_rel_hash during add_join_rel(). See the attached script. > > ERROR: pfree called with invalid pointer 0x60a15edc44e0 (header > 0x0000002000000008) > Thanks for the report and reproducer. I'll take a look tomorrow. -- Tomas Vondra
On 4/4/25 01:43, Tomas Vondra wrote: > On 4/4/25 00:57, David Rowley wrote: >> On Thu, 3 Apr 2025 at 04:16, Tomas Vondra <tomas.vondra@postgresql.org> wrote: >>> Improve accounting for memory used by shared hash tables >> >> I've not looked into why, but this is causing an issue in the >> join_rel_hash during add_join_rel(). See the attached script. >> >> ERROR: pfree called with invalid pointer 0x60a15edc44e0 (header >> 0x0000002000000008) >> > > Thanks for the report and reproducer. I'll take a look tomorrow. > I took a quick look, and I think the reason is fairly simple - the commit allocates the header and the directory as a single chunk. And for shared hash tables that's fine, because those have non-expandable directory. But the patch does the same thing for non-shared hash tables (not intentionally), which means that if we end up expanding the hash, it fails in dir_realloc(). Because hashp->dir is not a separately allocated chunk. This is clearly a bug in the patch, I should have caught this during a review. But I'm also quite surprised none of the regression tests seems to expand the hash table ... I'll think about a way to fix this tomorrow. -- Tomas Vondra
On 4/4/25 02:41, Tomas Vondra wrote: > On 4/4/25 01:43, Tomas Vondra wrote: >> On 4/4/25 00:57, David Rowley wrote: >>> On Thu, 3 Apr 2025 at 04:16, Tomas Vondra <tomas.vondra@postgresql.org> wrote: >>>> Improve accounting for memory used by shared hash tables >>> >>> I've not looked into why, but this is causing an issue in the >>> join_rel_hash during add_join_rel(). See the attached script. >>> >>> ERROR: pfree called with invalid pointer 0x60a15edc44e0 (header >>> 0x0000002000000008) >>> >> >> Thanks for the report and reproducer. I'll take a look tomorrow. >> > > I took a quick look, and I think the reason is fairly simple - the > commit allocates the header and the directory as a single chunk. And for > shared hash tables that's fine, because those have non-expandable > directory. But the patch does the same thing for non-shared hash tables > (not intentionally), which means that if we end up expanding the hash, > it fails in dir_realloc(). Because hashp->dir is not a separately > allocated chunk. > > This is clearly a bug in the patch, I should have caught this during a > review. But I'm also quite surprised none of the regression tests seems > to expand the hash table ... > > I'll think about a way to fix this tomorrow. > I ended up reverting this. Unfortunately, the patch assumed the directory is pre-allocated and not expanding in more places. I wasn't sure how long would it take me to fix this, or how invasive the fix would be. It seems more appropriate to revert and then maybe apply a reworked patch (not going to happen for PG18). Thanks for the report, sorry for missing the issue in the first place. -- Tomas Vondra