Re: Reduce TupleHashEntryData struct size by half - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: Reduce TupleHashEntryData struct size by half |
Date | |
Msg-id | CAApHDvpN4v3t_sdz4dvrv1Fx_ZPw=twSnxuTEytRYP7LFz5K9A@mail.gmail.com Whole thread Raw |
In response to | Re: Reduce TupleHashEntryData struct size by half (Heikki Linnakangas <hlinnaka@iki.fi>) |
List | pgsql-hackers |
On Thu, 13 Feb 2025 at 14:01, Jeff Davis <pgsql@j-davis.com> wrote: > Attached a new patchset that doesn't change the API for > ExecCopySlotMinimalTuple(). Instead, I'm using > ExecFetchSlotMinimalTuple(), which avoids the extra memcpy if the slot > is TTSOpsMinimalTuple, which is what HashAgg uses. I separated out the > change to use ExecFetchSlotMinimalTuple() into 0004 to illustrate the > performance impact. Some review comments: * my_log2() takes a "long" parameter type but transitionSpace is a "Size". These types aren't the same width on all platforms we support. Maybe pg_nextpower2_size_t() is a better option. * Should the following have MAXALIGN(tupleSize) on the right side of the expression? tupleChunkSize = tupleSize I understand this was missing before, but both bump.c does consume at least MAXALIGN(<req_size>) in BumpAlloc(). * while (16 * maxBlockSize > work_mem * 1024L) The 1024L predates the change made in 041e8b95. "1024L" needs to be replaced with "(Size) 1024" Maybe you could just replace the while loop and the subsequent "if" check with: /* * Like CreateWorkExprContext(), use smaller sizings for smaller work_mem, * to avoid large jumps in memory usage. */ maxBlockSize = pg_prevpower2_size_t(work_mem * (Size) 1024 / 16); /* But no bigger than ALLOCSET_DEFAULT_MAXSIZE */ maxBlockSize = Min(maxBlockSize, ALLOCSET_DEFAULT_MAXSIZE); /* and no smaller than ALLOCSET_DEFAULT_INITSIZE */ maxBlockSize = Max(maxBlockSize, ALLOCSET_DEFAULT_INITSIZE); I believe that gives the same result without the looping. * In hash_create_memory(), can you get rid of the minContextSize and initBlockSize variables? * Is it worth an Assert() theck additionalsize > 0? * The amount of space available is the additionalsize requested in the call * to BuildTupleHashTable(). If additionalsize was specified as zero, no * additional space is available and this function should not be called. */ static inline void * TupleHashEntryGetAdditional(TupleHashTable hashtable, TupleHashEntry entry) { return (char *) entry->firstTuple - hashtable->additionalsize; } Benchmarks: I was also experimenting with the performance of this using the following test case: create table c (a int not null); insert into c select a from generate_Series(1,1000) a; vacuum freeze analyze c; Query: select a,count(*) from c group by a; Average TPS over 10x 10 second runs with -M prepared master: 3653.9853988 v7-0001: 3741.815979 v7-0001+0002: 3737.4313064 v7-0001+0002+0003: 3834.6271706 v7-0001+0002+0004+0004: 3912.1158887 I also tried out changing hash_agg_check_limits() so that it no longer calls MemoryContextMemAllocated and instead uses ->mem_allocated directly and with all the other patches got: v7-0001+0002+0004+0004+extra: 4049.0732381 We probably shouldn't do exactly that as it be better not to access that internal field from outside the memory context code. A static inline function in memutils.h that handles the non-recursive callers might be nice. I've attached my results of running your test in graph form. I also see a small regression for these small scale tests. I wondering if it's worth mocking up some code to see what the performance would be like without the additional memcpy() that's new to LookupTupleHashEntry_internal(). How about hacking something up that adds an additionalsize field to TupleDesc and then set that field to your additional size and have heap_form_minimal_tuple() allocate that much extra memory? David
Attachment
pgsql-hackers by date: