Re: Reduce TupleHashEntryData struct size by half - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Reduce TupleHashEntryData struct size by half
Date
Msg-id 81791dedec03d750c6fc06369a465fa74031adc7.camel@j-davis.com
Whole thread Raw
In response to Re: Reduce TupleHashEntryData struct size by half  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Fri, 2025-02-28 at 17:09 +1300, David Rowley wrote:
> * my_log2() takes a "long" parameter type but transitionSpace is a
> "Size". These types aren't the same width on all platforms we
> support.
> Maybe pg_nextpower2_size_t() is a better option.

Done.

> * Should the following have MAXALIGN(tupleSize) on the right side of
> the expression?

Done.

> Maybe you could just replace the while loop and the subsequent "if"
> check with:

Done.

> * In hash_create_memory(), can you get rid of the minContextSize and
> initBlockSize variables?

Done.

> * Is it worth an Assert() theck additionalsize > 0?

One caller happens to call it unconditionally. It seems better to
return NULL if additionalsize == 0.

> create table c (a int not null);
> insert into c select a from generate_Series(1,1000) a;
> vacuum freeze analyze c;

The schema you posted is the narrower table, and your numbers better
match the wide table you posted before. Was there a mixup?

> master: 3653.9853988
> v7-0001: 3741.815979
> v7-0001+0002: 3737.4313064
> v7-0001+0002+0003: 3834.6271706
> v7-0001+0002+0004+0004: 3912.1158887
>
> I also tried out changing hash_agg_check_limits() so that it no
> longer
> calls MemoryContextMemAllocated and instead uses ->mem_allocated
> directly and with all the other patches got:
>
> v7-0001+0002+0004+0004+extra: 4049.0732381

Great!

> We probably shouldn't do exactly that as it be better not to access
> that internal field from outside the memory context code.  A static
> inline function in memutils.h that handles the non-recursive callers
> might be nice.

Both the metacxt as well as the context used for byref transition
values can have child contexts, so we should keep the recursion. I just
inlined MemoryContextMemAllocated() and MemoryContextTraverseNext().

> I've attached my results of running your test in graph form.

Thank you!

My results (with wide tables):

                        GROUP BY      EXCEPT
   master:                  2151        1732
   entire v8 series:        2054        1740

In some of the patches in the middle of the series, I ran into some
hard-to-explain regressions, so consider my results preliminary. I may
need to profile and figure out what's going on. I didn't see any
overall regression.

But the series overall seems about even, while the memory consumption
is ~35% lower for the example I posted in the first message in the
thread.


>   How about hacking something up that
> adds an additionalsize field to TupleDesc and then set that field to
> your additional size and have heap_form_minimal_tuple() allocate that
> much extra memory?

I assume we wouldn't want to actually add a field to TupleDescData,
right?

When I reworked the ExecCopySlotMinimalTupleExtra() API to place the
extra memory before the tuple, it worked out to be a bit cleaner with
fewer special cases, so I'm fine with that API now.

Regards,
    Jeff Davis


Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: per backend WAL statistics
Next
From: James Hunter
Date:
Subject: Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators