Re: Simplify newNode() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Simplify newNode()
Date
Msg-id 2872058.1702593854@sss.pgh.pa.us
Whole thread Raw
In response to Re: Simplify newNode()  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Simplify newNode()
Re: Simplify newNode()
Re: Simplify newNode()
List pgsql-hackers
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> On 14/12/2023 10:32, Peter Eisentraut wrote:
>> But if we think that compilers are now smart enough, maybe we can unwind
>> this whole stack a bit more?  Maybe we don't need MemSetTest() and/or
>> palloc0fast() and/or newNode() at all?

> Good point. Looking closer, modern compilers will actually turn the 
> MemSetLoop() in MemoryContextAllocZeroAligned() into a call to memset() 
> anyway! Funny. That is true for recent versions of gcc, clang, and MSVC. 

I experimented with the same planner-intensive test case that I used
in the last discussion back in 2008.  I got these results:

HEAD:
tps = 144.974195 (without initial connection time)

v1 patch:
tps = 146.302004 (without initial connection time)

v2 patch:
tps = 144.882208 (without initial connection time)

While there's not much daylight between these numbers, the times are
quite reproducible for me.  This is with RHEL8's gcc 8.5.0 on x86_64.
That's probably a bit trailing-edge in terms of what people might be
using with v17, but I don't think it's discountable.

I also looked at the backend's overall code size per size(1):

HEAD:
   text    data     bss     dec     hex filename
8613007  100192  220176 8933375  884fff testversion.stock/bin/postgres

v1 patch:
   text    data     bss     dec     hex filename
8615126  100192  220144 8935462  885826 testversion.v1/bin/postgres

v2 patch:
   text    data     bss     dec     hex filename
8595322  100192  220144 8915658  880aca testversion.v2/bin/postgres

I did check that the v1 patch successfully inlines newNode() and
reduces it to just a MemoryContextAllocZeroAligned call, so it's
correct that modern compilers do that better than whatever I tested
in 2008.  But I wonder what is happening in v2 to reduce the code
size so much.  MemoryContextAllocZeroAligned is not 20kB by itself.

> Good point. Looking closer, modern compilers will actually turn the 
> MemSetLoop() in MemoryContextAllocZeroAligned() into a call to memset() 
> anyway! Funny. That is true for recent versions of gcc, clang, and MSVC. 

Not here ...

> Yeah, +1 on removing all that, including MemoryContextAllocZeroAligned. 
> It's not doing any good as it is, as it gets compiled to be identical to 
> MemoryContextAllocZero.

Also not so here.  Admittedly, my results don't make much of a case
for keeping the two code paths, even on compilers where it matters.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: useless LIMIT_OPTION_DEFAULT
Next
From: Thomas Munro
Date:
Subject: Re: Simplify newNode()