Re: Avoid multiple calls to memcpy (src/backend/access/index/genam.c) - Mailing list pgsql-hackers

From Ranier Vilela
Subject Re: Avoid multiple calls to memcpy (src/backend/access/index/genam.c)
Date
Msg-id CAEudQArY+Kb0EjL1EwdbSecerJ4DsH=ywcBA7_X7eVDrEwdVWQ@mail.gmail.com
In response to Re: Avoid multiple calls to memcpy (src/backend/access/index/genam.c)  (Bryan Green <dbryan.green@gmail.com>)
List pgsql-hackers


On Thu, Mar 12, 2026 at 16:21, Bryan Green <dbryan.green@gmail.com> wrote:
I modified your memcpy1.c program so that the version functions are not inlined.  I changed the memcpy
function call in version 1, added volatile to block some dead-code elimination (DCE) opportunities, and
added a range of N values to keep the compiler from specializing the code for N = 4.  Before these changes,
the compiler eliminated the work as dead code and the test1 function was just a ret.
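
To make the measures above concrete, here is a minimal sketch of what such a hardened micro-benchmark kernel might look like. It is not the actual memcpy1.c: the Key struct, copy_bulk, copy_each, check_all_sizes, sink, and MAX_KEYS are all hypothetical names, and the two copy functions are only an assumed reading of "patch" (one bulk memcpy) versus "original" (one memcpy per element). `__attribute__((noinline))` is a GCC/Clang extension, and noinline alone may not stop every interprocedural optimization.

```c
#include <assert.h>
#include <string.h>

#define MAX_KEYS 8

/* Hypothetical stand-in for a scan-key struct; not the real ScanKeyData. */
typedef struct Key { int attno; int flags; long arg; } Key;

/* Assumed patch-style version: one bulk memcpy, then a fix-up pass. */
__attribute__((noinline))
void copy_bulk(Key *dst, const Key *src, int n)
{
    memcpy(dst, src, (size_t) n * sizeof(Key));
    for (int i = 0; i < n; i++)
        dst[i].attno = i + 1;
}

/* Assumed original-style version: one memcpy per element. */
__attribute__((noinline))
void copy_each(Key *dst, const Key *src, int n)
{
    for (int i = 0; i < n; i++)
    {
        memcpy(&dst[i], &src[i], sizeof(Key));
        dst[i].attno = i + 1;
    }
}

/* Writes to a volatile object cannot be discarded, so the copied
 * data stays observable and the copies are not dead code. */
volatile long sink;

/* Run both versions for n = 1..MAX_KEYS instead of one fixed n,
 * and check that they produce identical results. */
int check_all_sizes(void)
{
    Key src[MAX_KEYS], a[MAX_KEYS], b[MAX_KEYS];
    int checked = 0;

    memset(src, 0, sizeof src);
    for (int i = 0; i < MAX_KEYS; i++)
        src[i].arg = 100 + i;

    for (int n = 1; n <= MAX_KEYS; n++)
    {
        memset(a, 0, sizeof a);
        memset(b, 0, sizeof b);
        copy_bulk(a, src, n);
        copy_each(b, src, n);
        assert(memcmp(a, b, sizeof a) == 0);
        sink += a[n - 1].arg + b[n - 1].arg;
        checked++;
    }
    return checked;
}
```

In a real benchmark, n would more likely come from a runtime input such as argv so the compiler cannot see it as a constant at all.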

The interesting issue is the use of malloc versus the stack.  The use of malloc will probably track closer
with PG's use of palloc so I would say in that case this is an optimization.  It might be fun to compile PG
with and without the patch (in debug mode) and actually see what gets generated for this function.
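
The stack-versus-malloc comparison could be set up roughly as follows. This is only a sketch under assumptions: time_stack, time_malloc, now_ns, copy_each, Key, and MAX_KEYS are hypothetical names, and allocating a fresh buffer on every iteration is a guess at how the malloc variant was structured, not something stated in this thread. `timespec_get` is C11, and `__attribute__((noinline))` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define MAX_KEYS 8

/* Hypothetical stand-in for a scan-key struct. */
typedef struct Key { int attno; int flags; long arg; } Key;

typedef void (*copy_fn)(Key *dst, const Key *src, int n);

/* Example kernel under test: one memcpy per element, kept out of line. */
__attribute__((noinline))
void copy_each(Key *dst, const Key *src, int n)
{
    for (int i = 0; i < n; i++)
        memcpy(&dst[i], &src[i], sizeof(Key));
}

/* Wall-clock time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (int64_t) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Volatile sink so the copied data is observably used. */
volatile long sink;

/* Time `iters` copies into a destination array on the stack. */
int64_t time_stack(copy_fn fn, const Key *src, int n, long iters)
{
    Key dst[MAX_KEYS];
    assert(n >= 1 && n <= MAX_KEYS);

    int64_t start = now_ns();
    for (long it = 0; it < iters; it++)
    {
        fn(dst, src, n);
        sink += dst[n - 1].arg;
    }
    return now_ns() - start;
}

/* Time `iters` copies into a heap buffer allocated on each iteration. */
int64_t time_malloc(copy_fn fn, const Key *src, int n, long iters)
{
    assert(n >= 1 && n <= MAX_KEYS);

    int64_t start = now_ns();
    for (long it = 0; it < iters; it++)
    {
        Key *dst = malloc((size_t) n * sizeof(Key));
        if (dst == NULL)
            abort();
        fn(dst, src, n);
        sink += dst[n - 1].arg;
        free(dst);
    }
    return now_ns() - start;
}
```

Note that time_malloc also measures the cost of malloc and free themselves, so the two timings are not directly comparable in absolute terms; only the patch-to-original ratio within each mode is meaningful.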

Here are the results I got using your modified benchmark:
--- stack allocated ---
stack  n=1  v1(patch): 49721599 ns  v2(original): 21477302 ns  ratio: 2.315  original wins
stack  n=2  v1(patch): 52065462 ns  v2(original): 28765199 ns  ratio: 1.810  original wins
stack  n=3  v1(patch): 58914958 ns  v2(original): 39726110 ns  ratio: 1.483  original wins
stack  n=4  v1(patch): 64585275 ns  v2(original): 47046397 ns  ratio: 1.373  original wins
stack  n=5  v1(patch): 73929844 ns  v2(original): 58588698 ns  ratio: 1.262  original wins
stack  n=6  v1(patch): 95465376 ns  v2(original): 67807817 ns  ratio: 1.408  original wins
stack  n=7  v1(patch): 86910226 ns  v2(original): 76999488 ns  ratio: 1.129  original wins
stack  n=8  v1(patch): 107765417 ns  v2(original): 86046016 ns  ratio: 1.252  original wins

--- malloc allocated ---
malloc n=1  v1(patch): 133283824 ns  v2(original): 141361091 ns  ratio: 0.943  patch wins
malloc n=2  v1(patch): 145625895 ns  v2(original): 180912711 ns  ratio: 0.805  patch wins
malloc n=3  v1(patch): 153975594 ns  v2(original): 228459879 ns  ratio: 0.674  patch wins
malloc n=4  v1(patch): 154483094 ns  v2(original): 248157408 ns  ratio: 0.623  patch wins
malloc n=5  v1(patch): 157710598 ns  v2(original): 298795018 ns  ratio: 0.528  patch wins
malloc n=6  v1(patch): 165196636 ns  v2(original): 332940132 ns  ratio: 0.496  patch wins
malloc n=7  v1(patch): 169576370 ns  v2(original): 358438778 ns  ratio: 0.473  patch wins
malloc n=8  v1(patch): 184463815 ns  v2(original): 403721513 ns  ratio: 0.457  patch wins

Thanks for your attention and tests.

I think the patch can move forward, then.

best regards,
Ranier Vilela
