Small cleanups to tuplesort.c and a bonus small performance improvement - Mailing list pgsql-hackers

From David Rowley
Subject Small cleanups to tuplesort.c and a bonus small performance improvement
Date
Msg-id CAApHDvqZXoDCyrfCzZJR0-xH+7_q+GgitcQiYXUjRani7h4j8Q@mail.gmail.com
Whole thread Raw
Responses Re: Small cleanups to tuplesort.c and a bonus small performance improvement
List pgsql-hackers
Hi,

Since doing some work for PG15 for speeding up sorts, I've been a
little irritated by the fact that dumptuples() calls WRITETUP, (which
is now always calling writetuple()) and calls pfree() on the tuple
only for dumptuples() to do
MemoryContextReset(state->base.tuplecontext) directly afterwards.
These pfrees are just a waste of effort and we might as well leave it
to the context reset to do the cleanup. (Probably especially so when
using AllocSet for storing tuples)

There are only 2 calls to WRITETUP and the other one is always called
when state->slabAllocatorUsed is true.  writetuple() checks for that
before freeing the tuple, which is a bit of a wasted branch since
it'll always prove to be false for the use case in mergeonerun().
(It's possible the compiler might inline that now anyway since the
WRITETUP macro always calls writetuple() directly now)

I've attached 3 patches aimed to do a small amount of cleanup work in
tuplesort.c

0001: Just fixes a broken looking comment in writetuple()
0002: Gets rid of the WRITETUP marco. That does not do anything useful
since 097366c45
0003: Changes writetuple to tell it what it should do in regards to
freeing and adjusting the memory accounting.

Probably 0003 could be done differently. I'm certainly not set on the
bool args. I understand that I'm never calling it with "freetup" ==
true. So other options include 1) rip out the pfree code and that
parameter; or 2) just do the inlining manually at both call sites.

I'll throw this in the September CF to see if anyone wants to look.
There's probably lots more cleaning jobs that could be done in
tuplesort.c.

The performance improvement from 0003 is not that impressive, but it
looks like it makes things very slightly faster, so probably worth it
if the patch makes the code cleaner. See attached gif and script for
the benchmark I ran to test it.  I think the gains might go up
slightly with [1] applied as that patch seems to do more to improve
the speed of palloc() than it does to improve the speed of pfree().

David

[1] https://commitfest.postgresql.org/39/3810/

Attachment

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: Handle infinite recursion in logical replication setup
Next
From: Nathan Bossart
Date:
Subject: Re: use ARM intrinsics in pg_lfind32() where available