Thread: Small cleanups to tuplesort.c and a bonus small performance improvement
Hi, Since doing some work for PG15 for speeding up sorts, I've been a little irritated by the fact that dumptuples() calls WRITETUP, (which is now always calling writetuple()) and calls pfree() on the tuple only for dumptuples() to do MemoryContextReset(state->base.tuplecontext) directly afterwards. These pfrees are just a waste of effort and we might as well leave it to the context reset to do the cleanup. (Probably especially so when using AllocSet for storing tuples) There are only 2 calls to WRITETUP and the other one is always called when state->slabAllocatorUsed is true. writetuple() checks for that before freeing the tuple, which is a bit of a wasted branch since it'll always prove to be false for the use case in mergeonerun(). (It's possible the compiler might inline that now anyway since the WRITETUP macro always calls writetuple() directly now) I've attached 3 patches aimed to do a small amount of cleanup work in tuplesort.c 0001: Just fixes a broken looking comment in writetuple() 0002: Gets rid of the WRITETUP marco. That does not do anything useful since 097366c45 0003: Changes writetuple to tell it what it should do in regards to freeing and adjusting the memory accounting. Probably 0003 could be done differently. I'm certainly not set on the bool args. I understand that I'm never calling it with "freetup" == true. So other options include 1) rip out the pfree code and that parameter; or 2) just do the inlining manually at both call sites. I'll throw this in the September CF to see if anyone wants to look. There's probably lots more cleaning jobs that could be done in tuplesort.c. The performance improvement from 0003 is not that impressive, but it looks like it makes things very slightly faster, so probably worth it if the patch makes the code cleaner. See attached gif and script for the benchmark I ran to test it. I think the gains might go up slightly with [1] applied as that patch seems to do more to improve the speed of palloc() than it does to improve the speed of pfree(). David [1] https://commitfest.postgresql.org/39/3810/
Attachment
On Fri, 26 Aug 2022 at 16:48, David Rowley <dgrowleyml@gmail.com> wrote: > 0003: Changes writetuple to tell it what it should do in regards to > freeing and adjusting the memory accounting. > > Probably 0003 could be done differently. I'm certainly not set on the > bool args. I understand that I'm never calling it with "freetup" == > true. So other options include 1) rip out the pfree code and that > parameter; or 2) just do the inlining manually at both call sites. This patch series needed to be rebased and on looking it at again, since the pfree() code is never used I felt it makes very little sense to keep it, so I decided that it might be better just to keep the WRITETUP macro and just completely get rid of the writetuple function and have the macro call the function pointed to be the "writetup" pointer. The only extra code we needed from writetuple() was the memory accounting code which was only used in dumptuples(), so I've just included that code in that function instead. I also noticed that dumptuples() had a pretty braindead method of zeroing out state->memtupcount by subtracting 1 from it on each loop. Since that's not being used to keep track of the loop's progress, I've just moved it out the loop and changed the code to set it to 0 once the loop is done. > I'll throw this in the September CF to see if anyone wants to look. > There's probably lots more cleaning jobs that could be done in > tuplesort.c. My current thoughts are that this is a very trivial patch and unless there's any objections I plan to push it soon. David
Attachment
On Wed, 31 Aug 2022 at 22:39, David Rowley <dgrowleyml@gmail.com> wrote: > My current thoughts are that this is a very trivial patch and unless > there's any objections I plan to push it soon. Pushed. David