Re: [PATCH] Use optimized single-datum tuplesort in ExecSort - Mailing list pgsql-hackers

From David Rowley
Subject Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
Date
Msg-id CAApHDvp3a5Kv=6D+uADGUWrcTRC_QL9ZbRMqRi7h+PVKy3mPMA@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Use optimized single-datum tuplesort in ExecSort  (Ronan Dunklau <ronan.dunklau@aiven.io>)
Responses Re: [PATCH] Use optimized single-datum tuplesort in ExecSort  (Ronan Dunklau <ronan.dunklau@aiven.io>)
List pgsql-hackers
On Wed, 7 Jul 2021 at 21:32, Ronan Dunklau <ronan.dunklau@aiven.io> wrote:
> In the meantime I fixed some formatting issues, please find attached a new
> patch.

I started to look at this.

First I wondered how often we might be able to apply this
optimisation, so I ran make check after adding some elog(NOTICE) calls
to output which method is going to be used just before we do the
tuplestore_begin_* calls.  It looks like there are 614 instances of
Datum sorts and 4223 of tuple sorts. That's about 14.5% datum sorts.
223 of the 614 are byval types and the other 391 are byref. Not that
the regression tests are a good reflection of the real world, but if
it were then that's quite a good number of cases to be able to
optimise.

As for the patch, just a few things:

1. Can you add the missing braces in this if condition and the else
condition that belongs to it.

+ if (node->is_single_val)
+ for (;;)
+ {

2. I think it would nicer to name the new is_single_val field
"datumSort" instead. To me it seems more clear what it is for.

3. This seems to be a bug fix where byval datum sorts do not properly
handle bounded sorts.  I think that maybe that should be fixed and
backpatched.  I don't see anything that says Datum sorts can't be
bounded and if there were some restriction on that I'd expect
tuplesort_set_bound() to fail when the Tuplesortstate had been set up
with tuplesort_begin_datum().

 static void
 free_sort_tuple(Tuplesortstate *state, SortTuple *stup)
 {
- FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
- pfree(stup->tuple);
+ /*
+ * If the SortTuple is actually only a single Datum, which was not copied
+ * as it is a byval type, do not try to free it nor account for it in
+ * memory used.
+ */
+ if (stup->tuple)
+ {
+ FREEMEM(state, GetMemoryChunkSpace(stup->tuple));
+ pfree(stup->tuple);
+ }

I can take this to another thread.

That's all I have for now.

David



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: pg_stats and range statistics
Next
From: David Rowley
Date:
Subject: Is tuplesort meant to support bounded datum sorts?