Re: A potential memory leak on Merge Join when Sort node is not below Materialize node - Mailing list pgsql-hackers

From David Rowley
Subject Re: A potential memory leak on Merge Join when Sort node is not below Materialize node
Date
Msg-id CAApHDvpxQsTX0rYceTQYeSqUZzpyFQkFVKOV=UUB68=NLcJAFg@mail.gmail.com
Whole thread Raw
In response to Re: A potential memory leak on Merge Join when Sort node is not below Materialize node  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: A potential memory leak on Merge Join when Sort node is not below Materialize node
List pgsql-hackers
On Thu, 29 Sept 2022 at 12:07, Peter Geoghegan <pg@bowt.ie> wrote:
> Also potentially relevant: the 2017 commit fa117ee4 anticipated adding
> a "copy" argument to tuplesort_getdatum() (the same commit added such
> a "copy" argument to tuplesort_gettupleslot()). I see that that still
> hasn't happened to tuplesort_getdatum() all these years later. Might
> be a good idea to do it in the next year or two, though.
>
> If David is interested in pursuing this now then I certainly won't object.

Just while this is fresh in my head, I wrote some code to make this
happen.  My preference would be not to add the "copy" param to the
existing function and instead just add a new function to prevent
additional branching.

The attached puts back the datum sort in nodeSort.c for byref types
and adjusts process_ordered_aggregate_single() to make use of this
function.

I did a quick benchmark to see if this help DISTINCT aggregate any:

create table t1 (a varchar(32) not null, b varchar(32) not null);
insert into t1 select md5((x%10)::text),md5((x%10)::text) from
generate_Series(1,1000000)x;
vacuum freeze t1;
create index on t1(a);

With a work_mem of 256MBs I get:

query = select max(distinct a), max(distinct b) from t1;

Master:
latency average = 313.197 ms

Patched:
latency average = 304.335 ms

So not a very impressive speedup there (about 3%)

Some excerpts from perf top show:

Master:
   1.40%  postgres          [.] palloc
   1.13%  postgres          [.] tuplesort_getdatum
   0.77%  postgres          [.] datumCopy

Patched:
   0.91%  postgres          [.] tuplesort_getdatum_nocopy
   0.65%  postgres          [.] palloc

I stared for a while at the mode_final() function and thought maybe we
could use the nocopy variant there. I just didn't quite pluck up the
motivation to write any code to see if it could be made faster.

David

Attachment

pgsql-hackers by date:

Previous
From: Zhang Mingli
Date:
Subject: Re: Refactor UnpinBuffer()
Next
From: Peter Geoghegan
Date:
Subject: Re: A potential memory leak on Merge Join when Sort node is not below Materialize node