Re: Use generation context to speed up tuplesorts - Mailing list pgsql-hackers

From David Rowley
Subject Re: Use generation context to speed up tuplesorts
Date
Msg-id CAApHDvpgA41E1YvTQeJ=meYpCa3M-wNvt--NCjb6URX+Tg0F_w@mail.gmail.com
In response to Re: Use generation context to speed up tuplesorts  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Mon, 9 Aug 2021 at 00:38, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> I'm not sure quadrupling the size is a good idea, though, because it
> increases the amount of memory we might be wasting. With the doubling,
> the amount of wasted/unused memory is limited to ~50%, because the next
> block is (roughly) equal to sum of already allocated blocks, so
> allocating just 1B on it leaves us with 50%. But quadrupling the size
> means we'll end up with ~75% free space. Of course, this is capped by
> the maximum block size etc. but still ...

Yeah, not sure what is best. It does however seem likely that the
majority of the performance improvement I saw comes either from making
fewer malloc()/free() calls or simply from having fewer blocks in the
context.

Maybe it's worth getting the planner on board with deciding how to do
the allocations.  It feels a bit overcautious to go allocating blocks
in each power of two starting at 8192 bytes when doing a 1GB sort.
Maybe we should look towards setting the initial allocation size to
something like pg_prevpower2_64(Min(work_mem * 1024L, sort_tuples *
tuple_width)), or maybe half or a quarter of that.

It would certainly not be the only executor node to allocate memory
based on what the planner thought. Just look at ExecHashTableCreate().

David


