Re: Use generation context to speed up tuplesorts - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: Use generation context to speed up tuplesorts
Date	August 6, 2021 16:07:27
Msg-id	CAApHDvqMyMQc9b-mBnGvqsudfVysgD4Xz7c7LsGrP524bsv47w@mail.gmail.com
In response to	Re: Use generation context to speed up tuplesorts (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses	Re: Use generation context to speed up tuplesorts
List	pgsql-hackers

On Wed, 4 Aug 2021 at 02:10, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> A review would be nice, although it can wait - It'd be interesting to
> know if those patches help with the workload(s) you've been looking at.

I tried out the v2 set of patches using the attached scripts.  The
attached spreadsheet includes the original tests and compares three
builds: master, master plus the patch which uses the generation
context, and that patch plus your v2 patches.

I've also included 4 additional tests, each of which starts with a
1-column table and then adds another 32 columns, testing the
performance after adding each additional column. I did this because I
wanted to see if the performance was more similar to master when the
allocations had less power-of-2 wastage from allocset. If, for
example, you look at row 123 of the spreadsheet, you can see that the
allocations were 272 bytes each both patched and unpatched, yet there
was still a 50% performance improvement with just the generation
context patch when compared to master.
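
As a rough illustration of the "power of 2 wastage" mentioned above, the following Python sketch rounds a request up to the next power of 2 and reports the unused bytes. It assumes a simplified model (an 8-byte minimum chunk, no chunk header, no separate handling of large requests); the function names are hypothetical and this is not the actual aset.c logic.

```python
def next_pow2(request: int, minimum: int = 8) -> int:
    """Round request up to the next power of 2 (assumed 8-byte minimum)."""
    size = minimum
    while size < request:
        size <<= 1
    return size


def pow2_waste(request: int) -> int:
    """Bytes left unused when the request is rounded up to a power of 2."""
    return next_pow2(request) - request


if __name__ == "__main__":
    # Requests just above a power of 2 waste the most space.
    for request in (24, 40, 100, 248, 256, 260):
        rounded = next_pow2(request)
        print(f"request={request:4d} rounded={rounded:4d} "
              f"waste={pow2_waste(request):4d}")
```

Under this model a request that is already a power of 2 wastes nothing, while one just over a power of 2 wastes almost half of its chunk.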

Looking at the spreadsheet, you'll also notice that in the 2 column
test of each of the 4 new tests the number of bytes used for each
allocation is larger with the generation context: 56 vs 48.  This is
due to the GenerationChunk struct being larger than the Allocset's
version by 8 bytes.  This is because it also holds a pointer to the
GenerationBlock.  So with the patch there are some cases where we'll
use slightly more memory.
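
The trade-off between a larger chunk header and no power-of-2 rounding can be sketched in a few lines of Python. The header sizes (16 and 24 bytes), the 8-byte alignment, and the payload sizes below are assumptions chosen only to be consistent with the 8-byte difference and the 48 vs 56 figures above; they are not taken from the PostgreSQL source, and the function names are hypothetical.

```python
ALIGNMENT = 8            # assumed maximum alignment in bytes
ALLOCSET_HEADER = 16     # assumed allocset chunk header size
GENERATION_HEADER = 24   # assumed to be 8 bytes larger, per the text above


def align_up(n: int) -> int:
    """Round n up to a multiple of ALIGNMENT."""
    return (n + ALIGNMENT - 1) & ~(ALIGNMENT - 1)


def next_pow2(n: int, minimum: int = 8) -> int:
    """Round n up to the next power of 2 (assumed 8-byte minimum)."""
    size = minimum
    while size < n:
        size <<= 1
    return size


def allocset_footprint(request: int) -> int:
    """Header plus payload rounded up to a power of 2 (simplified model)."""
    return ALLOCSET_HEADER + next_pow2(request)


def generation_footprint(request: int) -> int:
    """Header plus payload rounded up to the alignment only."""
    return GENERATION_HEADER + align_up(request)


if __name__ == "__main__":
    for request in (32, 40, 248):
        print(f"request={request:3d} "
              f"allocset={allocset_footprint(request):3d} "
              f"generation={generation_footprint(request):3d}")
```

In this model a hypothetical 32-byte request costs 48 bytes with allocset and 56 with generation, a 40-byte request costs 80 vs 64, and a 248-byte request costs 272 either way, so the larger header only loses when the request is already at or very near a power of 2.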

Additional tests:

1. Sort 10000 tuples on a column with values 0-99 in memory.
2. As #1 but with 1 million tuples.
3. As #1 but with a large OFFSET to remove the overhead of sending to the client.
4. As #2 but with a large OFFSET.

Test #3 above is the most similar one to the original tests and shows
similar gains. When the sort becomes larger (the 1 million tuple
tests), the gains shrink. This indicates the gains are coming from
improved CPU cache efficiency from the removal of the power-of-2
wastage in memory allocations.

All of the tests show that the patches to improve the allocation
efficiency of generation.c don't help to improve the results of the
test cases. I wonder if it's worth seeing what happens if, instead of
doubling the allocation size each time, we quadruple it. I didn't try
this.
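
To get a feel for the doubling vs quadrupling idea, here is a small Python sketch that counts how many allocations a growing-size scheme needs before a given number of bytes is covered. The initial size, the cap, and the function name are hypothetical values picked for illustration, not parameters from the patch.

```python
def allocations_needed(total_bytes: int, init_size: int,
                       max_size: int, factor: int) -> int:
    """Count allocations needed to cover total_bytes when each allocation
    is `factor` times the previous one, capped at max_size."""
    count = 0
    allocated = 0
    size = init_size
    while allocated < total_bytes:
        allocated += size
        count += 1
        size = min(size * factor, max_size)
    return count


if __name__ == "__main__":
    total = 1024 * 1024          # 1 MB of tuple data (hypothetical)
    init = 8 * 1024              # hypothetical initial size
    cap = 8 * 1024 * 1024        # hypothetical maximum size
    for factor in (2, 4):
        n = allocations_needed(total, init, cap, factor)
        print(f"growth factor {factor}: {n} allocations")
```

With these made-up numbers, quadrupling covers the same data in fewer allocations than doubling, at the cost of a larger final allocation that may be mostly unused. Whether that matters for sort performance would need to be measured.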

David
