Re: Use generation context to speed up tuplesorts - Mailing list pgsql-hackers

From	Andy Fan
Subject	Re: Use generation context to speed up tuplesorts
Date	February 7, 2022 00:41:25
Msg-id	CAKU4AWq+KRmH9Cb8bsO-+usjxZkn8JsDRGU=fxy8YMFwdD1wOw@mail.gmail.com
In response to	Re: Use generation context to speed up tuplesorts (David Rowley <dgrowleyml@gmail.com>)
List	pgsql-hackers

Hi:

On Thu, Jan 20, 2022 at 9:31 AM David Rowley <dgrowleyml@gmail.com> wrote:

As of now, I still believe we'll need Tomas' patches to allow the
block size to grow up to a maximum size. I think those patches are
likely needed before we think about making tuplesort use generation
contexts. The reason I believe this is that we don't always have good
estimates for the number of tuples we're going to sort.

I spent some time studying this case, and my current thinking is that
we can discuss and commit the minimum committable set of changes: ones
that benefit some cases and do no harm to others. Tomas's patch 0002
ensures that no more blocks are needed if we switch to
GenerationContext (compared with the standard context), and David's
patch clearly reduces total memory usage and improves performance. So
IMO Tomas's patch 0002 + David's patch is a committable patchset for
the first round. Tomas's 0001 patch would be good to have as well.

I double-checked Tomas's 0002 patch and it looks good to me. I then
applied David's patch with ALLOCSET_DEFAULT_SIZES and tested the same
workload.

Here are the results (numbers are tps):

work_mem = '4GB'

| Test Case | master | patched |
|-----------+--------+---------|
| Test 1 | 306 | 406 |
| Test 2 | 225 | 278 |
| Test 3 | 202 | 248 |
| Test 4 | 184 | 218 |
| Test 5 | 281 | 360 |

work_mem = '4MB'

| Test Case | master | patched |
|-----------+--------+---------|
| Test 1 | 124 | 409 |
| Test 2 | 106 | 280 |
| Test 3 | 100 | 249 |
| Test 4 | 97 | 218 |
| Test 5 | 120 | 369 |
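
To put the two tables in relative terms, here is a small sketch that turns each row into a patched/master ratio. The struct and function names (TpsResult, speedup, min_speedup, max_speedup) are made up for illustration; only the tps figures come from the tables above. By this arithmetic the gain is roughly 1.2x to 1.3x with work_mem = '4GB' and roughly 2.2x to 3.3x with work_mem = '4MB'.

```c
#include <assert.h>
#include <stddef.h>

/* One table row: throughput (tps) on master and with the patches applied. */
typedef struct TpsResult
{
    const char *test_case;
    double      master_tps;
    double      patched_tps;
} TpsResult;

/* Figures copied from the work_mem = '4GB' table. */
static const TpsResult results_4gb[] = {
    {"Test 1", 306, 406},
    {"Test 2", 225, 278},
    {"Test 3", 202, 248},
    {"Test 4", 184, 218},
    {"Test 5", 281, 360},
};

/* Figures copied from the work_mem = '4MB' table. */
static const TpsResult results_4mb[] = {
    {"Test 1", 124, 409},
    {"Test 2", 106, 280},
    {"Test 3", 100, 249},
    {"Test 4", 97, 218},
    {"Test 5", 120, 369},
};

/* Patched/master throughput ratio; above 1.0 means the patched build is faster. */
static double
speedup(const TpsResult *r)
{
    assert(r->master_tps > 0.0);
    return r->patched_tps / r->master_tps;
}

/* Smallest ratio across the rows of one table. */
static double
min_speedup(const TpsResult *rows, size_t nrows)
{
    double      best;
    size_t      i;

    assert(nrows > 0);
    best = speedup(&rows[0]);
    for (i = 1; i < nrows; i++)
    {
        double      s = speedup(&rows[i]);

        if (s < best)
            best = s;
    }
    return best;
}

/* Largest ratio across the rows of one table. */
static double
max_speedup(const TpsResult *rows, size_t nrows)
{
    double      best;
    size_t      i;

    assert(nrows > 0);
    best = speedup(&rows[0]);
    for (i = 1; i < nrows; i++)
    {
        double      s = speedup(&rows[i]);

        if (s > best)
            best = s;
    }
    return best;
}
```

Calling min_speedup(results_4mb, 5) and max_speedup(results_4mb, 5) gives the range for the 4MB case; the same calls on results_4gb give the 4GB range.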

I didn't see as large a performance improvement as David reported at
the beginning. I think that is because David used
ALLOCSET_DEFAULT_MAXSIZE directly, which needs fewer memory allocations.
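
A rough sketch of why the starting block size matters: the function below counts how many blocks are needed to cover a given number of bytes when the block size starts small and doubles up to a cap, versus starting at the cap. This is a simplified model, not the actual allocator code. It ignores block and chunk headers and oversized chunks, and count_blocks and the SKETCH_* constants are hypothetical names. The 8kB and 8MB values are meant to mirror ALLOCSET_DEFAULT_INITSIZE and ALLOCSET_DEFAULT_MAXSIZE as I understand memutils.h.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match ALLOCSET_DEFAULT_INITSIZE (8kB) and ALLOCSET_DEFAULT_MAXSIZE (8MB). */
#define SKETCH_INIT_BLOCK_SIZE ((size_t) 8 * 1024)
#define SKETCH_MAX_BLOCK_SIZE  ((size_t) 8 * 1024 * 1024)

/*
 * Count the blocks needed to cover total_bytes when the first block is
 * init_size bytes and each later block doubles in size until max_size
 * is reached.  Passing init_size == max_size models a fixed block size.
 */
static size_t
count_blocks(size_t total_bytes, size_t init_size, size_t max_size)
{
    size_t      blocks = 0;
    size_t      allocated = 0;
    size_t      block_size = init_size;

    assert(init_size > 0 && init_size <= max_size);

    while (allocated < total_bytes)
    {
        allocated += block_size;
        blocks++;

        /* Double the next block's size, clamped to the maximum. */
        if (block_size < max_size)
        {
            block_size *= 2;
            if (block_size > max_size)
                block_size = max_size;
        }
    }
    return blocks;
}
```

Under this model, covering 64MB takes 18 blocks when growing from 8kB to 8MB, but only 8 blocks when every block is 8MB from the start. The gap is a fixed number of extra small blocks at the beginning, so it matters less as the sort gets larger.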

AFAICS, Tomas's patch 0002 + David's patch should be ready for commit
in round 1. After that, we can explore other opportunities, such as
using the row estimate to size the initial allocation, and the
GenerationContext improvements in 0003/0004. Would this work?

Best Regards

Andy Fan
