Re: Polyphase merge is obsolete - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Polyphase merge is obsolete
Date
Msg-id 882e7ef2-ae13-d085-c2e8-0b75b931f7b7@iki.fi
In response to Re: Polyphase merge is obsolete  (John Naylor <john.naylor@enterprisedb.com>)
Responses Re: Polyphase merge is obsolete  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-hackers
On 05/10/2021 20:24, John Naylor wrote:
> I've had a chance to review and test out the v5 patches.

Thanks! I fixed the stray reference to PostgreSQL 14 that Zhihong 
mentioned, and pushed.

> I've done some performance testing of master versus both patches 
> applied. The full results and test script are attached, but I'll give a 
> summary here. A variety of value distributions were tested, with 
> work_mem from 1MB to 16MB, plus 2GB which will not use external sort at 
> all. I settled on 2 million records for the sort, to have something 
> large enough to work with but also keep the test time reasonable. That 
> works out to about 130MB on disk. We have recent improvements to datum 
> sort, so I used both single values and all values in the SELECT list.
> 
> The system was on a Westmere-era Xeon with gcc 4.8. pg_prewarm was run 
> on the input tables. The raw measurements were reduced to the minimum of 
> five runs.
> 
> I can confirm that sort performance is improved with small values of 
> work_mem. That was not the motivating reason for the patch, but it's a 
> nice bonus. Even as high as 16MB work_mem, it's possible some of the 
> 4-6% differences represent real improvement and not just noise or binary 
> effects, but it's much more convincing at 4MB and below: non-datum 
> integer sorts are 25-30% faster at 1MB work_mem. The nominal 
> regressions seem within the noise level, with one exception that only 
> showed up in one set of measurements (-10.89% in the spreadsheet). I'm 
> not sure what to make of that since it only happens in one combination 
> of factors and nowhere else.

That's a bit odd, but given how many data points there are, I think we 
can write it off as random noise.
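
For anyone reproducing the comparison, the reduction described in the quoted text (minimum of five runs per configuration, then a percentage difference between master and the patched build) could be sketched roughly as below. This is not the attached test script. The function names, the sign convention (positive means the patched build is faster) and the timing values are all assumptions made for illustration, and none of the numbers come from the attached results.

```python
def reduce_runs(timings_ms):
    """Reduce repeated measurements of one configuration to their minimum."""
    if not timings_ms:
        raise ValueError("need at least one measurement")
    return min(timings_ms)


def percent_improvement(master_ms, patched_ms):
    """Relative difference in percent; positive means patched was faster.

    The sign convention is an assumption, chosen so that a regression
    shows up as a negative number.
    """
    return (master_ms - patched_ms) / master_ms * 100.0


# Illustrative timings only (milliseconds, five runs each).
# These are made-up values, not measurements from the thread.
master_runs = [1000.0, 1020.0, 1010.0, 1005.0, 1030.0]
patched_runs = [760.0, 750.0, 770.0, 755.0, 765.0]

improvement = percent_improvement(
    reduce_runs(master_runs), reduce_runs(patched_runs)
)
print(f"{improvement:.1f}% faster")
```

Taking the minimum rather than the mean is a common way to suppress one-sided noise from the OS and caches, but it does not remove it entirely, so a single outlying cell among many combinations can still appear.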

- Heikki


