Re: B-Tree index builds, CLUSTER, and sortsupport - Mailing list pgsql-hackers

From Andreas Karlsson
Subject Re: B-Tree index builds, CLUSTER, and sortsupport
Date
Msg-id 545AC1D9.1040009@proxel.se
Whole thread Raw
In response to B-Tree index builds, CLUSTER, and sortsupport  (Peter Geoghegan <pg@heroku.com>)
Responses Re: B-Tree index builds, CLUSTER, and sortsupport  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On 10/11/2014 02:26 AM, Peter Geoghegan wrote:> Both Robert and Heikki 
expressed dissatisfaction with the fact that> B-Tree index builds don't use sortsupport. Because B-Tree index builds>
cannotreally use the "onlyKey" optimization, the historic oversight> of not supporting B-Tree builds (and CLUSTER)
mighthave been at least> a little understandable [1]. But with the recent work on sortsupport> for text, it's likely
thatB-Tree index builds will be missing out on> rather a lot by not taking advantage of sortsupport.>> Attached patch
modifiestuplesort to correct this oversight. What's> really nice about it is that there is actually a net negative
code>footprint:>>   src/backend/access/nbtree/nbtsort.c  |  63 +++--->   src/backend/utils/sort/sortsupport.c |  77
++++++-->  src/backend/utils/sort/tuplesort.c   | 274 +++++++++++---------------->   src/include/utils/sortsupport.h
 |   3 +>   4 files changed, 203 insertions(+), 214 deletions(-)
 

The code compiles and passes the test suite.

I looked at the changes to the code. The new code is clean and there is 
more code re-use and improved readability. On possible further 
improvement would be to move the preparation of SortSupport to a common 
function since this is done three time in the code.

I did some simple benchmarks by adding indexes to temporary tables and 
could see improvements of around 10% in index build time. So it gives a 
nice, but not amazing, performance improvement.

Is there any case where we should expect any greater performance 
improvement?

Either way I find this a nice patch which improves code quality and 
performance.

= Minor code style issues I found

- There is a double space in "strategy  = (scanKey->sk_flags [...]".
- I think there should be a newline in tuplesort_begin_index_btree() 
before "/* Prepare SortSupport data for each column */".
- Remove the extra newline after reversedirection().
- End sentences in comments with period. That seems to be the common 
practice in the project.

-- 
Andreas Karlsson



pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: recovery_target_time and standby_mode
Next
From: Stephen Frost
Date:
Subject: Re: numeric_normalize() is a few bricks shy of a load