Re: WIP: further sorting speedup - Mailing list pgsql-patches

From Tom Lane
Subject Re: WIP: further sorting speedup
Date
Msg-id 16157.1140409142@sss.pgh.pa.us
Whole thread Raw
In response to Re: WIP: further sorting speedup  ("Luke Lonergan" <llonergan@greenplum.com>)
Responses Re: WIP: further sorting speedup
List pgsql-patches
"Luke Lonergan" <llonergan@greenplum.com> writes:
> So you know, we=B9ve done some more work on the external sort to remove the
> =B3tape=B2 abstraction from the code, which makes a significant improvement.

Improvement where?  That code's down in the noise so far as I can tell.
I see results like this (with the patched code):

CPU: P4 / Xeon with 2 hyper-threads, speed 2793.08 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped)
with a unit mask of 0x01 (mandatory) count 240000
samples  %        symbol name
147310   31.9110  tuplesort_heap_siftup
68381    14.8130  comparetup_index
34063     7.3789  btint4cmp
22573     4.8899  AllocSetAlloc
19317     4.1845  writetup_index
18953     4.1057  tuplesort_gettuple_common
18100     3.9209  mergepreread
17083     3.7006  GetMemoryChunkSpace
12527     2.7137  LWLockAcquire
11686     2.5315  LWLockRelease
6172      1.3370  tuplesort_heap_insert
5392      1.1680  index_form_tuple
5323      1.1531  PageAddItem
4943      1.0708  LogicalTapeWrite
4525      0.9802  LogicalTapeRead
4487      0.9720  LockBuffer
4217      0.9135  heapgettup
3891      0.8429  IndexBuildHeapScan
3862      0.8366  ltsReleaseBlock

It appears that a lot of the cycles blamed on tuplesort_heap_siftup are
due to cache misses associated with referencing memtuples[] entries
that have fallen out of L2 cache.  Not sure how to improve that though.

            regards, tom lane

pgsql-patches by date:

Previous
From: "Luke Lonergan"
Date:
Subject: Re: WIP: further sorting speedup
Next
From: "Luke Lonergan"
Date:
Subject: Re: WIP: further sorting speedup