WIP: further sorting speedup - Mailing list pgsql-patches

From Tom Lane
Subject WIP: further sorting speedup
Date
Msg-id 15464.1140403246@sss.pgh.pa.us
List pgsql-patches
After applying Simon's recent sort patch, I was doing some profiling and
noticed that sorting spends an unreasonably large fraction of its time
extracting datums from tuples (heap_getattr or index_getattr).  The
attached patch does something about this by pulling out the leading sort
column of a tuple when it is received by the sort code or re-read from a
"tape".  This increases the space needed by 8 or 12 bytes (depending on
sizeof(Datum)) per in-memory tuple, but doesn't cost anything as far as
the on-disk representation goes.  The effort needed to extract the datum
at this point is well repaid because the tuple will normally undergo
multiple comparisons while it remains in memory.  In some quick tests
the patch gave a significant speedup, on the order of 30%, even though
the extra per-tuple space leaves less memory for tuples and so
increases the number of runs emitted.

The choice to pull out just the leading column, rather than all columns,
is driven by concerns of (a) code complexity and (b) memory space.
Having the extra columns pre-extracted wouldn't buy anything anyway
in the common case where the leading key determines the result of
a comparison.
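
To make the idea concrete, here is a minimal sketch of the technique
described above.  It is not the code from the patch: Tuple, SortEntry,
get_attr, load_entry, compare_entries and sort_entries are all
hypothetical stand-ins, with get_attr playing the role of
heap_getattr/index_getattr and plain ints standing in for Datums.  The
leading key is extracted once when the tuple arrives, and the comparator
touches the tuple itself only when the leading keys tie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stored tuple with two integer sort columns. */
typedef struct Tuple
{
    int col[2];
} Tuple;

/* In-memory sort entry: the tuple plus its pre-extracted leading key. */
typedef struct SortEntry
{
    Tuple *tuple;
    int    key1;            /* leading sort column, extracted once */
} SortEntry;

/* Counts simulated extractions so the saving can be observed. */
long extract_calls = 0;

/* Stand-in for heap_getattr/index_getattr (assumed to be the costly step). */
int
get_attr(const Tuple *t, int attno)
{
    extract_calls++;
    return t->col[attno];
}

/* Called when a tuple is received or re-read: cache the leading key. */
void
load_entry(SortEntry *e, Tuple *t)
{
    assert(e != NULL && t != NULL);
    e->tuple = t;
    e->key1 = get_attr(t, 0);
}

/* Compare on the cached key; fall back to the tuple only on a tie. */
int
compare_entries(const void *a, const void *b)
{
    const SortEntry *ea = a;
    const SortEntry *eb = b;
    int a2, b2;

    if (ea->key1 != eb->key1)
        return (ea->key1 < eb->key1) ? -1 : 1;

    /* Leading keys are equal: extract the second column the slow way. */
    a2 = get_attr(ea->tuple, 1);
    b2 = get_attr(eb->tuple, 1);
    return (a2 > b2) - (a2 < b2);
}

/* Sort an array of entries in ascending (key1, second column) order. */
void
sort_entries(SortEntry *entries, size_t n)
{
    qsort(entries, n, sizeof(SortEntry), compare_entries);
}
```

To use it, call load_entry once per incoming tuple and then
sort_entries on the array.  In this sketch extract_calls grows by one
per tuple loaded, plus two for each comparison whose leading keys tie,
which is why many duplicates in the first column reduce the benefit.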

This is still WIP because it leaks memory intra-query (I need to fix it
to clean up palloc'd space better).  I thought I'd post it now in case
anyone wants to try some measurements for their own favorite test cases.
In particular it would be interesting to see what happens for a
multi-column sort with lots of duplicated keys in the first column,
which is the case where the least advantage would be gained.

Comments?

            regards, tom lane


