Re: Tuning current tuplesort external sort code for 8.2 - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Tuning current tuplesort external sort code for 8.2 |
Date | |
Msg-id | 1128417400.8603.239.camel@localhost.localdomain Whole thread Raw |
In response to | Re: Tuning current tuplesort external sort code for 8.2 (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Tue, 2005-10-04 at 00:21 -0400, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > AFAICS the following opportunities exist, without changing any of the > > theoretical algorithms or the flexibility of definable datatypes: > > > 1. tuplesort_heap_siftup and tuplesort_heap_insert make no attempt to > > cache the values of keys that have been obtained from *_getattr macros. > > The two routines navigate a tournament sort heap, so that on average 50% > > of comparisons use at least one immediately preceeding tuple and key > > values from that could be cached ready for the next call. > > Hmm ... this seems interesting, but you also have to look at the > potential downside: what is the cost of doing the caching? Actually, nothing for the cache-1-tuple case. We just re-initialise the fcinfo based around whether we are caching 0,1 or 2 fields. We keep two fcinfo structures, so that when we cache=0 fields we do not pollute the cache for the cache=1 case. tuplesort_heap_insert allows us to cache one value each time round the inner loop, always using fcinfo1. (cache=1). tuplesort_heap_siftup has two comparisons per iteration; it allows us to cache 0 values on the first call, so we use fcinfo0. We can always cache one value for the second call, exactly same as _heap_insert case, so we use fcinfo1. Maybe we can also reuse the winner from the first call as the second input to the second call. ehem...It's easier to see when you look at the code. We need to work a bit harder to get the cache=2 case but its possible, though I take your point that it may not be worth it. I'll do the cache=1 case before attempting the cache=2 case, so we can measure the gain. We run _heap_insert and _heap_siftup once for each incoming tuple, so the saving should be good. > > All of the remaining ideas relate to NULL handling. > > I can't get excited about this. Most sort scenarios involve few if any > nulls. The idea would be much less important anyway if you follow up on this next idea: > One thought that comes to mind is that the current system structure > encourages sorting on keys that are at the end of their tuples. > For instance, "SELECT foo, bar FROM ... ORDER BY baz" will sort by > the third column of the generated tuples, which is certainly the least > efficient to access. Not sure what goes on in there. Are you saying heap sort cols are always at the end, or only in the cases you mention? What about where we're doing a GROUP BY, so all the sort columns are always part of the select list and frequently (by coding convention) at the front of the row? If the cols aren't always in resjunk we can come up with some performance test cases to check whether or not this is a problem, how big and where it starts to show itself. > It'd be interesting to look into generating the > working tuples with baz as the first column. I fear this is nontrivial, > because there are a lot of places that expect resjunk columns to come > last, but we should study it. It could be worth it to arrange the main select's attr list so that the sort keys always come first. I'd settle just for that if the second part is too much work. It sounds like hard work to optimise the case where the ORDER BY is on columns not mentioned in the select. The latter seems worth it for when we do a sort-merge based on join columns not mentioned in the SELECT, but not for optimizing sort-aggregate or one-table-select cases. > (Note: this will do nada for Josh's > original complaint about index build time, since btree index sorts will > by definition use all the tuple's columns, but it seems like a > significant issue for ordinary sorts.) Understood, but we do care about heap sorts too! Heap and index sorts have many dissimilarities, but both are important, so perhaps we should tune for each case separately. Best Regards, Simon Riggs
pgsql-hackers by date: