Re: More ideas for speeding up sorting - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: More ideas for speeding up sorting |
Date | |
Msg-id | CAM3SWZTXfOx48Ub7Ytp8Cb4q3KSy+oGXRfUtvtf5dko1qRTG_A@mail.gmail.com Whole thread Raw |
In response to | More ideas for speeding up sorting (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: More ideas for speeding up sorting
Re: More ideas for speeding up sorting |
List | pgsql-hackers |
On Fri, Sep 9, 2016 at 6:14 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > A few things caught my eye while hacking on the latest round of sorting > patches. This is how it begins. :-) > Starting a new thread because these are orthogonal to the things > discussed on the other threads: > > 1. SortTuple.tupindex is not used when the sort fits in memory. If we also > got rid of the isnull1 flag, we could shrink SortTuple from 24 bytes to 16 > bytes (on 64-bit systems). That would allow you to pack more tuples into > memory, which generally speeds things up. We could for example steal the > least-significant bit from the 'tuple' pointer for that, because the > pointers are always at least 4-byte aligned. (But see next point) I've wanted to do that for a long time. I'd rather make SortTuples 16 bytes and be done with it, rather than introduce variations, mostly because I now think it's possible without any real downside. Maybe your preread patch gets us to a place where the tupindex field is useless, because there is no chaining of SortTuples during merging, no free list of the "slot" SortTuples, and so on. Maybe we could kill replacement selection fully, so it no longer needs tupindex, leaving us with no need for it entirely (not just when everything still fits in memory). But even if we don't kill replacement selection entirely, we still only need one bit these days, since in practice there are only two distinct run numbers ever represented in tupindex these days (RUN_FIRST, and HEAP_RUN_NEXT). We can spare a second bit from the "tuple" pointer, I suppose. I know that in practice, even virtual address space is significantly less than 64-bits on x86_64, so I think that there should be several bits there for the taking, even if the addresses are not aligned (although, they are). However, I have no idea if there is any precedent for this general idea at all, and would feel better about it if there was. I guess we don't have to make it any more complicated than your argument about alignment. > 2. We could easily categorize incoming tuples as the come in, into those > that have a leading NULL, and others. If we kept them in separate arrays, or > perhaps grow memtuples from bottom-up for non-NULLs and from top-down for > NULLs, we wouldn't need to store, or compare, the 'isnull1' flag at all, in > the quicksort.' We also wouldn't need to use datum1 to store nothing, at least in the "state.tuples" (non-datum-pass-by-value) case. Where the leading attribute is NULL, we can "discard" it, and sort those tuples separately, and return them to caller first or last as required. > 3. In the single-Datum case, i.e. tuplesort_begin_datum(), we could just > keep a counter of how many NULLs we saw, and not store them at all. That's > quite a special case, but it's simple enough that it might be worth it. A lot hinges on the performance of this special case. For example, CREATE INDEX CONCURRENTLY is quite sensitive to it in practice, since the TID sort is (even in 9.6) the major reason for it being slower than plain CREATE INDEX (I think that the second pass over the heap is less important most of the time -- you've seen how sorting isn't I/O bound much of the time for yourself, so you can imagine). Even still, I'd be inclined to try to get SortTuples down to 16 bytes, at which point this special case hardly needs to be considered (we just skip the quicksort iff it's a single attribute sort, which includes MinimalTuple cases where there happens to only be one attribute that we sort on). -- Peter Geoghegan
pgsql-hackers by date: