Re: More work on SortSupport for text - strcoll() and strxfrm() caching

From:           Robert Haas
Subject:        Re: More work on SortSupport for text - strcoll() and strxfrm() caching
Msg-id:         CA+Tgmoa=PmK=myKYvF_UgWq3hiWgm94jRKAxTGVy1chjBqBi1w@mail.gmail.com
In response to: More work on SortSupport for text - strcoll() and strxfrm() caching (Peter Geoghegan <pg@heroku.com>)
List:           pgsql-hackers
On Fri, Jul 3, 2015 at 8:33 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Since apparently we're back to development work, I thought it was time
> to share a patch implementing a few additional simple tricks to make
> sorting text under a non-C locale even faster than in 9.5. These
> techniques are mostly effective when values are physically clustered
> together. This might be because there is a physical/logical
> correlation, but cases involving any kind of clustering of values are
> helped significantly.

Interesting work.  Some comments:

1. My biggest gripe with this patch is that the comments are not easy
to understand.  For example:

+ * Attempt to re-use buffers across calls.  Also, avoid work in the event
+ * of clustered together identical items by exploiting temporal locality.
+ * This works well with divide-and-conquer, comparison-based sorts like
+ * quicksort and mergesort.
+ *
+ * With quicksort, there is, in general, a pretty strong chance that the
+ * same buffer contents can be used repeatedly for pivot items -- early
+ * pivot items will account for large number of total comparisons, since
+ * they must be compared against many (possibly all other) items.

Well, what I would have written is something like: "We're likely to be
asked to compare the same strings repeatedly, and memcmp() is so much
cheaper than memcpy() that it pays to attempt a memcmp() in the hopes
of avoiding a memcpy().  This doesn't seem to slow things down
measurably even if it doesn't work out very often."

+ * While it is worth going to the trouble of trying to re-use buffer
+ * contents across calls, ideally that will lead to entirely avoiding a
+ * strcoll() call by using a cached return value.
+ *
+ * This optimization can work well again and again for the same set of
+ * clustered together identical attributes; as they're relocated to new
+ * subpartitions, only one strcoll() is required for each pivot (in respect
+ * of that clump of identical values, at least).  Similarly, the final
+ * N-way merge of a mergesort can be effectively accelerated if each run
+ * has its own locally clustered values.

And here I would have written something like: "If we're comparing the
same two strings that we compared last time, we can return the same
answer without calling strcoll() again.  This is more likely than it
seems, because quicksort compares the same pivot against many values,
and some of those values might be duplicates."

Of course everybody may prefer something different here; I'm just
telling you what I think.

2. I believe the change to bttextcmp_abbrev() should be pulled out
into a separate patch and committed separately.  That part seems like
a slam dunk.

3. What is the worst case for the strxfrm()-reuse stuff?  I suppose
it's the case where we have many strings that are long, all
equal-length, and all different, but only in the last few characters.
Then the memcmp() is as expensive as possible but never works out.
How does the patch hold up in that case?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company