Thread: More work on SortSupport for text - strcoll() and strxfrm() caching
Since apparently we're back to development work, I thought it was time to share a patch implementing a few additional simple tricks to make sorting text under a non-C locale even faster than in 9.5. These techniques are mostly effective when values are physically clustered together. This might be because there is a physical/logical correlation, but cases involving any kind of clustering of values are helped significantly. Caching ====== The basic idea is that we cache strxfrm() blobs. Separately, we exploit temporal locality and clustering of values by caching the result of the most recent strcoll()-resolved comparison performed. The strxfrm() technique helps a lot with low cardinality single attribute sorts if we can avoid most strxfrm() work. On the other hand, strcoll() comparison caching particularly helps with multi-attribute sorts where there is a low to moderate cardinality secondary attribute and low cardinality leading attribute. The master branch will still opportunistically take the equality memcmp() fastpath plenty of times for that second attribute, but there are no abbreviated keys to help when that doesn't work out (because it isn't the leading attribute). Regressions ========== The patch still helps with strcoll() avoidance when the ordering of a moderate cardinality attribute is totally random, but it helps much less there. I have not seen a regression for any case. I'm expecting someone to ask me to do something with the program I wrote last year, to prove the opportunistic memcmp() equality fastpath for text is "free" [1]. This patch has exactly the same tension as last year's memcmp() equality one [2]: I add something opportunistic, that in general might consistently not work out at all in some cases, and on the face of it implies extra costs for those cases -- costs which must be paid every single time. So as with the opportunistic memcmp() equality thing, the *actual* overhead for cases that do not benefit must be virtually zero for the patch to be worthwhile. That is the standard that I expect that this patch will be held to, too. Benchmark ========= The query that I've been trying this out with is a typical rollup query, using my "cities" sample data [3] (this is somewhat, although not perfectly correlated on (country, province) before sorting): postgres=# select country, province, count(*) from cities group by rollup (country, province); country | province | count -------------------------------------+---------------------------------------+-------- Afghanistan | Badaẖšan | 5 Afghanistan | Bādgīs | 2 Afghanistan | Baġlān | 5 Afghanistan | Balẖ | 6 Afghanistan | Bāmiyān | 3 Afghanistan | Farāh | 3 Afghanistan | Fāryāb | 4 Afghanistan | Ġawr | 3 *** SNIP ***** ... Zimbabwe | Manicaland | 22 Zimbabwe | Mashonaland Central | 13 Zimbabwe | Mashonaland East | 9 Zimbabwe | Mashonaland West | 21 Zimbabwe | Masvingo | 11 Zimbabwe | Matabeleland North | 8 Zimbabwe | Matabeleland South | 14 Zimbabwe | Midlands | 14 Zimbabwe | [null] | 116 [null] | [null] | 317102 (3529 rows) With master, this takes about 525ms when it stabilizes after a few runs on my laptop. With the patch, it takes about 405ms. That's almost a 25% reduction in total run time. If I perform a more direct test of sort performance against this data with minimal non-sorting overhead, I see a reduction of as much as 30% in total query runtime (I chose this rollup query because it is obviously representative of the real world). If this data is *perfectly* correlated (e.g. because someone ran CLUSTER) and some sort can use the dubious "bubble sort best case" path [4] that we added to qsort back in 2006, the improvement still hold up at ~20%, I've found. Performance of the "C" locale --------------------------------------- For this particular rollup query, my 25% improvement leaves the collated text sort perhaps marginally faster than an equivalent query that uses the "C" locale (with or without the patch applied). It's hard to be sure that that effect is real -- many trials are needed -- but it's reasonable to speculate that it's possible to sometimes beat the "C" locale because of factors like final abbreviated key cardinality. It's easy to *contrive* a case where the "C" locale is beaten even with 9.5 -- just sort a bunch of strings (that are abbreviated), that look something like this: "``..,,``..AAA" "``..,,``..CCC" "``..,,``..ZZZ" "``..,,``..BBB" Anyway, this avoidance of strxfrm() work I've introduced makes it possible that abbreviated keys could make a strxfrm() locale-based sort beat the "C" locale fair-and-square with a realistic dataset and specific realistic query. That would be pretty nice, because that can't be too far from optimal, and these cases are not uncommon. A further idea -- unsigned integer comparisons =================================== I've also changed text abbreviated keys to compare as unsigned integers. On my Thinkpad laptop (which, of course, has an Intel CPU), this makes a noticeable difference. memcmp() may be fast, but an unsigned integer comparison is even faster (not sure if a big-endian machine can have the existing memcmp() call optimized away, so that effectively the same thing happens automatically). Maybe other platforms benefit less, but it's very hard to imagine it ever costing us anything. [1] http://www.postgresql.org/message-id/5415A843.3070602@vmware.com [2] Commit e246b3d6 [3] http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/data/cities.dump [4] Commit a3f0b3d6 -- Peter Geoghegan
Attachment
On Fri, Jul 3, 2015 at 8:33 PM, Peter Geoghegan <pg@heroku.com> wrote: > Since apparently we're back to development work, I thought it was time > to share a patch implementing a few additional simple tricks to make > sorting text under a non-C locale even faster than in 9.5. These > techniques are mostly effective when values are physically clustered > together. This might be because there is a physical/logical > correlation, but cases involving any kind of clustering of values are > helped significantly. Interesting work. Some comments: 1. My biggest gripe with this patch is that the comments are not easy to understand. For example: + * Attempt to re-use buffers across calls. Also, avoid work in the event + * of clustered together identical items by exploiting temporal locality. + * This works well with divide-and-conquer, comparison-based sorts like + * quicksort and mergesort. + * + * With quicksort, there is, in general, a pretty strong chance that the + * same buffer contents can be used repeatedly for pivot items -- early + * pivot items will account for large number of total comparisons, since + * they must be compared against many (possibly all other) items. Well, what I would have written is something like: "We're likely to be asked to compare the same strings repeatedly, and memcmp() is so much cheaper than memcpy() that it pays to attempt a memcmp() in the hopes of avoiding a memcpy(). This doesn't seem to slow things down measurably even if it doesn't work out very often." + * While it is worth going to the trouble of trying to re-use buffer + * contents across calls, ideally that will lead to entirely avoiding a + * strcoll() call by using a cached return value. + * + * This optimization can work well again and again for the same set of + * clustered together identical attributes; as they're relocated to new + * subpartitions, only one strcoll() is required for each pivot (in respect + * of that clump of identical values, at least). Similarly, the final + * N-way merge of a mergesort can be effectively accelerated if each run + * has its own locally clustered values. And here I would have written something like: "If we're comparing the same two strings that we compared last time, we can return the same answer without calling strcoll() again. This is more likely than it seems, because quicksort compares the same pivot against many values, and some of those values might be duplicates." Of course everybody may prefer something different here; I'm just telling you what I think. 2. I believe the change to bttextcmp_abbrev() should be pulled out into a separate patch and committed separately. That part seems like a slam dunk. 3. What is the worst case for the strxfrm()-reuse stuff? I suppose it's the case where we have many strings that are long, all equal-length, and all different, but only in the last few characters. Then the memcmp() is as expensive as possible but never works out. How does the patch hold up in that case? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Aug 4, 2015 at 12:41 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Interesting work. Thanks. > 1. My biggest gripe with this patch is that the comments are not easy > to understand. > Of course everybody may prefer something different here; I'm just > telling you what I think. I have struggled with trying to put just the right amount of exposition on the theory behind a particular optimization in source code comments, and things like that. Since no one is telling me that I need to write more, clearly I don't have the balance right yet. To a certain extent, it is a matter of personal style, but I'll try and be more terse. > 2. I believe the change to bttextcmp_abbrev() should be pulled out > into a separate patch and committed separately. That part seems like > a slam dunk. Makes sense. > 3. What is the worst case for the strxfrm()-reuse stuff? I suppose > it's the case where we have many strings that are long, all > equal-length, and all different, but only in the last few characters. > Then the memcmp() is as expensive as possible but never works out. > How does the patch hold up in that case? I haven't tested it. I'll get around to it at some point in the next couple of weeks. I imagine that it's exactly the same as the memcmp() equality thing because of factors like speculative execution, and the fact that we need both strings in cache anyway. It's almost exactly the same story, although unlike the memcmp() opportunistic equality pre-check thing, this check happens only n times, not n log n times. I'm quite sure that the cost needs to be virtually zero to go ahead with the idea. I think it probably is. Note that like the memcmp() thing, we check string length first, before a memcmp(). -- Peter Geoghegan
On Tue, Aug 4, 2015 at 1:30 PM, Peter Geoghegan <pg@heroku.com> wrote: >> 2. I believe the change to bttextcmp_abbrev() should be pulled out >> into a separate patch and committed separately. That part seems like >> a slam dunk. > > Makes sense. BTW, I want to put the string_uint() macro in a common header now. It can be used for other types. I've written a SortSupport + abbreviated keys patch for the UUID type which will use it, too, so that it too can use simple unsigned integer comparisons within its abbreviated comparator. I haven't posted the UUID patch yet only because everyone is up to their ears in my sorting patches. -- Peter Geoghegan
On Tue, Aug 4, 2015 at 12:41 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Some comments: I attach a new version of the patch series that incorporates all your feedback. The patch series is now made cumulative in a way that makes it easy for someone to independently commit the unsigned integer comparison optimization for text, and nothing else. The macro that uses is in a dedicated header now, because I have another patch (SortSupport for the UUID type) that uses the same optimization for the same reason. It seems like something that will probably end up with a third or forth client before too long, so I think the byte swap macro wrapper belongs in sortsupport.h. BTW, I think that in practice the merge phase of a tape sort isn't much helped by comparison caching, contrary to comments appearing in the original version. The heap data structure used by polyphase merge has bad properties around locality (both temporal and spatial). I'm thinking about independently addressing that problem. I now make no claims about it in this patch. -- Peter Geoghegan
Attachment
On Sun, Oct 4, 2015 at 2:17 AM, Peter Geoghegan <pg@heroku.com> wrote: > On Tue, Aug 4, 2015 at 12:41 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> Some comments: > > I attach a new version of the patch series that incorporates all your > feedback. The patch series is now made cumulative in a way that makes > it easy for someone to independently commit the unsigned integer > comparison optimization for text, and nothing else. The macro that > uses is in a dedicated header now, because I have another patch > (SortSupport for the UUID type) that uses the same optimization for > the same reason. It seems like something that will probably end up > with a third or forth client before too long, so I think the byte swap > macro wrapper belongs in sortsupport.h. Reviewing 0001, I'm happy to see us add bswap64, but I'm not sure we should put it in c.h, because that's included by absolutely everything. How about putting it in a new #include inside src/port, like src/port/pg_bswap.h? Then pg_crc.h can include that, but other things can, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Oct 6, 2015 at 1:07 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Reviewing 0001, I'm happy to see us add bswap64, but I'm not sure we > should put it in c.h, because that's included by absolutely > everything. How about putting it in a new #include inside src/port, > like src/port/pg_bswap.h? Then pg_crc.h can include that, but other > things can, too. I guess I imagined that bswap64() was fundamental infrastructure, but on second thought that's not actually in evidence -- it is not already needed in plenty of other places. So yeah, works for me. -- Peter Geoghegan
On Tue, Oct 6, 2015 at 4:09 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Tue, Oct 6, 2015 at 1:07 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> Reviewing 0001, I'm happy to see us add bswap64, but I'm not sure we >> should put it in c.h, because that's included by absolutely >> everything. How about putting it in a new #include inside src/port, >> like src/port/pg_bswap.h? Then pg_crc.h can include that, but other >> things can, too. > > I guess I imagined that bswap64() was fundamental infrastructure, but > on second thought that's not actually in evidence -- it is not already > needed in plenty of other places. So yeah, works for me. If you would care to revise the patch accordingly, I will commit it (barring objections from others, of course). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Oct 6, 2015 at 1:16 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> I guess I imagined that bswap64() was fundamental infrastructure, but >> on second thought that's not actually in evidence -- it is not already >> needed in plenty of other places. So yeah, works for me. > > If you would care to revise the patch accordingly, I will commit it > (barring objections from others, of course). Sure. It might take me a couple of days to get around to it, though -- things are a bit hectic here. Thanks -- Peter Geoghegan
On Tue, Oct 6, 2015 at 4:26 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Tue, Oct 6, 2015 at 1:16 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> I guess I imagined that bswap64() was fundamental infrastructure, but >>> on second thought that's not actually in evidence -- it is not already >>> needed in plenty of other places. So yeah, works for me. >> >> If you would care to revise the patch accordingly, I will commit it >> (barring objections from others, of course). > > Sure. It might take me a couple of days to get around to it, though -- > things are a bit hectic here. I know the feeling. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Oct 6, 2015 at 1:16 PM, Robert Haas <robertmhaas@gmail.com> wrote: > If you would care to revise the patch accordingly, I will commit it > (barring objections from others, of course). Here is a revision of 0001-*, with both BSWAP32() and BSWAP64() in a new header, src/port/pg_bswap.h. No revisions were required to any other patch in the patch series to make this work, and so I only include a revised 0001-*. -- Peter Geoghegan
Attachment
On Wed, Oct 7, 2015 at 8:09 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Tue, Oct 6, 2015 at 1:16 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> If you would care to revise the patch accordingly, I will commit it >> (barring objections from others, of course). > > Here is a revision of 0001-*, with both BSWAP32() and BSWAP64() in a > new header, src/port/pg_bswap.h. > > No revisions were required to any other patch in the patch series to > make this work, and so I only include a revised 0001-*. Great. I've committed that, minus the sortsupport.h changes which I think should be part of 0002, and which in any case I'd like to discuss a bit more. It seems to me that (1) ABBREV_STRING_UINT isn't a great name for this and (2) the comment is awfully long for the thing to which it refers. I suggest that we instead call it DatumToBigEndian(), put it pg_bswap.h, and change the comments to something like this: /** Rearrange the bytes of a Datum into big-endian order.** One possible application of this macro is to make comparisons cheaper. An integer* comparison of the new Datums will return the same result as a memcmp() on the* original Datums, butthe integer comparison should be much cheaper.*/ The specific way that this is used by various sortsupport routines can be adequately explained in the comments for those routines. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 8, 2015 at 10:13 AM, Robert Haas <robertmhaas@gmail.com> wrote: > It seems to me that (1) ABBREV_STRING_UINT isn't > a great name for this and (2) the comment is awfully long for the > thing to which it refers. I suggest that we instead call it > DatumToBigEndian(), put it pg_bswap.h, and change the comments to > something like this: > > /* > * Rearrange the bytes of a Datum into big-endian order. > * > * One possible application of this macro is to make comparisons > cheaper. An integer > * comparison of the new Datums will return the same result as a memcmp() on the > * original Datums, but the integer comparison should be much cheaper. > */ > > The specific way that this is used by various sortsupport routines can > be adequately explained in the comments for those routines. This is pretty clearly something specific to SortSupport. I'm not opposed to changing the name and making the comments more terse along those lines, but I think it should live in sortsupport.h. The macro byteswaps datums on little-endian platforms only, which seems very specific. I think that we're going to have SortSupport with abbreviation for UUIDs and bytea at some point, and maybe character(n). Centralizing information about this to sortsupport.h makes sense to me. -- Peter Geoghegan
On Thu, Oct 8, 2015 at 2:07 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Thu, Oct 8, 2015 at 10:13 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> It seems to me that (1) ABBREV_STRING_UINT isn't >> a great name for this and (2) the comment is awfully long for the >> thing to which it refers. I suggest that we instead call it >> DatumToBigEndian(), put it pg_bswap.h, and change the comments to >> something like this: >> >> /* >> * Rearrange the bytes of a Datum into big-endian order. >> * >> * One possible application of this macro is to make comparisons >> cheaper. An integer >> * comparison of the new Datums will return the same result as a memcmp() on the >> * original Datums, but the integer comparison should be much cheaper. >> */ >> >> The specific way that this is used by various sortsupport routines can >> be adequately explained in the comments for those routines. > > This is pretty clearly something specific to SortSupport. I'm not > opposed to changing the name and making the comments more terse along > those lines, but I think it should live in sortsupport.h. The macro > byteswaps datums on little-endian platforms only, which seems very > specific. > > I think that we're going to have SortSupport with abbreviation for > UUIDs and bytea at some point, and maybe character(n). Centralizing > information about this to sortsupport.h makes sense to me. I'm not convinced. Doesn't this exact same concept get used for over-the-wire communication between BE and LE machines? There, this operation is spelled htonl/ntohl. Some systems even have htonll, but I'm sure there are still a bunch that don't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 8, 2015 at 11:37 AM, Robert Haas <robertmhaas@gmail.com> wrote: > I'm not convinced. Doesn't this exact same concept get used for > over-the-wire communication between BE and LE machines? There, this > operation is spelled htonl/ntohl. Some systems even have htonll, but > I'm sure there are still a bunch that don't. I continue to disagree with that. The spelling of the macro that you propose suggests that this process occurs at a relatively high level of abstraction, which is misleading. Datums that have abbreviated key bytes packed into them are in general kind of special. All the same, here is a revision of the patch series along those lines. I'll also have to update the UUID patch to independently note the same issues. I should point out that I did not call the macro DatumToBigEndian(), because it's actually the other way around. I called it DatumToLittleEndian(), since the unsigned integer comparator would work correctly on big-endian systems without calling any new macro (which is of course why the new macro does nothing on big-endian systems). We start off with a big endian Datum/unsigned integer on all platforms, and then we byteswap it to make it a little-endian unsigned integer if and when that's required (i.e. only on little-endian systems). -- Peter Geoghegan
Attachment
On Thu, Oct 8, 2015 at 8:20 PM, Peter Geoghegan <pg@heroku.com> wrote: > I should point out that I did not call the macro DatumToBigEndian(), > because it's actually the other way around. I called it > DatumToLittleEndian(), since the unsigned integer comparator would > work correctly on big-endian systems without calling any new macro > (which is of course why the new macro does nothing on big-endian > systems). We start off with a big endian Datum/unsigned integer on all > platforms, and then we byteswap it to make it a little-endian unsigned > integer if and when that's required (i.e. only on little-endian > systems). Hmm. But then this doesn't seem to make much sense: + * Rearrange the bytes of a Datum into little-endian order from big-endian + * order. On big-endian machines, this does nothing at all. Rearranging bytes into little-endian order ought to be a no-op on a little-endian machine; and rearranging them into big-endian order ought to be a no-op on a big-endian machine. Thinking about this a bit more, it seems like the situation we're in here is that the input datum is always going to be big-endian. Regardless of what the machine's integer format is, the sortsupport abbreviator is going to output a Datum where the most significant byte is the first one stored in the datum. We want to convert that Datum to one that has *native* endianness. So maybe we should call this DatumBigEndianToNative or something like that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Oct 9, 2015 at 11:44 AM, Robert Haas <robertmhaas@gmail.com> wrote: > Hmm. But then this doesn't seem to make much sense: > > + * Rearrange the bytes of a Datum into little-endian order from big-endian > + * order. On big-endian machines, this does nothing at all. > > Rearranging bytes into little-endian order ought to be a no-op on a > little-endian machine; and rearranging them into big-endian order > ought to be a no-op on a big-endian machine. I think that that's very clearly implied anyway. > Thinking about this a bit more, it seems like the situation we're in > here is that the input datum is always going to be big-endian. > Regardless of what the machine's integer format is, the sortsupport > abbreviator is going to output a Datum where the most significant byte > is the first one stored in the datum. We want to convert that Datum > to one that has *native* endianness. So maybe we should call this > DatumBigEndianToNative or something like that. I'd be fine with DatumBigEndianToNative() -- I agree that that's slightly better. -- Peter Geoghegan
On Fri, Oct 9, 2015 at 2:48 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Fri, Oct 9, 2015 at 11:44 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> Hmm. But then this doesn't seem to make much sense: >> >> + * Rearrange the bytes of a Datum into little-endian order from big-endian >> + * order. On big-endian machines, this does nothing at all. >> >> Rearranging bytes into little-endian order ought to be a no-op on a >> little-endian machine; and rearranging them into big-endian order >> ought to be a no-op on a big-endian machine. > > I think that that's very clearly implied anyway. > >> Thinking about this a bit more, it seems like the situation we're in >> here is that the input datum is always going to be big-endian. >> Regardless of what the machine's integer format is, the sortsupport >> abbreviator is going to output a Datum where the most significant byte >> is the first one stored in the datum. We want to convert that Datum >> to one that has *native* endianness. So maybe we should call this >> DatumBigEndianToNative or something like that. > > I'd be fine with DatumBigEndianToNative() -- I agree that that's > slightly better. OK, committed that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Oct 9, 2015 at 12:11 PM, Robert Haas <robertmhaas@gmail.com> wrote: > OK, committed that way. Thank you. -- Peter Geoghegan
On Fri, Oct 9, 2015 at 12:11 PM, Robert Haas <robertmhaas@gmail.com> wrote: > OK, committed that way. Just for the record, with the same "cities" table as my original post to this thread, this query: select count(distinct(city)) from cities; Goes from taking about 296ms (once it stabilizes), to about 265ms (once it stabilizes) following today's commit of just the unsigned integer comparison patch. I've shaved just over 10% off the duration of this representative sort-heavy query (against a 9.5 baseline), which is nice. -- Peter Geoghegan
On Fri, Oct 9, 2015 at 3:33 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Fri, Oct 9, 2015 at 12:11 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> OK, committed that way. > > Thank you. You're welcome. After some study and experimentation, I've committed the other part also. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Oct 9, 2015 at 4:39 PM, Robert Haas <robertmhaas@gmail.com> wrote: > You're welcome. After some study and experimentation, I've committed > the other part also. Fantastic. I guess the precedent of the 9.5 text equality fast path made this discussion way smoother than last time, since essentially the same principle applies. -- Peter Geoghegan
On Fri, Oct 9, 2015 at 7:49 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Fri, Oct 9, 2015 at 4:39 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> You're welcome. After some study and experimentation, I've committed >> the other part also. > > Fantastic. I guess the precedent of the 9.5 text equality fast path > made this discussion way smoother than last time, since essentially > the same principle applies. I think that is true. I spent some time thinking about whether the way you used INT_MIN as a sentinel value should be changed around somehow, but ultimately I decided that it wasn't too bad and that suggesting something else would be pointless kibitzing. I also tried to think of scenarios in which this would lose, and I'm not totally convinced that there aren't any, but I'm convinced that, if they exist, I don't know what they are. Since the patch did deliver a small improvement on my test cases and on yours, I think we might as well have it in the tree. If some pathological scenario shows up where it turns out to hurt, we can always fix it then, or revert if it need be. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Oct 9, 2015 at 5:54 PM, Robert Haas <robertmhaas@gmail.com> wrote: > I think that is true. I spent some time thinking about whether the > way you used INT_MIN as a sentinel value should be changed around > somehow, but ultimately I decided that it wasn't too bad and that > suggesting something else would be pointless kibitzing. I also tried > to think of scenarios in which this would lose, and I'm not totally > convinced that there aren't any, but I'm convinced that, if they > exist, I don't know what they are. Since the patch did deliver a > small improvement on my test cases and on yours, I think we might as > well have it in the tree. If some pathological scenario shows up > where it turns out to hurt, we can always fix it then, or revert if it > need be. That seems very reasonable. I noticed that there is still one comment that I really should have removed as part of this work. The comment didn't actually add any new information for 9.5, but is now obsolete. Attached patch removes it entirely. -- Peter Geoghegan
Attachment
On Sat, Oct 10, 2015 at 12:32 PM, Peter Geoghegan <pg@heroku.com> wrote: > I noticed that there is still one comment that I really should have > removed as part of this work. I also noticed that I failed to reset the last_returned strcoll() cache variable as part of an abbreviation call, despite the fact that tapesort may freely interleave conversions with comparisons, while reusing buf1 and buf2 both as scratch space for strxfrm() blobs, as well as for storing strings to be compared with strcoll(). I suggest that the attached patch also be applied to fix this issue. -- Peter Geoghegan
Attachment
On Mon, Oct 12, 2015 at 12:47 AM, Peter Geoghegan <pg@heroku.com> wrote: > I also noticed that I failed to reset the last_returned strcoll() > cache variable as part of an abbreviation call, despite the fact that > tapesort may freely interleave conversions with comparisons, while > reusing buf1 and buf2 both as scratch space for strxfrm() blobs, as > well as for storing strings to be compared with strcoll(). I suggest > that the attached patch also be applied to fix this issue. I think that I jumped the gun with this fix, because theoretically you can still get the same problem in the opposite direction -- an original string treated as a strxfrm() blob when the cache is consulted. I'll consider a more comprehensive fix. -- Peter Geoghegan
On Mon, Oct 12, 2015 at 3:31 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Oct 12, 2015 at 12:47 AM, Peter Geoghegan <pg@heroku.com> wrote: >> I also noticed that I failed to reset the last_returned strcoll() >> cache variable as part of an abbreviation call, despite the fact that >> tapesort may freely interleave conversions with comparisons, while >> reusing buf1 and buf2 both as scratch space for strxfrm() blobs, as >> well as for storing strings to be compared with strcoll(). I suggest >> that the attached patch also be applied to fix this issue. > > I think that I jumped the gun with this fix, because theoretically you > can still get the same problem in the opposite direction -- an > original string treated as a strxfrm() blob when the cache is > consulted. > > I'll consider a more comprehensive fix. This is not the first time I've committed one of your patches and promptly received a whole series of emails with fixes for what went into that commit, despite the fact that I have not and did not change the relevant parts of the patch while committing. Since that makes more work for me, I am not a huge fan, and the practical effect is to subtract from the amount of time that could otherwise have been spent reviewing your next patch (or someone else's). In this case, I think the best thing for me to do right now is wait to commit anything further until you have had a chance to go over this and come up with a fix or set of fixes that you think are completely, 100% ready to go; or else until you get to the point of wanting feedback before proceeding further. In general, I would appreciate if this sort of post-commit self-review could be done pre-commit. I apologize for the fact that this email probably sounds grouchy. Please try to read it in the most positive light possible (and don't shoot me). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Oct 12, 2015 at 4:15 PM, Robert Haas <robertmhaas@gmail.com> wrote: > In this case, I think > the best thing for me to do right now is wait to commit anything > further until you have had a chance to go over this and come up with a > fix or set of fixes that you think are completely, 100% ready to go; > or else until you get to the point of wanting feedback before > proceeding further. In general, I would appreciate if this sort of > post-commit self-review could be done pre-commit. This came to me while I was making dinner last night - I was not actually poring over the code on my computer. I don't know why I thought of it then and not at any earlier point, but it seems reasonable to suppose it had something to do with perspective, or being able to see a bigger picture that is difficult to keep in mind during initial development and self review. I was mostly thinking about external sorting at the time, and not this patch. I cannot do that on demand. It isn't a matter of effort. When I come back from vacation at the end of the month, I may be able to do somewhat better for a few months. -- Peter Geoghegan
On Mon, Oct 12, 2015 at 12:31 PM, Peter Geoghegan <pg@heroku.com> wrote: > I'll consider a more comprehensive fix. I attach a revised fix that considers the problem of misinterpreting the contents of the buffers in both directions. Thanks -- Peter Geoghegan
Attachment
On Wed, Oct 14, 2015 at 7:05 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Mon, Oct 12, 2015 at 12:31 PM, Peter Geoghegan <pg@heroku.com> wrote: >> I'll consider a more comprehensive fix. > > I attach a revised fix that considers the problem of misinterpreting > the contents of the buffers in both directions. I wonder if it wouldn't be better to just add a separate Boolean indicating exactly the thing we care about. This doesn't seem particularly easy to understand and verify. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Oct 14, 2015 at 4:09 PM, Robert Haas <robertmhaas@gmail.com> wrote: > I wonder if it wouldn't be better to just add a separate Boolean > indicating exactly the thing we care about. This doesn't seem > particularly easy to understand and verify. I'm not really sure that that's an improvement. But I defer to you. -- Peter Geoghegan
On Wed, Oct 14, 2015 at 7:11 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Wed, Oct 14, 2015 at 4:09 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> I wonder if it wouldn't be better to just add a separate Boolean >> indicating exactly the thing we care about. This doesn't seem >> particularly easy to understand and verify. > > I'm not really sure that that's an improvement. But I defer to you. Would you be willing to try coding it up that way so we can see how it looks? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Oct 15, 2015 at 9:07 AM, Robert Haas <robertmhaas@gmail.com> wrote: > Would you be willing to try coding it up that way so we can see how it looks? I attach a patch that does it that way. Note that I will be away for until late this month. Do not expect a response from me before then, unless it's truly urgent. Thanks -- Peter Geoghegan
Attachment
On Sun, Oct 18, 2015 at 2:52 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Thu, Oct 15, 2015 at 9:07 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> Would you be willing to try coding it up that way so we can see how it looks? > > I attach a patch that does it that way. That looks good to me. I've committed this, and the other patch to remove the obsolete comment. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company