Re: B-Tree support function number 3 (strxfrm() optimization) - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: B-Tree support function number 3 (strxfrm() optimization) |
Date | |
Msg-id | CA+TgmobUgp7J0m=dRj92Y+eZKfzvvzNaFtBQcd9nP3Atzh0+iw@mail.gmail.com Whole thread Raw |
In response to | Re: B-Tree support function number 3 (strxfrm() optimization) (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: B-Tree support function number 3 (strxfrm() optimization)
Re: B-Tree support function number 3 (strxfrm() optimization) |
List | pgsql-hackers |
On Sun, Nov 9, 2014 at 10:02 PM, Peter Geoghegan <pg@heroku.com> wrote: > On Sat, Oct 11, 2014 at 6:34 PM, Peter Geoghegan <pg@heroku.com> wrote: >> Attached patch, when applied, accelerates all tuplesort cases using >> abbreviated keys, building on previous work here, as well as the patch >> posted to that other thread. > > I attach an updated patch set, rebased on top of the master branch's > tip. All relevant tuplesort cases (B-Tree, MinimalTuple, CLUSTER) are > now directly covered by the patch set, since there is now general > sortsupport support for those cases in the master branch -- no need to > apply some other patch from some other thread. > > For the convenience of reviewers, this new revision includes a new > approach to making my improvements cumulative: A second commit adds > tuple count estimation. This hint, passed along to the text opclass's > convert routine, is taken from the optimizer's own estimate, or the > relcache's reltuples, depending on the tuplesort case being > accelerated. As in previous revisions, the idea is to give the opclass > a sense of proportion about how far along it is, to be weighed in > deciding whether or not to abort abbreviation. One potentially > controversial aspect of that is how the text opclass abbreviation cost > model/abort early stuff weighs simply having many tuples - past a > certain point, it *always* proceeds with abbreviation, not matter what > the cardinality of abbreviated keys so far is. For that reason it > particular, it seemed to make sense to split these parts out into a > second commit. > > I hope that we can finish up all 9.5 work on accelerated sorting soon. There's a lot of stuff in this patch I'm still trying to digest, but here are a few thoughts on patch 0001: - This appears to needlessly reindent the comments for PG_CACHE_LINE_SIZE. - I really don't think we need a #define in pg_config_manual.h for this. Please omit that. - I'm much happier with the way the changes to sortsupport.h look in this version. However, I think that auth_comparator is a confusing name, because "auth" is often used as an abbreviation for "authentication". We can spell it out (authoritative_comparator) or come up with a different name (backup_comparator? abbrev_full_comparator?). Whatever we do, ApplySortComparatorAuth() should be renamed to match. - Also, I don't think making abbrev_state an enumerated value with two values is really doing anything for us; we could just use a Boolean. I'm wondering if we should actually go a bit further and remove this from the SortSupport object and instead add an additional Boolean flag to PrepareSortSupportFrom(OrderingOp|IndexRel) that gets passed all the way down to the opclass's sortsupport function. It seems like that might be more clear. Once the opclass function has done its thing, the other three new nembers are enough to know whether we're using the optimization or not (and can be fiddled if we want to make a later decision to call the whole thing off). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: