Re: GIT patch - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: GIT patch |
Date | |
Msg-id | 46B18D53.5010700@enterprisedb.com Whole thread Raw |
In response to | Re: GIT patch (Alvaro Herrera <alvherre@commandprompt.com>) |
Responses |
Re: GIT patch
Re: GIT patch |
List | pgsql-hackers |
Alvaro Herrera wrote: > Hmm, do say, doesn't it seem like the lack of feedback and the failed > bitmap patch played against final development of this patch? Yes :(. That's a one reason why I tried to help with the review of that patch. > At this > point I feel like the patch still needs some work and reshuffling before > it is in an acceptable state. The fact that there are some API changes > for which the patch needs to be adjusted makes me feel like we should > put this patch on hold for 8.4. So we would first get the API changes > discussed and done and then adapt this patch to them. I hate to say it but I agree. Getting the API changes discussed and committed was my plan in February/March. Unfortunately it didn't happen back then. There's a few capabilities we need from the new API: 1. Support for candidate matches. Because a clustered index doesn't contain the key for every heap tuple, when you search for a value we don't know exactly which ones match. Instead, you get a bunch of candidate matches, which need to be rechecked after fetching the heap tuple. Oleg & Teodor pointed out that GiST could use the capability as well. I also proposed a while ago to change the hash index implementation so that it doesn't store the index key in the index, but just the hash of it. That again would need the support for candidate matches. And there's range-encoded bitmap indexes, if we implement them in a more distant future. 2. Support to sort the heap tuples represented by one index tuple, in normal index scans, if we go with alternative 1. Or support to do binary searches over them, if we go with alternative 2 or 3. As the patch stands, the sorting is done within b-tree, but that's quite ugly. > Of the three proposals you suggest, I think the first one > >> 1. A grouped index tuple contains a bitmap of offsetnumbers, >> representing a bunch of heap tuples stored on the same heap page, that >> all have a key between the key stored on the index tuple and the next >> index tuple. We don't keep track of the ordering of the heap tuples >> represented by one group index tuple. When doing a normal, non-bitmap, >> index scan, they need to be sorted. This is what the patch currently >> implements. > > makes the most sense -- the index is keep simple and fast, and doing the > sorting during an indexscan seems a perfectly acceptable compromise, > knowing that the amount of tuples possible returned for sort is limited > by the heap blocksize. The overhead is shown in the CPU test results, particularly in the select_range* tests, I put up on the git web site: http://community.enterprisedb.com/git/. The other alternatives might be simpler, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
pgsql-hackers by date: