Thread: Todo item: Support amgettuple() in GIN
Hi, I decided to look into how much work implementing the todo item about supporting amgettuple in GIN would be, since exclusion constraints on GIN would be neat. Robert Haas suggested a solution[1], but to fix it we also need to look into why the commit message mentions that it did not work anyway with the partial matches. So I looked into that first, and here is my explanation for why it is broken for partial matches. I am sending this to the mailing list to check if I am correct and if so update the todo list with this new information. = Explanation When doing normal matching the code simply traverses the matching posting tree in TID order. When doing partial matching the code need to be able to return the union of all TIDs in all the matching posting trees in TID order (to be able to do AND and OR operations with multiple search keys later). It does this by traversing them posting tree after posting tree and collecting them all in a TIDBitmap which is later iterated over. This TIDBitmap becomes lossy if it too many TIDs are added to it, and this case is what broke amgettuple for partial matches. To fix this it seems to me that either lossy pages would need to be rescanned by gingettuple or a version of collectMatchBitmap needs to be written which merges the matched posting trees in another way, probably by iterating over all of them at the same time. Does this diagnosis sound correct? = Footnotes 1. http://www.postgresql.org/message-id/CA+TgmobZhfRJNyz-fyw5kDtRurK0HjWP0vtP5fGZLE6eVSWCQw@mail.gmail.com -- Andreas Karlsson
On 11/29/2013 01:13 AM, Andreas Karlsson wrote: > When doing partial matching the code need to be able to return the union > of all TIDs in all the matching posting trees in TID order (to be able > to do AND and OR operations with multiple search keys later). It does > this by traversing them posting tree after posting tree and collecting > them all in a TIDBitmap which is later iterated over. I think it's not a plain union. My understanding is that - to evaluate a single key (typically array) - you first need to get all the TID streams for that key (i.e. one posting list/tree per element of the key array) and then iterate all these streams in parallel and 'merge' them using consistent() function. That's how I understand ginget.c:keyGetItem(). So the problem of partial match is (IMO) that there can be too many TID streams to merge - much more than the number of elements of the key array. // Antonin Houska (Tony)
On 11/29/2013 09:54 AM, Antonin Houska wrote: > On 11/29/2013 01:13 AM, Andreas Karlsson wrote: > >> When doing partial matching the code need to be able to return the union >> of all TIDs in all the matching posting trees in TID order (to be able >> to do AND and OR operations with multiple search keys later). It does >> this by traversing them posting tree after posting tree and collecting >> them all in a TIDBitmap which is later iterated over. > > I think it's not a plain union. My understanding is that - to evaluate a > single key (typically array) - you first need to get all the TID streams > for that key (i.e. one posting list/tree per element of the key array) > and then iterate all these streams in parallel and 'merge' them using > consistent() function. That's how I understand ginget.c:keyGetItem(). For partial matches the merging is done in two steps: first a simple union of all the streams per key and then second merging those union streams using the consistent() function. It is the first step that can be lossy. > So the problem of partial match is (IMO) that there can be too many TID > streams to merge - much more than the number of elements of the key array. Agreed. -- Andreas Karlsson
On 11/29/2013 01:57 PM, Andreas Karlsson wrote: > On 11/29/2013 09:54 AM, Antonin Houska wrote: >> On 11/29/2013 01:13 AM, Andreas Karlsson wrote: >> >>> When doing partial matching the code need to be able to return the union >>> of all TIDs in all the matching posting trees in TID order (to be able >>> to do AND and OR operations with multiple search keys later). It does >>> this by traversing them posting tree after posting tree and collecting >>> them all in a TIDBitmap which is later iterated over. >> >> I think it's not a plain union. My understanding is that - to evaluate a >> single key (typically array) - you first need to get all the TID streams >> for that key (i.e. one posting list/tree per element of the key array) >> and then iterate all these streams in parallel and 'merge' them using >> consistent() function. That's how I understand ginget.c:keyGetItem(). > > For partial matches the merging is done in two steps: first a simple > union of all the streams per key and then second merging those union > streams using the consistent() function. Yes, short after I sent my previous mail I realized that your "union" probably referred to the things that collectMatchBitmap() does. // Tony
Andreas Karlsson <andreas@proxel.se> writes: > I decided to look into how much work implementing the todo item about > supporting amgettuple in GIN would be, since exclusion constraints on > GIN would be neat. Robert Haas suggested a solution[1], but to fix it we > also need to look into why the commit message mentions that it did not > work anyway with the partial matches. > ... > This TIDBitmap becomes lossy if it too many TIDs are added to it, and > this case is what broke amgettuple for partial matches. Right, see http://www.postgresql.org/message-id/49AC300F.1050903@enterprisedb.com Note that fixing the potential lossiness in scanning is not the only roadblock to re-enabling amgettuple. Fast updates also pose problems: http://www.postgresql.org/message-id/4974B002.3040202@sigaev.ru Half of that is basically the same lossiness problem, but the other half is that we're relying on the bitmap to suppress duplicate reports of the same TID. It's fairly hard to see how you'd avoid that without creating other problems. Note that Robert's proposed solution is no solution, because it just puts you right back in the bind of needing guaranteed non-lossy storage of a TID set that might be too big to fit in memory. regards, tom lane
On 11/29/2013 07:13 PM, Tom Lane wrote: > Andreas Karlsson <andreas@proxel.se> writes: >> I decided to look into how much work implementing the todo item about >> supporting amgettuple in GIN would be, since exclusion constraints on >> GIN would be neat. Robert Haas suggested a solution[1], but to fix it we >> also need to look into why the commit message mentions that it did not >> work anyway with the partial matches. >> ... >> This TIDBitmap becomes lossy if it too many TIDs are added to it, and >> this case is what broke amgettuple for partial matches. > > Right, see > https://urldefense.proofpoint.com/v1/url?u=http://www.postgresql.org/message-id/49AC300F.1050903%40enterprisedb.com&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=xGch4oNJbpD%2BKPJECmgw4SLBhytSZLBX7UnkZhtNcpw%3D%0A&m=OqhHlGFG81LH1EqJLzTW8HuXdXslGEL%2FPu1f27HxV%2Bs%3D%0A&s=9f3fad064e2845bd2b99c85f684d237fbe96e542081e4b2dc49b1fe51f91f144 > > Note that fixing the potential lossiness in scanning is not the only > roadblock to re-enabling amgettuple. Fast updates also pose problems: > https://urldefense.proofpoint.com/v1/url?u=http://www.postgresql.org/message-id/4974B002.3040202%40sigaev.ru&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=xGch4oNJbpD%2BKPJECmgw4SLBhytSZLBX7UnkZhtNcpw%3D%0A&m=OqhHlGFG81LH1EqJLzTW8HuXdXslGEL%2FPu1f27HxV%2Bs%3D%0A&s=0e08a781fcc17a3d68ce247344a3499a23a9f545b937f254439dadfaf7b9b8ab > > Half of that is basically the same lossiness problem, but the other > half is that we're relying on the bitmap to suppress duplicate reports > of the same TID. It's fairly hard to see how you'd avoid that without > creating other problems. > > Note that Robert's proposed solution is no solution, because it just > puts you right back in the bind of needing guaranteed non-lossy > storage of a TID set that might be too big to fit in memory. You can always call amgetbitmap, and return the tuples from the bitmap one by one. For a lossy result, re-check all tuples on the page. IOW, do a bitmap index + heap scan. You could do that within indexam.c, and present the familiar index_getnext() interface for callers. Or you could modify the exclusion constraint code to do that if amgettuple is not available - Heikki
On 11/29/2013 06:13 PM, Tom Lane wrote: > Note that Robert's proposed solution is no solution, because it just > puts you right back in the bind of needing guaranteed non-lossy > storage of a TID set that might be too big to fit in memory. The solution should work if we could guarantee that a TIDBitmap based on the fast update pending list always will fit in the memory. That does not sound like a good assumption to me. -- Andreas Karlsson