Re: SP-GiST bug. - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: SP-GiST bug. |
Date | |
Msg-id | 15869.1401393098@sss.pgh.pa.us Whole thread Raw |
In response to | Re: SP-GiST bug. (Teodor Sigaev <teodor@sigaev.ru>) |
Responses |
Re: SP-GiST bug.
|
List | pgsql-hackers |
Teodor Sigaev <teodor@sigaev.ru> writes: >> I don't think that checkAllTheSame is broken, but it probably would be >> after this patch --- ISTM you're breaking the corner case described in the >> header comment for checkAllTheSame, where we need to force splitting of a >> set of existing tuples that the opclass can't distinguish. > INHO, this patch will not break - because it forces (in corner case) to create a > node for new tuple but exclude it from list to insert. On next iteration we will > find a needed node with empty list. > Comment: > * If we know that the leaf tuples wouldn't all fit on one page, then we > * exclude the last tuple (which is the incoming new tuple that forced a split) > * from the check to see if more than one node is used. The reason for this > * is that if the existing tuples are put into only one chain, then even if > * we move them all to an empty page, there would still not be room for the > * new tuple, so we'd get into an infinite loop of picksplit attempts. > If we reserve a node for new tuple then on next iteration we will not fall into > picksplit again - we will have an empty list. So new tuple will fall into > another page. The point I'm making is that the scenario your test case exposes is not an infinite loop of picksplits, but an infinite loop of choose calls. There are certainly ways to get into that loop that don't involve this specific checkAllTheSame logic. Here's an example: regression=# create table xxx ( t text); CREATE TABLE regression=# create index xxxidx on xxx using spgist (t); CREATE INDEX regression=# insert into xxx select repeat('x',4000); INSERT 0 1 regression=# insert into xxx select repeat('x',4000); INSERT 0 1 regression=# insert into xxx select repeat('x',4000); INSERT 0 1 regression=# insert into xxx select repeat('x',3996); In this example, we get a picksplit at the third insert, and here we *must* take the allTheSame path because the values are, in fact, all the same (the SPGIST_MAX_PREFIX_LENGTH constraint means we can only put 3996 characters into the prefix, and then the 3997'th character of each value is 'x'). Then the fourth insert triggers this same infinite loop, because spg_text_choose is asked how to insert the slightly-shorter value into the allTheSame tuple, and it does the wrong thing. It's certainly possible that we could/should change what checkAllTheSame is doing --- on re-reading the comment, I'm not really sure that the scenario it's concerned about can happen. However, that will not fix the bug in spgtextproc.c; it will only block one avenue for reaching the bug. regards, tom lane
pgsql-hackers by date: