Re: SP-GiST bug. - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: SP-GiST bug. |
Date | |
Msg-id | 14826.1401316338@sss.pgh.pa.us Whole thread Raw |
In response to | SP-GiST bug. (Teodor Sigaev <teodor@sigaev.ru>) |
Responses |
Re: SP-GiST bug.
|
List | pgsql-hackers |
Teodor Sigaev <teodor@sigaev.ru> writes: > create table xxx ( t text); > insert into xxx select 'x' || v::text from generate_series(1, 291) as v; > insert into xxx values (''); > create index xxxidx on xxx using spgist (t); Fun! > And postgres will eat memory forever. It seems to me that checkAllTheSame > wrongly decides that all tuples are the same. All tuples except tuple to be > inserted have the same character 'x' and only new tuple has '\0'. I don't think that checkAllTheSame is broken, but it probably would be after this patch --- ISTM you're breaking the corner case described in the header comment for checkAllTheSame, where we need to force splitting of a set of existing tuples that the opclass can't distinguish. What actually is broken, I think, is the spgist text opclass, because it's using a zero node label for two different situations. The scenario we're looking at here is: 1. We call picksplit with all 292 of these entries. Not surprisingly, it puts the first 291 into a single node with label 'x' and the last one into another node with label '\0'. 2. Because this is too many to fit on one index page, checkAllTheSame decides it had better create an allTheSame tuple for the first 291 and then try again to insert the empty string into that tuple. While you could argue whether this is *necessary*, it is not *incorrect*, so ISTM the opclass had better cope with it. 3. We call spg_text_choose with the allTheSame tuple and the empty-string value to be inserted. It finds the empty-string value doesn't match (since all the nodes have label 'x') so it requests a split: out->resultType = spgSplitTuple; out->result.splitTuple.prefixHasPrefix = in->hasPrefix; out->result.splitTuple.prefixPrefixDatum= in->prefixDatum; out->result.splitTuple.nodeLabel = UInt8GetDatum('\0'); out->result.splitTuple.postfixHasPrefix = false; After splitting, we have a new upper tuple with a single node with label '\0', pointing to a new allTheSame tuple containing the original 291 entries. 4. We call spg_text_choose with the new upper tuple and the empty-string value to be inserted. It looks for a node labeled '\0', and finds it, so it reports that we should descend into that node --- ie, down to the new allTheSame tuple. 5. We call spg_text_choose with the new allTheSame tuple and the empty-string value to be inserted. This is exactly like the situation at step 3, so we're in an infinite loop. It doesn't look to me like the core code has done anything wrong here; it just did what the opclass requested. Rather, the problem is we're using '\0' node label both to represent a "no op" label descent and to represent "we're past the end of the string". I'm afraid there may not be any way to fix this without breaking on-disk compatibility for spgist text indexes :-(. It seems like we have to distinguish the case of zero-length string from that of a dummy label for a pushed-down allTheSame tuple. regards, tom lane
pgsql-hackers by date: