Crash in gist insertion on pathological box data - Mailing list pgsql-hackers

From Andrew Gierth
Subject Crash in gist insertion on pathological box data
Date
Msg-id 8763hwl56e.fsf@news-spur.riddles.org.uk
Responses Re: Crash in gist insertion on pathological box data  (Sergey Konoplev <gray.ru@gmail.com>)
Re: Crash in gist insertion on pathological box data  (Sergey Konoplev <gray.ru@gmail.com>)
Re: Crash in gist insertion on pathological box data  (Martijn van Oosterhout <kleptog@svana.org>)
Re: Crash in gist insertion on pathological box data  (Teodor Sigaev <teodor@sigaev.ru>)
List pgsql-hackers
A user on IRC reported a crash (backend segfault) in GiST insertion
(in 8.3.5 but I can reproduce this in today's HEAD) that turns out
to be due to misbehaviour of gist_box_picksplit.

The nature of the problem is this: if gist_box_picksplit doesn't find
a good disposition on the first try, then it tries to split the data
again based on the positions of the box centers. But there's a problem
here with floating-point rounding; it's possible for the average of N
floating-point values to be strictly greater (or less) than all of the
values individually, and the function then returns with, for example,
all the entries assigned to the left node, and nothing in the right
node. This causes gistSplit to try to split the left node again, with
predictable results.
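
The rounding effect described above can be reproduced in isolation. The following is a minimal sketch, not taken from the PostgreSQL source: the helper names naive_mean and mean_outside_range are hypothetical, and the behaviour assumes IEEE 754 double arithmetic with round-to-nearest.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: mean computed as a running sum followed by a
 * division. The volatile qualifiers force every intermediate result to
 * be rounded to double, so extended-precision registers cannot hide
 * the effect.
 */
double naive_mean(const double *v, size_t n)
{
    volatile double sum = 0.0;
    volatile double mean;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    mean = sum / (double) n;
    return mean;
}

/*
 * Returns -1 if the computed mean is strictly below every value,
 * +1 if it is strictly above every value, and 0 otherwise.
 */
int mean_outside_range(const double *v, size_t n)
{
    double mean = naive_mean(v, n);
    int below_all = 1;
    int above_all = 1;

    for (size_t i = 0; i < n; i++)
    {
        if (!(mean < v[i]))
            below_all = 0;
        if (!(mean > v[i]))
            above_all = 0;
    }
    if (below_all)
        return -1;
    if (above_all)
        return 1;
    return 0;
}
```

Under those assumptions, three copies of 0.1 give a mean strictly greater than 0.1, and ten copies of 0.1 give a mean strictly less than 0.1. Any split rule that compares each value against such a mean sends every entry to the same side.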

Here is a test case:

A file of floating-point values (999 lines) is available here:
http://www.rhodiumtoad.org.uk/junk/badfloats.txt

create table floats3(x float8, y float8);
\copy floats3 from 'badfloats.txt'
create table boxes1 (b box);
create index boxes1_idx on boxes1 using gist (b);
insert into boxes1 select box(point(x,x),point(y,y)) as b from floats3;
[crash]

I'm not sure what the best fix is. I would think that it would make
sense for gistUserPickSplit to error out if the user's split function
returned an empty left or right node, since that would seem to
guarantee this problem. Certainly gist_box_picksplit also needs some
sort of fix so that it splits sensibly in the presence of data of this
type.
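
One possible shape for such a safeguard is sketched below. This is only an illustration under stated assumptions, not the actual PostgreSQL code or a proposed patch: split_with_fallback, its parameters, and the choice of an even split by position as the fallback are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch: assign each center to the left side when it is
 * <= pivot, otherwise to the right. If either side ends up empty, fall
 * back to an even split by position so that both sides are non-empty.
 * Writes the assignment into goes_left[] and returns the left count.
 */
size_t split_with_fallback(const double *centers, size_t n,
                           double pivot, bool *goes_left)
{
    size_t nleft = 0;

    assert(n >= 2);     /* a split needs at least two entries */

    for (size_t i = 0; i < n; i++)
    {
        goes_left[i] = (centers[i] <= pivot);
        if (goes_left[i])
            nleft++;
    }

    /* Degenerate result: one side is empty, so split by position. */
    if (nleft == 0 || nleft == n)
    {
        nleft = n / 2;
        for (size_t i = 0; i < n; i++)
            goes_left[i] = (i < nleft);
    }
    return nleft;
}
```

A caller could equally treat the degenerate case as an error rather than repairing it; the sketch only shows that the empty-side condition is cheap to detect after the assignment loop.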

-- 
Andrew (irc:RhodiumToad)

