Re: BUG #13327: Error on insert (gist index) - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #13327: Error on insert (gist index)
Date
Msg-id 32487.1432234672@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #13327: Error on insert (gist index)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #13327: Error on insert (gist index)
List pgsql-bugs
I wrote:
> cpro29a@gmail.com writes:
>> create table test_gist(id serial primary key, a1 int[],a2 int[], a3 int[],
>> a4 int[], a5 int[]);
>> create index test_gist_i on test_gist using gist (a1,a2,a3,a4,a5);

>> --Simple random array generator
>> create or replace function rnd_arr(p_len int) returns int[]
>> as $$
>> select array_agg(val)
>> from (select trunc(random()*1000.0)::int val
>> from generate_series(1,trunc(random()*p_len*1.0)::int+1)
>> ) v;
>> $$ language sql;

>> --##THIS FAILS:
>> insert into test_gist(a1,a2,a3,a4,a5)
>> select rnd_arr(10),rnd_arr(10),rnd_arr(10),rnd_arr(10),rnd_arr(10)
>> from generate_series(1,100000);
>> --##WITH MESSAGE:
>> --ERROR:  failed to add item to index page in "test_gist_i"

> I can reproduce this on HEAD (after installing contrib/intarray).

On closer inspection, the answer is "you should be using gist__intbig_ops".
gist__intbig_ops uses a different representation that can deal with more
distinct array entries than the default gist_int_ops opclass can.

The reason for the failure is that the index entries have gotten up to
over 4K apiece, so that splitting does not help (or maybe that GIST
thinks it should be able to put two entries in a page and it can't).

It'd be nice if the error thrown was a little clearer, but I don't think
there is anything we can or should do to make this example actually succeed.

Note that gist_int_ops' g_int_compress function does actually contain a
check that warns you once you've got more than 200 distinct values in an
index item.  Unfortunately, this example is trying to cram five arrays
into each index tuple, so the total tuple size exceeds what GIST can cope
with before that check fires for any one array value.  It doesn't look to
me like g_int_compress has any way to know how many index columns there
are, else maybe it could adjust its threshold for complaining.

It might be reasonable for the core GIST code to emit some complaint about
index tuple size rather than the very opaque "failed to add item" message.
Not sure exactly what test is appropriate though.  Should we be limiting
GIST index entries to half a page?

            regards, tom lane

pgsql-bugs by date:

Previous
From: "D. Spindel"
Date:
Subject: Re: unrecognized option '--help
Next
From: Michael Paquier
Date:
Subject: Re: unrecognized option '--help