Thread: BUG #13327: Error on insert (gist index)

BUG #13327: Error on insert (gist index)

From
cpro29a@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      13327
Logged by:          Sergey Melekhin
Email address:      cpro29a@gmail.com
PostgreSQL version: 9.3.6
Operating system:   Ubuntu 14.04
Description:

I stumbled upon error, when multiple inserts or updates on table with gist
index on many int[] columns fails.
I got "stack depth limit exceded" on update, but when I started to write
this test case I did not manage to get to updates - I got error when trying
to populate table:
ERROR:  failed to add item to index page in "test_gist_i"

CODE:

begin;

create table test_gist(id serial primary key, a1 int[],a2 int[], a3 int[],
a4 int[], a5 int[]);
create index test_gist_i on test_gist using gist (a1,a2,a3,a4,a5);

--Simple random array generator
create or replace function rnd_arr(p_len int) returns int[]
as $$
 select array_agg(val)
 from (select trunc(random()*1000.0)::int val
       from generate_series(1,trunc(random()*p_len*1.0)::int+1)
       ) v;
$$ language sql;


--##THIS FAILS:
insert into test_gist(a1,a2,a3,a4,a5)
select rnd_arr(10),rnd_arr(10),rnd_arr(10),rnd_arr(10),rnd_arr(10)
  from generate_series(1,100000);
;

--##WITH MESSAGE:
--ERROR:  failed to add item to index page in "test_gist_i"

rollback;

Re: BUG #13327: Error on insert (gist index)

From
Tom Lane
Date:
cpro29a@gmail.com writes:
> create table test_gist(id serial primary key, a1 int[],a2 int[], a3 int[],
> a4 int[], a5 int[]);
> create index test_gist_i on test_gist using gist (a1,a2,a3,a4,a5);

> --Simple random array generator
> create or replace function rnd_arr(p_len int) returns int[]
> as $$
>  select array_agg(val)
>  from (select trunc(random()*1000.0)::int val
>        from generate_series(1,trunc(random()*p_len*1.0)::int+1)
>        ) v;
> $$ language sql;

> --##THIS FAILS:
> insert into test_gist(a1,a2,a3,a4,a5)
> select rnd_arr(10),rnd_arr(10),rnd_arr(10),rnd_arr(10),rnd_arr(10)
>   from generate_series(1,100000);
> --##WITH MESSAGE:
> --ERROR:  failed to add item to index page in "test_gist_i"

I can reproduce this on HEAD (after installing contrib/intarray).
It's not very clear if this is a core-code bug or a fault in intarray's
picksplit function, but the stack trace is suspicious:

#0  errfinish (dummy=0) at elog.c:408
#1  0x00000000007b4df8 in elog_finish (elevel=<value optimized out>,
    fmt=<value optimized out>) at elog.c:1368
#2  0x000000000048464c in gistplacetopage (rel=0x7f7dfc2f2ff0,
    freespace=140735375882944, giststate=0xffffffff,
    buffer=<value optimized out>, itup=0x287f950, ntup=<value optimized out>,
    oldoffnum=0, newblkno=0x0, leftchildbuf=6689, splitinfo=0x7fff82163b88,
    markfollowright=1 '\001') at gist.c:358
#3  0x0000000000484cfb in gistinserttuples (state=0x7fff82164470,
    stack=0x7fff82164490, giststate=0x21b67e8, tuples=<value optimized out>,
    ntup=<value optimized out>, oldoffnum=<value optimized out>,
    leftchild=6689, rightchild=6687, unlockbuf=0 '\000',
    unlockleftchild=0 '\000') at gist.c:1142
#4  0x0000000000484eb3 in gistfinishsplit (state=0x7fff82164470,
    stack=0x21ad028, giststate=0x21b67e8, splitinfo=<value optimized out>,
    unlockbuf=0 '\000') at gist.c:1223
#5  0x0000000000484d9b in gistinserttuples (state=0x7fff82164470,
    stack=0x21ad028, giststate=0x21b67e8, tuples=<value optimized out>,
    ntup=<value optimized out>, oldoffnum=<value optimized out>,
    leftchild=6694, rightchild=6692, unlockbuf=0 '\000',
    unlockleftchild=0 '\000') at gist.c:1167
#6  0x0000000000484eb3 in gistfinishsplit (state=0x7fff82164470,
    stack=0x255cb10, giststate=0x21b67e8, splitinfo=<value optimized out>,
    unlockbuf=0 '\000') at gist.c:1223
#7  0x0000000000484d9b in gistinserttuples (state=0x7fff82164470,
    stack=0x255cb10, giststate=0x21b67e8, tuples=<value optimized out>,
    ntup=<value optimized out>, oldoffnum=<value optimized out>,
    leftchild=6698, rightchild=6696, unlockbuf=0 '\000',
    unlockleftchild=0 '\000') at gist.c:1167
#8  0x0000000000484eb3 in gistfinishsplit (state=0x7fff82164470,
    stack=0x2578830, giststate=0x21b67e8, splitinfo=<value optimized out>,
    unlockbuf=0 '\000') at gist.c:1223
#9  0x0000000000484d9b in gistinserttuples (state=0x7fff82164470,
    stack=0x2578830, giststate=0x21b67e8, tuples=<value optimized out>,
    ntup=<value optimized out>, oldoffnum=<value optimized out>,
    leftchild=6707, rightchild=6700, unlockbuf=0 '\000',
    unlockleftchild=0 '\000') at gist.c:1167
#10 0x0000000000484eb3 in gistfinishsplit (state=0x7fff82164470,
    stack=0x25942e8, giststate=0x21b67e8, splitinfo=<value optimized out>,
    unlockbuf=0 '\000') at gist.c:1223
#11 0x0000000000484d9b in gistinserttuples (state=0x7fff82164470,
    stack=0x25942e8, giststate=0x21b67e8, tuples=<value optimized out>,
    ntup=<value optimized out>, oldoffnum=<value optimized out>,
    leftchild=6711, rightchild=6709, unlockbuf=0 '\000',
    unlockleftchild=0 '\000') at gist.c:1167
#12 0x0000000000484eb3 in gistfinishsplit (state=0x7fff82164470,
    stack=0x259f0b0, giststate=0x21b67e8, splitinfo=<value optimized out>,
    unlockbuf=0 '\000') at gist.c:1223
#13 0x0000000000484d9b in gistinserttuples (state=0x7fff82164470,
    stack=0x259f0b0, giststate=0x21b67e8, tuples=<value optimized out>,
    ntup=<value optimized out>, oldoffnum=<value optimized out>,
    leftchild=6714, rightchild=6713, unlockbuf=0 '\000',
    unlockleftchild=0 '\000') at gist.c:1167
#14 0x0000000000484eb3 in gistfinishsplit (state=0x7fff82164470,
    stack=0x25cbdc8, giststate=0x21b67e8, splitinfo=<value optimized out>,
    unlockbuf=0 '\000') at gist.c:1223
#15 0x0000000000484d9b in gistinserttuples (state=0x7fff82164470,
    stack=0x25cbdc8, giststate=0x21b67e8, tuples=<value optimized out>,
    ntup=<value optimized out>, oldoffnum=<value optimized out>,
    leftchild=6718, rightchild=6717, unlockbuf=0 '\000',
    unlockleftchild=0 '\000') at gist.c:1167
#16 0x0000000000484eb3 in gistfinishsplit (state=0x7fff82164470,
    stack=0x25e7880, giststate=0x21b67e8, splitinfo=<value optimized out>,
    unlockbuf=1 '\001') at gist.c:1223
#17 0x0000000000484d9b in gistinserttuples (state=0x7fff82164470,
    stack=0x25e7880, giststate=0x21b67e8, tuples=<value optimized out>,
    ntup=<value optimized out>, oldoffnum=<value optimized out>,
    leftchild=6740, rightchild=6720, unlockbuf=1 '\001',
    unlockleftchild=0 '\000') at gist.c:1167
#18 0x0000000000484f3a in gistfinishsplit (state=0x7fff82164470,
    stack=0x2603338, giststate=0x21b67e8, splitinfo=<value optimized out>,
    unlockbuf=0 '\000') at gist.c:1248
#19 0x0000000000484d9b in gistinserttuples (state=0x7fff82164470,
    stack=0x2603338, giststate=0x21b67e8, tuples=<value optimized out>,
    ntup=<value optimized out>, oldoffnum=<value optimized out>, leftchild=0,
    rightchild=0, unlockbuf=0 '\000', unlockleftchild=0 '\000') at gist.c:1167
#20 0x0000000000484fb9 in gistinserttuple (state=<value optimized out>,
    stack=<value optimized out>, giststate=<value optimized out>,
    tuple=0x2605e78, oldoffnum=<value optimized out>) at gist.c:1101
#21 0x000000000048587e in gistdoinsert (r=0x7f7dfc2f2ff0, itup=0x21ba158,
    freespace=<value optimized out>, giststate=0x21b67e8) at gist.c:672
#22 0x0000000000485b79 in gistinsert (fcinfo=<value optimized out>)
    at gist.c:132
#23 0x00000000007b7984 in FunctionCall6Coll (flinfo=<value optimized out>,
    collation=<value optimized out>, arg1=<value optimized out>,
    arg2=<value optimized out>, arg3=<value optimized out>,
    arg4=<value optimized out>, arg5=140179078598768, arg6=0) at fmgr.c:1436
#24 0x00000000004aac8f in index_insert (indexRelation=0x7f7dfc2f2ff0,
    values=0x7fff82164a10, isnull=0x7fff821649f0 "", heap_t_ctid=0x203f81c,
    heapRelation=0x7f7dfc2f5070, checkUnique=UNIQUE_CHECK_NO) at indexam.c:226
#25 0x00000000005c7b1d in ExecInsertIndexTuples (slot=0x2035d80,
    tupleid=0x203f81c, estate=0x20357b8, noDupErr=0 '\000', specConflict=0x0,
    arbiterIndexes=0x0) at execIndexing.c:384
#26 0x00000000005e49cd in ExecInsert (node=0x2035a20) at nodeModifyTable.c:459
#27 ExecModifyTable (node=0x2035a20) at nodeModifyTable.c:1432


Should gistfinishsplit really be recursing like that?

Also, if you execute the INSERT a second time, it fails immediately,
apparently from trying to finish the uncompleted page split.

            regards, tom lane

Re: BUG #13327: Error on insert (gist index)

From
Tom Lane
Date:
I wrote:
> cpro29a@gmail.com writes:
>> create table test_gist(id serial primary key, a1 int[],a2 int[], a3 int[],
>> a4 int[], a5 int[]);
>> create index test_gist_i on test_gist using gist (a1,a2,a3,a4,a5);

>> --Simple random array generator
>> create or replace function rnd_arr(p_len int) returns int[]
>> as $$
>> select array_agg(val)
>> from (select trunc(random()*1000.0)::int val
>> from generate_series(1,trunc(random()*p_len*1.0)::int+1)
>> ) v;
>> $$ language sql;

>> --##THIS FAILS:
>> insert into test_gist(a1,a2,a3,a4,a5)
>> select rnd_arr(10),rnd_arr(10),rnd_arr(10),rnd_arr(10),rnd_arr(10)
>> from generate_series(1,100000);
>> --##WITH MESSAGE:
>> --ERROR:  failed to add item to index page in "test_gist_i"

> I can reproduce this on HEAD (after installing contrib/intarray).

On closer inspection, the answer is "you should be using gist__intbig_ops".
gist__intbig_ops uses a different representation that can deal with more
distinct array entries than the default gist_int_ops opclass can.

The reason for the failure is that the index entries have gotten up to
over 4K apiece, so that splitting does not help (or maybe that GIST
thinks it should be able to put two entries in a page and it can't).

It'd be nice if the error thrown was a little clearer, but I don't think
there is anything we can or should do to make this example actually succeed.

Note that gist_int_ops' g_int_compress function does actually contain a
check that warns you once you've got more than 200 distinct values in an
index item.  Unfortunately, this example is trying to cram five arrays
into each index tuple, so the total tuple size exceeds what GIST can cope
with before that check fires for any one array value.  It doesn't look to
me like g_int_compress has any way to know how many index columns there
are, else maybe it could adjust its threshold for complaining.

It might be reasonable for the core GIST code to emit some complaint about
index tuple size rather than the very opaque "failed to add item" message.
Not sure exactly what test is appropriate though.  Should we be limiting
GIST index entries to half a page?

            regards, tom lane

Re: BUG #13327: Error on insert (gist index)

From
Сергей Мелехин
Date:
<p dir="ltr">Oh, thanks! I forgot to add gist_intbig_ops.<div class="gmail_quote">22 мая 2015 г. 4:57 пользователь "Tom
Lane"<<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>> написал:<br type="attribution" /><blockquote
class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I wrote:<br /> > <a
href="mailto:cpro29a@gmail.com">cpro29a@gmail.com</a>writes:<br /> >> create table test_gist(id serial primary
key,a1 int[],a2 int[], a3 int[],<br /> >> a4 int[], a5 int[]);<br /> >> create index test_gist_i on
test_gistusing gist (a1,a2,a3,a4,a5);<br /><br /> >> --Simple random array generator<br /> >> create or
replacefunction rnd_arr(p_len int) returns int[]<br /> >> as $$<br /> >> select array_agg(val)<br />
>>from (select trunc(random()*1000.0)::int val<br /> >> from
generate_series(1,trunc(random()*p_len*1.0)::int+1)<br/> >> ) v;<br /> >> $$ language sql;<br /><br />
>>--##THIS FAILS:<br /> >> insert into test_gist(a1,a2,a3,a4,a5)<br /> >> select
rnd_arr(10),rnd_arr(10),rnd_arr(10),rnd_arr(10),rnd_arr(10)<br/> >> from generate_series(1,100000);<br />
>>--##WITH MESSAGE:<br /> >> --ERROR:  failed to add item to index page in "test_gist_i"<br /><br /> > I
canreproduce this on HEAD (after installing contrib/intarray).<br /><br /> On closer inspection, the answer is "you
shouldbe using gist__intbig_ops".<br /> gist__intbig_ops uses a different representation that can deal with more<br />
distinctarray entries than the default gist_int_ops opclass can.<br /><br /> The reason for the failure is that the
indexentries have gotten up to<br /> over 4K apiece, so that splitting does not help (or maybe that GIST<br /> thinks
itshould be able to put two entries in a page and it can't).<br /><br /> It'd be nice if the error thrown was a little
clearer,but I don't think<br /> there is anything we can or should do to make this example actually succeed.<br /><br
/>Note that gist_int_ops' g_int_compress function does actually contain a<br /> check that warns you once you've got
morethan 200 distinct values in an<br /> index item.  Unfortunately, this example is trying to cram five arrays<br />
intoeach index tuple, so the total tuple size exceeds what GIST can cope<br /> with before that check fires for any one
arrayvalue.  It doesn't look to<br /> me like g_int_compress has any way to know how many index columns there<br />
are,else maybe it could adjust its threshold for complaining.<br /><br /> It might be reasonable for the core GIST code
toemit some complaint about<br /> index tuple size rather than the very opaque "failed to add item" message.<br /> Not
sureexactly what test is appropriate though.  Should we be limiting<br /> GIST index entries to half a page?<br /><br
/>                        regards, tom lane<br /></blockquote></div>