Re: Fixing GIN for empty/null/full-scan cases - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Fixing GIN for empty/null/full-scan cases
Date
Msg-id 2883.1295386757@sss.pgh.pa.us
Whole thread Raw
In response to Re: Fixing GIN for empty/null/full-scan cases  (David E. Wheeler <david@kineticode.com>)
Responses Re: Fixing GIN for empty/null/full-scan cases  ("David E. Wheeler" <david@kineticode.com>)
List pgsql-hackers
"David E. Wheeler" <david@kineticode.com> writes:
> One of the reasons our client wants GIN for the integer[] column so bad is because recreating the GiST integer[]
indexis quite painful. Before I duped the table, I was just dropping and recreating the index on the original table. It
wasgreat to create the GIN index; for 400K rows, it took 1300 ms. After my initial round of testing, I dropped it and
createdthe GiST index. That ran for�well, *hours*. Four at least, maybe six or seven (I forgot \timing and was letting
itrun on screen while I did some iOS hacking). I think something might be really wrong with GiST index creation for
integerarrays, because the difference was just appalling. 
 

I poked into this a bit, although it's really off-topic for what I was
doing in GIN, and I agree that there is something fundamentally rotten
in the gist__int_ops logic.  oprofile shows that the code is spending
all its time in g_int_compress and g_int_union during index build:

samples  %        app name                 symbol name
52747    40.2486  _int.so                  g_int_compress
27092    20.6726  libc-2.12.2.so           _wordcopy_fwd_aligned
18938    14.4506  _int.so                  compASC
18256    13.9302  postgres                 pg_qsort
5741      4.3807  no-vmlinux               /no-vmlinux
3297      2.5158  postgres                 swapfunc
864       0.6593  _int.so                  g_int_decompress
826       0.6303  _int.so                  _int_unique
700       0.5341  _int.so                  inner_int_union
617       0.4708  postgres                 med3
588       0.4487  libc-2.12.2.so           memset
371       0.2831  _int.so                  g_int_same
143       0.1091  libc-2.12.2.so           memcpy
123       0.0939  postgres                 ArrayGetNItems
67        0.0511  _int.so                  inner_int_inter
48        0.0366  postgres                 AllocSetAlloc
47        0.0359  libc-2.12.2.so           memmove
40        0.0305  postgres                 MemoryContextAllocZero

The time in g_int_compress is all on this loop:
       while (len > MAXNUMRANGE * 2)       {           min = 0x7fffffff;           for (i = 2; i < len; i += 2)
     if (min > (dr[i] - dr[i - 1]))               {                   min = (dr[i] - dr[i - 1]);                   cand
=i;               }           memmove((void *) &dr[cand - 1], (void *) &dr[cand + 1], (len - cand - 1) *
sizeof(int32));          len -= 2;       }
 

which is not too surprising since that's obviously O(N^2).  The other
high-percentage functions are qsort and subsidiaries, and those calls
are coming from g_int_union.  That part could probably be sped up, since
most of the calls are unioning two inputs, which could be done a lot
cheaper than this (the inputs are always presorted no?).  But there is
something really fundamentally wrong in the logic, I think, because
while poking at it with gdb I saw it union-ing fifty-thousand-element
arrays.  Surely it should get lossy before it gets to that.  I'm also
wondering how it tells the difference between lossy and non-lossy keys
--- although it's hard to tell what such spectacularly uncommented code
is supposed to be doing, it looks like the lossy case is supposed to be
pairs of values representing ranges, in which case g_int_compress is
surely doing the Wrong Thing with already-lossy inputs, and union'ing
lossy inputs is an entirely insane operation as well.

At the moment my opinion is that gist__int_ops is too broken to be
usable, and it's also too uncommented to be fixable by anyone other
than the original author.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: ToDo List Item - System Table Index Clustering
Next
From: Pavel Stehule
Date:
Subject: Re: Re: patch: fix performance problems with repated decomprimation of varlena values in plpgsql