Re: why is gist index taking so much space on the disc - Mailing list pgsql-hackers

From Martijn van Oosterhout
Subject Re: why is gist index taking so much space on the disc
Date
Msg-id 20051121183214.GF16764@svana.org
Whole thread Raw
In response to why is gist index taking so much space on the disc  (Grzegorz Jaskiewicz <gj@pointblue.com.pl>)
List pgsql-hackers
On Mon, Nov 21, 2005 at 04:58:25PM +0100, Grzegorz Jaskiewicz wrote:
> my conquers with Gist index for custom type are nearly finished. It
> is working as it is now, but there are few problems here and there.
> One of em, being amount of disc space index it self takes. The type
> stucture it self takes 160bytes. Adding 100.000 rows into table -
> CREATE TABLE blah (a serial, b customType);

Let's see, 160bytes means you'll get aboud 50 keys per page. So you
would expect 2000 leaf page, 40 level 1 pages. This should be less than
20-30MB

<snip>

> Is it normal that index is so hudge ? Even tho my type has built in
> masks (element that can match few different values), and %. up front
> the string (which behaves just like the sql % in b ~ '%.something').
> And both are used to build "unions" for pick-split, and other
> operations. Is it because of pick-split it self ? It does good work
> in splitting up table of elements into two separate ones, by sorting
> them first, than creating common "mask" for L and P. And by scanning
> whole table again, and putting elements matching into L or P. L and P
> elements sometimes overlap, but so far I can't find better solution.

You mean you sometimes put the same elements in the two halves? You
shouldn't do that. The whole point is that the search will descend any
node that matches consistant, but any single key should only appear
once in each index.

picksplit should *split* the set, not return two sets about the same
size as you started...

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: PostgreSQL 8.1.0 catalog corruption
Next
From: "Jim C. Nasby"
Date:
Subject: Re: MERGE vs REPLACE