Re: gistchoose vs. bloat - Mailing list pgsql-hackers

From	Jeff Davis
Subject	Re: gistchoose vs. bloat
Date	December 14, 2012 18:12:14
Msg-id	1355508729.11945.26.camel@jdavis
In response to	Re: gistchoose vs. bloat (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses	Re: gistchoose vs. bloat
List	pgsql-hackers

On Fri, 2012-12-14 at 18:36 +0200, Heikki Linnakangas wrote:
> One question: does the randomization ever help when building a new 
> index? In the original test case, you repeatedly delete and insert 
> tuples, and I can see how the index can get bloated in that case. But I 
> don't see how bloat would occur when building the index from scratch.

When building an index on a bunch of identical int4range values (in my
test, [1,10) ), the resulting index was about 17% smaller with
randomization than without it.

If the current algorithm always chooses to insert on the left-most page,
then it seems like there would be a half-filled right page for every
split that occurs. Is that reasoning correct?
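
A toy simulation can make this reasoning concrete. The sketch below is not
the gistchoose code: it assumes a flat list of leaf pages with no internal
tree levels, identical keys (so every page has equal penalty), and an even
split when a page overflows. The names build, leftmost and
make_random_choice are hypothetical and exist only for this illustration.

```python
import random


def build(n_tuples, capacity, choose):
    """Insert n_tuples identical keys into a flat list of leaf pages.

    pages[i] is the number of tuples on page i. `choose` picks the page
    that receives the next tuple (all pages have equal penalty here).
    """
    pages = [0]
    for _ in range(n_tuples):
        i = choose(pages)
        if pages[i] == capacity:
            # Page is full: split it evenly and put the upper half on a
            # new page immediately to its right.
            half = capacity // 2
            pages[i] = capacity - half
            pages.insert(i + 1, half)
        pages[i] += 1
    return pages


def leftmost(pages):
    """Always pick the first page, modelling 'insert on the left-most page'."""
    return 0


def make_random_choice(seed):
    """Return a chooser that picks uniformly among the existing pages."""
    rng = random.Random(seed)
    return lambda pages: rng.randrange(len(pages))


if __name__ == "__main__":
    n, cap = 100_000, 100
    choosers = [("leftmost", leftmost), ("random", make_random_choice(0))]
    for name, choose in choosers:
        pages = build(n, cap, choose)
        fill = n / (len(pages) * cap)
        print(f"{name:8s} pages={len(pages):5d} avg fill={fill:.2f}")
```

In this model the left-most policy never revisits a page once it has been
split off to the right, so every page except the first stays exactly half
full. The random policy keeps adding to all pages and ends up with fewer of
them. The model ignores internal pages and the real split algorithm, so its
numbers should not be expected to match the 17% measured above.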

However, I'm having some second thoughts about the run time for index
builds. Maybe we should have a few more tests to determine if this
should really be the default or just an option?

> BTW, I don't much like the option name "randomization". It's not clear 
> what's been randomized. I'd prefer something like 
> "distribute_on_equal_penalty", although that's really long. Better ideas?

I agree that "randomization" is vague, but I can't think of anything
better.

Regards,
	Jeff Davis
