Re: Index Bloat Problem - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Index Bloat Problem
Date
Msg-id CAMkU=1ysTfuS+7DkvmhASN4rzzTJLGxkb4QKd-r6gFV6jkwwBQ@mail.gmail.com
Whole thread Raw
In response to Re: Index Bloat Problem  (Strahinja Kustudić <strahinjak@nordeus.com>)
Responses Re: Index Bloat Problem
List pgsql-performance
On Thu, Aug 16, 2012 at 12:57 PM, Strahinja Kustudić
<strahinjak@nordeus.com> wrote:
>
> @Jeff I'm not sure if I understand what you mean? I know that we never reuse
> key ranges. Could you be more clear, or give an example please.

If an index leaf page is completely empty because every entry on it
were deleted, it will get recycled to be used in some other part of
the index.  (Eventually--it can take a while, especially if you have
long-running transactions).

But if the leaf page is only mostly empty, because only most of
entries on it were deleted, than it can never be reused, except for
entries that naturally fall into its existing key range (which will
never happen, if you never reuse key ranges)

So if you have a million records with keys 1..1000000, and do a
"delete from foo where key between 1 and 990000", then 99% of those
old index pages will become completely empty and eligible for reuse.
But if you do "delete from foo where key%100>0", then all of the pages
will become 99% empty, and none will be eligible for reuse (except the
very last one, which can still accept 1000001 and so on)

There has  been talk of allowing logically adjacent, mostly empty
pages to be merged so that one of them becomes empty, but the way
concurrent access to btree indexes was designed this is extremely hard
to do safely.

Cheers,

Jeff


pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: Re: Increasing WAL usage followed by sudden drop
Next
From: Greg Williamson
Date:
Subject: Re: Index Bloat Problem