Re: FSM versus GIN pending list bloat - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: FSM versus GIN pending list bloat
Date
Msg-id 55C0673E.40302@iki.fi
In response to FSM versus GIN pending list bloat  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On 08/04/2015 08:03 AM, Jeff Janes wrote:
> For a GIN index with fastupdate turned on, both the user backends and
> autoanalyze routine will clear out the pending list, pushing the entries
> into the normal index structure and deleting the pages used by the pending
> list.  But those deleted pages will not get added to the freespace map
> until a vacuum is done.  This leads to horrible bloat on insert-only
> tables, as they are never vacuumed and so the pending list space is never
> reused.  And the pending list is very inefficient in space usage to start
> with, even compared to the old style posting lists and especially compared
> to the new compressed ones.  (If they were aggressively recycled, this
> inefficient use wouldn't be much of a problem.)

Good point.

> The attached proof of concept patch greatly improves the bloat for both the
> insert and the update cases.  You need to turn on both features: adding the
> pages to fsm, and vacuuming the fsm, to get the benefit (so JJ_GIN=3).  The
> first of those two things could probably be adopted for real, but the
> second probably is not acceptable.  What is the right way to do this?
> Could a variant of RecordFreeIndexPage bubble the free space up the map
> immediately rather than waiting for a vacuum?  It would only have to move
> up until it found a page with freespace already recorded in it, which the
> vast majority of the time would mean looking up one level and then not
> writing to it, assuming the pending list pages remain well clustered.

Yep, that sounds good.

- Heikki
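
The bubble-up idea discussed above can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL's FSM code: the names ToyFsm, TOY_LEAVES and toy_record_and_bubble are hypothetical, the map is a single in-memory binary tree in which each parent is supposed to hold the maximum of its children, and only the "page became free" direction is handled. The walk stops at the first ancestor that already advertises at least as much free space, which is the early exit the proposal relies on.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_LEAVES 8                     /* hypothetical: 8 index pages tracked */
#define TOY_NODES  (2 * TOY_LEAVES - 1)  /* complete binary tree kept in an array */

/* Toy stand-in for a free space map; node 0 is the root, leaves come last. */
typedef struct ToyFsm {
    uint8_t avail[TOY_NODES];
} ToyFsm;

/*
 * Record `value` as the free space of leaf `leaf`, then propagate it upward.
 * Stops at the first ancestor that already holds a value >= `value`, since
 * everything above that ancestor is already large enough.
 * Returns how many nodes above the leaf had to be written.
 * Assumption: only increases are recorded here; lowering a leaf would need
 * a recomputation of max(children) that this sketch does not do.
 */
int toy_record_and_bubble(ToyFsm *fsm, size_t leaf, uint8_t value)
{
    size_t node = (TOY_LEAVES - 1) + leaf;  /* array index of the leaf */
    int writes = 0;

    fsm->avail[node] = value;
    while (node > 0) {
        size_t parent = (node - 1) / 2;

        if (fsm->avail[parent] >= value)
            break;                          /* ancestor already advertises it */
        fsm->avail[parent] = value;
        writes++;
        node = parent;
    }
    return writes;
}
```

Starting from an all-zero tree, freeing leaf 0 writes three upper nodes, while freeing its neighbour leaf 1 afterwards writes none, because the shared parent already records the space. That is the well-clustered case described in the quoted message.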



