Re: index bloat - Mailing list pgsql-general

From Tom Lane
Subject Re: index bloat
Date
Msg-id 20976.1121283897@sss.pgh.pa.us
Whole thread Raw
In response to Re: index bloat  ("David Esposito" <pgsql-general@esposito.newnetco.com>)
Responses Re: index bloat
List pgsql-general
"David Esposito" <pgsql-general@esposito.newnetco.com> writes:
> Hmm, if I keep running the following query while the test program is going
> (giving it a few iterations to rest between executions), the steady-state
> usage of the indexes seems to go up ... it doesn't happen every time you run
> the query, but if you do it 10 times, it seems to go up at least once every
> few times you run it .. And the usage keeps charging upwards long after the
> UPDATE query finishes (at least 3 or 4 iterations afterwards until it levels
> off again) ... It would seem like the steady-state should be reached after
> the first couple of runs and then never creep up any further because there
> should be enough slack in the index, right?

> UPDATE bigboy SET creation_date = CURRENT_TIMESTAMP
> WHERE creation_date BETWEEN CURRENT_TIMESTAMP - INTERVAL '15 seconds'
> AND CURRENT_TIMESTAMP - INTERVAL '5 seconds';

Hmm, this is preferentially touching stuff near the right end of the
index, ie, it's going to bloat the pages associated with higher keys.
As I understand your usage of these indexes, pages generally only get
reclaimed off the left end (as records get old enough to be moved to
archival storage).  So if you were to repeat this test for long enough
for the data to cycle all the way through the table and out again
(50 days in your real-world usage) then the extra space would be
evenly distributed and the usage would reach a steady state.

The testing I've been doing so far involves UPDATEs that touch a
uniformly distributed subset of the table --- maybe that's the aspect
that is failing to match your reality.  Do you mostly update
recently-added rows?  Can you quantify the effect at all?

> Is there any way to disassemble an index (either through some fancy SQL
> query or by running the actual physical file through a tool) to get an idea
> on where the slack could be accumulating?

No such code exists AFAIK, though the idea is sounding pretty attractive
at the moment ;-).  You could get some crude numbers by adding debug
printouts to btbulkdelete() ...

            regards, tom lane

pgsql-general by date:

Previous
From: "Matt McNeil"
Date:
Subject: Transparent encryption in PostgreSQL?
Next
From: "Joshua D. Drake"
Date:
Subject: Re: Transparent encryption in PostgreSQL?