Thread: Primary Key Index Bloat?

Primary Key Index Bloat?

From
David Johnston
Date:
Hi,

I have a table with the following usage characteristic:

INSERT bulk data using INSERT INTO ... SELECT ... FROM
-- this table uses a varchar(50) for the PK
-- the PK is rarely (effectively never unless a mistake was made) altered
-- always appending to the existing table; some bulk deletions in the past
-- no toasting of values

UPDATE these records many, many times (but never the PK)

At the moment the table itself (and thus all columns) is showing as size of
just under 5GB
The corresponding index for the PK (a single column, also present on the
table...) is between 8-10GB [1]  I have never REINDEXed this table that I
can recall.

I am guessing that it is the need for the index to point to new versions of
the physical record that the index is churning so much and causing this kind
of bloat?

[1] I actually inadvertently have two indexes, one unique and one not, over
the same field.  The sizes of these two indexes are the rough endpoints of
the 8-10GB range I provided.  It appears that both indexes are being chosen
for use by the planner.

I am preparing to REINDEX the unique index and DROP the non-unique one over
the same field - probably Tuesday evening.  Does everything I am saying here
sound kosher or would someone like me to provide additional information?

The deletions were fairly voluminous when they occurred and so I would have
expected that enough empty pages would have been freed to avoid any
significant index bloat.

I have mostly left VACUUMing to auto-vacuum though the occasional manual one
has been performed.  Never performed VACUUM FULL.

Been running 9.0 on Ubuntu 10.04 since day one with a pg_restore load.  A
while ago and no comparative metrics available.

I greatly appreciate any sanity checks you can provide.

Thanks!

David J.




--
View this message in context: http://postgresql.1045698.n5.nabble.com/Primary-Key-Index-Bloat-tp5778840.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Primary Key Index Bloat?

From
Chris Ernst
Date:
On 11/17/2013 11:48 PM, David Johnston wrote:
> I am guessing that it is the need for the index to point to new versions of
> the physical record that the index is churning so much and causing this kind
> of bloat?

Bingo.

> I am preparing to REINDEX the unique index and DROP the non-unique one over
> the same field - probably Tuesday evening.  Does everything I am saying here
> sound kosher or would someone like me to provide additional information?
...
> I have mostly left VACUUMing to auto-vacuum though the occasional manual one
> has been performed.  Never performed VACUUM FULL.

REINDEX will only rebuild the indexes (obviously) and VACUUM FULL would
clean up any bloat in the indexes and the table itself.  Either would
require an exclusive lock on the table.

If the exclusive lock is a problem, you might look in to pg_repack as an
option as well.  It essentially recreates the table in parallel then
swaps in the newly built one.  It only requires an exclusive lock for a
moment.

    - Chris


Re: Primary Key Index Bloat?

From
Sergey Konoplev
Date:
On Sun, Nov 17, 2013 at 10:48 PM, David Johnston <polobo@yahoo.com> wrote:
> I am preparing to REINDEX the unique index and DROP the non-unique one over
> the same field - probably Tuesday evening.  Does everything I am saying here
> sound kosher or would someone like me to provide additional information?

[...]

> I have mostly left VACUUMing to auto-vacuum though the occasional manual one
> has been performed.  Never performed VACUUM FULL.

You can try pgcompact from PgToolkit [1]

pgcompact  -d dbname -t tablename -r -v info

that will concurrently reindex those indexes that are bloated and will
remove bloat from the table without recreating it in the case if it is
bloated only. It might require pgstattuple extension [2] to be
installed [3] .

[1] https://github.com/grayhemp/pgtoolkit
[2] http://www.postgresql.org/docs/9.0/static/pgstattuple.html
[3] http://www.postgresql.org/docs/9.0/static/contrib.html

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com