Re: Long-running and non-finishing VACUUM ANALYZE on large table - Mailing list pgsql-admin

From Tom Lane
Subject Re: Long-running and non-finishing VACUUM ANALYZE on large table
Date
Msg-id 9857.1444175592@sss.pgh.pa.us
Whole thread Raw
In response to Re: Long-running and non-finishing VACUUM ANALYZE on large table  (Jan <pgsql.admin@j.mk-contact.de>)
Responses Re: Long-running and non-finishing VACUUM ANALYZE on large table  (Jan <pgsql.admin@j.mk-contact.de>)
Re: Long-running and non-finishing VACUUM ANALYZE on large table  (Jan <pgsql.admin@j.mk-contact.de>)
List pgsql-admin
Jan <pgsql.admin@j.mk-contact.de> writes:
> I'm still not getting the math behind it. The below tuple stats show a
> dead tuple count of 63,187,655 whereas the PGadmin output (see my
> initial e-mail) reported the message "scanned index protein_hsps_pkey to
> remove 178956753 row versions" nine times before I cancelled it. That
> is, if one multiplies 178,956,753 by 9 it yields 1,610,610,777 (dead)
> rows. But the latter number is much higher than the above 63m rows? Do I
> compare the wrong numbers?

There's something awfully wacky about that.  I suspect that pgstattuple
is somehow giving wrong answers, but I don't see any plausible theory
as to why.

> Some more background: the whole database was recently migrated to a new
> database server and thus restored from a dump file. That is, the table
> 'protein_hsps' and its index were build from scratch. Since then, the
> only operations on that table were some column type conversions (e.g.,
> integer to smallint, double to real). Data-wise, the only operations
> were UPDATES on a single boolean column by adding precalculated values
> (true/false) to EACH row in the database (~ 16bn rows). These UPDATEs
> were most likely the cause for the (huge) number of dead tuples (0.32%,
> see above), weren't they?

Such an UPDATE should have left the table 50% dead tuples, since every
row would leave behind a dead version.  On the other hand, an ALTER
COLUMN TYPE operation should rewrite the whole table and leave no dead
tuples behind.  No matter which one you did last, it doesn't square with
0.32% dead tuples.

My best guess at this point is that what you did last is an UPDATE,
so you have 50% dead tuples, and for some reason pgstattuple is not
telling you the truth about that.  But the VACUUM is showing reality.

How long did those UPDATEs and ALTER TABLEs take?  If an ALTER seemed
tolerable then maybe what you want to do is VACUUM FULL, which would
be roughly the same cost.

            regards, tom lane


pgsql-admin by date:

Previous
From: Payal Singh
Date:
Subject: Re: Upgrade master / slave
Next
From: Jan
Date:
Subject: Re: Long-running and non-finishing VACUUM ANALYZE on large table