Re: Long-running and non-finishing VACUUM ANALYZE on large table - Mailing list pgsql-admin

From Guillaume Lelarge
Subject Re: Long-running and non-finishing VACUUM ANALYZE on large table
Date
Msg-id CAECtzeVFuU5oe1OKbRvfo-rWL+q=F80Ju2MyX=4C4MZzXSKH-A@mail.gmail.com
Whole thread Raw
In response to Re: Long-running and non-finishing VACUUM ANALYZE on large table  (Jan <pgsql.admin@j.mk-contact.de>)
List pgsql-admin
2015-10-10 16:38 GMT+02:00 Jan <pgsql.admin@j.mk-contact.de>:

On 10/07/2015 01:53 AM, Tom Lane wrote:
Such an UPDATE should have left the table 50% dead tuples, since every
row would leave behind a dead version.  On the other hand, an ALTER
COLUMN TYPE operation should rewrite the whole table and leave no dead
tuples behind.  No matter which one you did last, it doesn't square with
0.32% dead tuples.

My best guess at this point is that what you did last is an UPDATE,
so you have 50% dead tuples, and for some reason pgstattuple is not
telling you the truth about that.  But the VACUUM is showing reality.

How long did those UPDATEs and ALTER TABLEs take?  If an ALTER seemed
tolerable then maybe what you want to do is VACUUM FULL, which would
be roughly the same cost.               

Tom, as you predicted, the VACUUM FULL finished relatively quickly yesterday after about 31h. Here is the verbose output:

INFO:  vacuuming "public.protein_hsps"
INFO:  "protein_hsps": found 63187655 removable, 11353611882 nonremovable row versions in 181253461 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 2814.17s/8479.90u sec elapsed 15451.26 sec.
Total query runtime: 114969739 ms.


Here is the output from pgstattuple (before / after vacuum):

-- SELECT * FROM pgstattuple('protein_hsps');
--
-- yielded:

table_len    1484828352512  / 958853496832
tuple_count    11353611882  / 11353611882
tuple_len    874228114914 / 874228114914
tuple_percent    58.88 / 91.17
dead_tuple_count    63187655 / 0
dead_tuple_len    4810998304 / 0
dead_tuple_percent    0.32 / 0
free_space    495246133064 / 1872767456
free_percent    33.35 / 0.2


And the table sizes (before / after vacuum):

-- SELECT nspname || '.' || relname AS "relation", pg_size_pretty(pg_relation_size(C.oid)) AS "size"
-- FROM pg_class C LEFT JOIN pg_namespace N ON N.oid = C.relnamespace
-- WHERE relname IN ('protein_hsps', 'protein_hsps_clustidx_on_origin', 'protein_hsps_pkey')
--
-- yielded:

public.protein_hsps    1383 GB  / 893 GB
public.protein_hsps_clustidx_on_origin    499 GB  / 238 GB
public.protein_hsps_pkey    494 GB  / 238 GB


The only thing which I currently not understand is the pgstattuple output, which tells me that there is 0.2% free space left. Actually there are 7.3T (i.e., 71%) left on the device so this is either a wrong display or this value refers to something other than the free disk space on the drive.


Actually, it is the free space inside the table's files. It has nothing to do with your device (PostgreSQL doesn't know how much free space is available on your disks).


--

pgsql-admin by date:

Previous
From: Jan
Date:
Subject: Re: Long-running and non-finishing VACUUM ANALYZE on large table
Next
From: KhunSanAung
Date:
Subject: How to create users for a login Role?