Thread: what specifically does vacuum have to scan / why does it need to rescan the same indexes many, many times


What does vacuum have to scan in order to reclaim space, and how many times does it need to scan before it finishes?

More specifically, my VACUUM VERBOSE <table> is taking a long time and seems to be rescanning the same indexes / fields multiple times without finishing.

db=# vacuum verbose testruns;
INFO:  vacuuming "public.testruns"
INFO:  scanned index "testruns_pkey" to remove 2795912 row versions
DETAIL:  CPU 11.92s/20.58u sec elapsed 55.07 sec.
INFO:  scanned index "index_testruns_on_build_os" to remove 2795912 row versions
DETAIL:  CPU 13.66s/87.02u sec elapsed 127.30 sec.
...
INFO:  scanned index "testruns_custom_data_type" to remove 2795913 row versions
DETAIL:  CPU 103.91s/38.58u sec elapsed 625.51 sec.
INFO:  scanned index "testruns_lower_machine" to remove 2795913 row versions
DETAIL:  CPU 1.22s/1.75u sec elapsed 4.83 sec.
INFO:  "testruns": removed 2795913 row versions in 846930 pages
DETAIL:  CPU 12.83s/34.68u sec elapsed 79.09 sec.
INFO:  scanned index "testruns_pkey" to remove 2795913 row versions
DETAIL:  CPU 11.17s/17.62u sec elapsed 42.38 sec.
INFO:  scanned index "index_testruns_on_build_os" to remove 2795913 row versions
DETAIL:  CPU 11.20s/16.26u sec elapsed 52.65 sec.
...

This has now occurred at least half a dozen times:

% pbpaste | grep "testruns_pkey"
INFO:  scanned index "testruns_pkey" to remove 2795912 row versions
INFO:  scanned index "testruns_pkey" to remove 2795914 row versions
INFO:  scanned index "testruns_pkey" to remove 2795912 row versions
INFO:  scanned index "testruns_pkey" to remove 2795913 row versions
INFO:  scanned index "testruns_pkey" to remove 2795913 row versions
INFO:  scanned index "testruns_pkey" to remove 2795913 row versions
INFO:  scanned index "testruns_pkey" to remove 2795912 row versions
INFO:  scanned index "testruns_pkey" to remove 2795914 row versions

I'm aware that certain operations, like creating an index concurrently, need to scan a table multiple times for completeness, but I'm not sure why vacuum needs to rescan all of the indexes over and over again without making any apparent progress.

This is Postgres 9.3.3 on OS X 10.9.*

thanks,

~ john
On Tuesday, July 1, 2014, john gale <john@smadness.com> wrote:

> What does vacuum have to scan in order to reclaim space, and how many times does it need to scan before it finishes?
>
> More specifically, my VACUUM VERBOSE <table> is taking a long time and seems to be rescanning the same indexes / fields multiple times without finishing.

vacuum can only remember one dead tuple for every 6 bytes of maintenance_work_mem (each dead-tuple pointer takes 6 bytes).  If there are more dead tuples than fit in that memory, it has to make multiple passes over the indexes, removing one batch of dead tuples per pass.
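
That matches the numbers in your log, assuming you are running with the default maintenance_work_mem of 16MB (the 9.3 default):

    16MB = 16,777,216 bytes
    16,777,216 bytes / 6 bytes per dead tuple ≈ 2.8 million dead tuples per pass

which is right around the ~2,795,913 row versions each of your index scans reports removing.  The vacuum is still making progress: each round removes one memory-full of dead tuples from the indexes and then from the heap (the "removed 2795913 row versions in 846930 pages" line), and then continues scanning the rest of the table.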
 
Increase maintenance_work_mem so that more dead tuples fit in memory and fewer index passes are needed.
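
For example, as a sketch (the 1GB value is just an illustration; size it to the RAM you can spare and the number of dead tuples you expect):

db=# -- raise the limit for this session only, then re-run the vacuum
db=# SET maintenance_work_mem = '1GB';
db=# VACUUM VERBOSE testruns;

At 6 bytes per dead tuple, 1GB is enough to remember roughly 180 million dead tuples, so the ~2.8 million you are removing per pass today would fit in a single index pass.  (On 9.3, VACUUM won't use more than about 1GB for its dead-tuple list anyway, so going higher than that doesn't help this particular problem.)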

Cheers,

Jeff