Vacuum, visibility maps and SKIP_PAGES_THRESHOLD - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Vacuum, visibility maps and SKIP_PAGES_THRESHOLD
Date
Msg-id BANLkTinpmxFAQynFbqx0Li2uT=z58bPEiw@mail.gmail.com
Whole thread Raw
Responses Re: Vacuum, visibility maps and SKIP_PAGES_THRESHOLD  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Vacuum, visibility maps and SKIP_PAGES_THRESHOLD  (Cédric Villemain <cedric.villemain.debian@gmail.com>)
Re: Vacuum, visibility maps and SKIP_PAGES_THRESHOLD  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
I wonder if we have tested the reasoning behind having
SKIP_PAGES_THRESHOLD and the magic number of 32 assigned to it
currently. While looking at the code after a long time and doing some
tests, I realized that a manual VACUUM would always scan first 31
pages of a relation which has not received any write activity since
the last VACUUM. On closer inspection, I realized that we have
deliberately put in this hook to ensure that we use visibility maps
only when we see at least SKIP_PAGES_THRESHOLD worth of all-visible
sequential pages to take advantage of possible OS seq scan
optimizations.

My statistical skills are limited, but wouldn't that mean that for a
fairly well distributed write activity across a large table, if there
are even 3-4% update/deletes, we would most likely hit a
not-all-visible page for every 32 pages scanned ? That would mean that
almost entire relation will be scanned even if the visibility map
tells us that only 3-4% pages require scanning ?  And the probability
will increase with the increase in the percentage of updated/deleted
tuples. Given that the likelihood of anyone calling VACUUM (manually
or through autovac settings) on a table which has less than 3-4%
updates/deletes is very low, I am worried that might be loosing all
advantages of visibility maps for a fairly common use case.

Do we have any numbers to prove what we have today is good ? Sorry, I
may not have followed the discussions very closely in the past and not
sure if this has been debated/tested already.

Thanks,
Pavan





--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: dblink crash on PPC
Next
From: Tom Lane
Date:
Subject: Re: Vacuum, visibility maps and SKIP_PAGES_THRESHOLD