Re: Vacuum, visibility maps and SKIP_PAGES_THRESHOLD - Mailing list pgsql-hackers

From Cédric Villemain
Subject Re: Vacuum, visibility maps and SKIP_PAGES_THRESHOLD
Date
Msg-id BANLkTimPfL8ZieKjWAas57og1v9Nh7BT_A@mail.gmail.com
Whole thread Raw
In response to Re: Vacuum, visibility maps and SKIP_PAGES_THRESHOLD  (Cédric Villemain <cedric.villemain.debian@gmail.com>)
List pgsql-hackers
2011/5/27 Cédric Villemain <cedric.villemain.debian@gmail.com>:
> 2011/5/27 Pavan Deolasee <pavan.deolasee@gmail.com>:
>> I wonder if we have tested the reasoning behind having
>> SKIP_PAGES_THRESHOLD and the magic number of 32 assigned to it
>> currently. While looking at the code after a long time and doing some
>> tests, I realized that a manual VACUUM would always scan first 31
>> pages of a relation which has not received any write activity since
>> the last VACUUM. On closer inspection, I realized that we have
>> deliberately put in this hook to ensure that we use visibility maps
>> only when we see at least SKIP_PAGES_THRESHOLD worth of all-visible
>> sequential pages to take advantage of possible OS seq scan
>> optimizations.
>>
>> My statistical skills are limited, but wouldn't that mean that for a
>> fairly well distributed write activity across a large table, if there
>> are even 3-4% update/deletes, we would most likely hit a
>> not-all-visible page for every 32 pages scanned ? That would mean that
>
> The page skip is still based on VM.
> So you wonder what are the chances of a VM not up-to-date when we access it ?

re-reading the mails and answers, I misunderstood the case you exposed.

>
>> almost entire relation will be scanned even if the visibility map
>> tells us that only 3-4% pages require scanning ?  And the probability
>> will increase with the increase in the percentage of updated/deleted
>> tuples. Given that the likelihood of anyone calling VACUUM (manually
>> or through autovac settings) on a table which has less than 3-4%
>> updates/deletes is very low, I am worried that might be loosing all
>> advantages of visibility maps for a fairly common use case.
>>
>> Do we have any numbers to prove what we have today is good ? Sorry, I
>> may not have followed the discussions very closely in the past and not
>> sure if this has been debated/tested already.
>>
>> Thanks,
>> Pavan
>>
>>
>>
>>
>>
>> --
>> Pavan Deolasee
>> EnterpriseDB     http://www.enterprisedb.com
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>
>
>
> --
> Cédric Villemain               2ndQuadrant
> http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
>



--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: What is the best and easiest implementation to reliably wait for the completion of startup?
Next
From: Robert Haas
Date:
Subject: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum