Thread: Wasted Vacuum cycles when OldestXmin is not moving

Wasted Vacuum cycles when OldestXmin is not moving

From
sirisha chamarthi
Date:
Hi Hackers,

vacuum is not able to clean up dead tuples when OldestXmin is not moving (because of a long running transaction or when hot_standby_feedback is behind). Even though OldestXmin is not moved from the last time it checked, it keeps retrying every autovacuum_naptime and wastes CPU cycles and IOs when pages are not in memory. Can we not bypass the dead tuple collection and cleanup step until OldestXmin is advanced? Below log shows the vacuum running every 1 minute.

2023-01-09 08:13:01.364 UTC [727219] LOG:  automatic vacuum of table "postgres.public.t1": index scans: 0
        pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
        tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
        removable cutoff: 852, which was 2 XIDs old when operation ended
        frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
        index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
        avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
        buffer usage: 13939 hits, 0 misses, 0 dirtied
        WAL usage: 0 records, 0 full page images, 0 bytes
        system usage: CPU: user: 0.15 s, system: 0.00 s, elapsed: 0.29 s
2023-01-09 08:14:01.363 UTC [727289] LOG:  automatic vacuum of table "postgres.public.t1": index scans: 0
        pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
        tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
        removable cutoff: 852, which was 2 XIDs old when operation ended
        frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
        index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
        avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
        buffer usage: 13939 hits, 0 misses, 0 dirtied
        WAL usage: 0 records, 0 full page images, 0 bytes
        system usage: CPU: user: 0.14 s, system: 0.00 s, elapsed: 0.29 s

Thanks,
Sirisha

Re: Wasted Vacuum cycles when OldestXmin is not moving

From
Bharath Rupireddy
Date:
On Wed, Jan 11, 2023 at 3:16 AM sirisha chamarthi
<sirichamarthi22@gmail.com> wrote:
>
> Hi Hackers,
>
> vacuum is not able to clean up dead tuples when OldestXmin is not moving (because of a long running transaction or
whenhot_standby_feedback is behind). Even though OldestXmin is not moved from the last time it checked, it keeps
retryingevery autovacuum_naptime and wastes CPU cycles and IOs when pages are not in memory. Can we not bypass the dead
tuplecollection and cleanup step until OldestXmin is advanced? Below log shows the vacuum running every 1 minute. 
>
> 2023-01-09 08:13:01.364 UTC [727219] LOG:  automatic vacuum of table "postgres.public.t1": index scans: 0
>         pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
>         tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
>         removable cutoff: 852, which was 2 XIDs old when operation ended
>         frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
>         index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
>         avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
>         buffer usage: 13939 hits, 0 misses, 0 dirtied
>         WAL usage: 0 records, 0 full page images, 0 bytes
>         system usage: CPU: user: 0.15 s, system: 0.00 s, elapsed: 0.29 s
> 2023-01-09 08:14:01.363 UTC [727289] LOG:  automatic vacuum of table "postgres.public.t1": index scans: 0
>         pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
>         tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
>         removable cutoff: 852, which was 2 XIDs old when operation ended
>         frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
>         index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
>         avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
>         buffer usage: 13939 hits, 0 misses, 0 dirtied
>         WAL usage: 0 records, 0 full page images, 0 bytes
>         system usage: CPU: user: 0.14 s, system: 0.00 s, elapsed: 0.29 s

Can you provide a patch and test case, if possible, a TAP test with
and without patch?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com