Thread: Wasted Vacuum cycles when OldestXmin is not moving

Wasted Vacuum cycles when OldestXmin is not moving

From

sirisha chamarthi

Date:

11 January 2023, 00:46:19

Hi Hackers,

vacuum is not able to clean up dead tuples when OldestXmin is not moving (because of a long running transaction or when hot_standby_feedback is behind). Even though OldestXmin is not moved from the last time it checked, it keeps retrying every autovacuum_naptime and wastes CPU cycles and IOs when pages are not in memory. Can we not bypass the dead tuple collection and cleanup step until OldestXmin is advanced? Below log shows the vacuum running every 1 minute.

2023-01-09 08:13:01.364 UTC [727219] LOG: automatic vacuum of table "postgres.public.t1": index scans: 0
pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
removable cutoff: 852, which was 2 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 13939 hits, 0 misses, 0 dirtied
WAL usage: 0 records, 0 full page images, 0 bytes
system usage: CPU: user: 0.15 s, system: 0.00 s, elapsed: 0.29 s
2023-01-09 08:14:01.363 UTC [727289] LOG: automatic vacuum of table "postgres.public.t1": index scans: 0
pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
removable cutoff: 852, which was 2 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 13939 hits, 0 misses, 0 dirtied
WAL usage: 0 records, 0 full page images, 0 bytes
system usage: CPU: user: 0.14 s, system: 0.00 s, elapsed: 0.29 s

Thanks,

Sirisha

Re: Wasted Vacuum cycles when OldestXmin is not moving

From

Bharath Rupireddy

Date:

23 January 2023, 09:21:34

On Wed, Jan 11, 2023 at 3:16 AM sirisha chamarthi
<sirichamarthi22@gmail.com> wrote:
>
> Hi Hackers,
>
> vacuum is not able to clean up dead tuples when OldestXmin is not moving (because of a long running transaction or
whenhot_standby_feedback is behind). Even though OldestXmin is not moved from the last time it checked, it keeps
retryingevery autovacuum_naptime and wastes CPU cycles and IOs when pages are not in memory. Can we not bypass the dead
tuplecollection and cleanup step until OldestXmin is advanced? Below log shows the vacuum running every 1 minute. 
>
> 2023-01-09 08:13:01.364 UTC [727219] LOG:  automatic vacuum of table "postgres.public.t1": index scans: 0
>         pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
>         tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
>         removable cutoff: 852, which was 2 XIDs old when operation ended
>         frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
>         index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
>         avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
>         buffer usage: 13939 hits, 0 misses, 0 dirtied
>         WAL usage: 0 records, 0 full page images, 0 bytes
>         system usage: CPU: user: 0.15 s, system: 0.00 s, elapsed: 0.29 s
> 2023-01-09 08:14:01.363 UTC [727289] LOG:  automatic vacuum of table "postgres.public.t1": index scans: 0
>         pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
>         tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
>         removable cutoff: 852, which was 2 XIDs old when operation ended
>         frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
>         index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
>         avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
>         buffer usage: 13939 hits, 0 misses, 0 dirtied
>         WAL usage: 0 records, 0 full page images, 0 bytes
>         system usage: CPU: user: 0.14 s, system: 0.00 s, elapsed: 0.29 s

Can you provide a patch and test case, if possible, a TAP test with
and without patch?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com