Thread: Wasted Vacuum cycles when OldestXmin is not moving
Hi Hackers,
vacuum is not able to clean up dead tuples when OldestXmin is not moving (because of a long running transaction or when hot_standby_feedback is behind). Even though OldestXmin is not moved from the last time it checked, it keeps retrying every
autovacuum_naptime
and wastes CPU cycles and IOs when pages are not in memory. Can we not bypass the dead tuple collection and cleanup step until OldestXmin is advanced? Below log shows the vacuum running every 1 minute.2023-01-09 08:13:01.364 UTC [727219] LOG: automatic vacuum of table "postgres.public.t1": index scans: 0
pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
removable cutoff: 852, which was 2 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 13939 hits, 0 misses, 0 dirtied
WAL usage: 0 records, 0 full page images, 0 bytes
system usage: CPU: user: 0.15 s, system: 0.00 s, elapsed: 0.29 s
2023-01-09 08:14:01.363 UTC [727289] LOG: automatic vacuum of table "postgres.public.t1": index scans: 0
pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
removable cutoff: 852, which was 2 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 13939 hits, 0 misses, 0 dirtied
WAL usage: 0 records, 0 full page images, 0 bytes
system usage: CPU: user: 0.14 s, system: 0.00 s, elapsed: 0.29 s
pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
removable cutoff: 852, which was 2 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 13939 hits, 0 misses, 0 dirtied
WAL usage: 0 records, 0 full page images, 0 bytes
system usage: CPU: user: 0.15 s, system: 0.00 s, elapsed: 0.29 s
2023-01-09 08:14:01.363 UTC [727289] LOG: automatic vacuum of table "postgres.public.t1": index scans: 0
pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total)
tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable
removable cutoff: 852, which was 2 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
buffer usage: 13939 hits, 0 misses, 0 dirtied
WAL usage: 0 records, 0 full page images, 0 bytes
system usage: CPU: user: 0.14 s, system: 0.00 s, elapsed: 0.29 s
Thanks,
Sirisha
On Wed, Jan 11, 2023 at 3:16 AM sirisha chamarthi <sirichamarthi22@gmail.com> wrote: > > Hi Hackers, > > vacuum is not able to clean up dead tuples when OldestXmin is not moving (because of a long running transaction or whenhot_standby_feedback is behind). Even though OldestXmin is not moved from the last time it checked, it keeps retryingevery autovacuum_naptime and wastes CPU cycles and IOs when pages are not in memory. Can we not bypass the dead tuplecollection and cleanup step until OldestXmin is advanced? Below log shows the vacuum running every 1 minute. > > 2023-01-09 08:13:01.364 UTC [727219] LOG: automatic vacuum of table "postgres.public.t1": index scans: 0 > pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total) > tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable > removable cutoff: 852, which was 2 XIDs old when operation ended > frozen: 0 pages from table (0.00% of total) had 0 tuples frozen > index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed > avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s > buffer usage: 13939 hits, 0 misses, 0 dirtied > WAL usage: 0 records, 0 full page images, 0 bytes > system usage: CPU: user: 0.15 s, system: 0.00 s, elapsed: 0.29 s > 2023-01-09 08:14:01.363 UTC [727289] LOG: automatic vacuum of table "postgres.public.t1": index scans: 0 > pages: 0 removed, 6960 remain, 6960 scanned (100.00% of total) > tuples: 0 removed, 1572864 remain, 786432 are dead but not yet removable > removable cutoff: 852, which was 2 XIDs old when operation ended > frozen: 0 pages from table (0.00% of total) had 0 tuples frozen > index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed > avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s > buffer usage: 13939 hits, 0 misses, 0 dirtied > WAL usage: 0 records, 0 full page images, 0 bytes > system usage: CPU: user: 0.14 s, system: 0.00 s, elapsed: 0.29 s Can you provide a patch and test case, if possible, a TAP test with and without patch? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com