Re: Automatic aggressive vacuum on almost frozen table takes too long - Mailing list pgsql-general

From Mikhail Balayan
Subject Re: Automatic aggressive vacuum on almost frozen table takes too long
Date
Msg-id CAC2oM1ZjGKEwoOU5KQW9YZRJ8SuZtFZ1+cFbcyFVC9=Bgpx6XA@mail.gmail.com
Whole thread Raw
In response to Re: Automatic aggressive vacuum on almost frozen table takes too long  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Automatic aggressive vacuum on almost frozen table takes too long  (Mikhail Balayan <mv.balayan@gmail.com>)
List pgsql-general
>> Can you run amcheck's bt_index_check() routine against some of the
>> indexes you've shown? There is perhaps some chance that index
>> corruption exists and causes VACUUM to take a very long time to delete
>> index pages. This is pretty much a wild guess, though.

Unfortunately I can't, we haven't enabled this extension yet. And since this is a production, I'm not ready to turn it on right away. But I can say that this theory is unlikely, since this problem occurs on different sites. Here's an example of an output from another site where table size is 46.5 GB (again, data only), indexes 107GB, toast 62MB:
automatic aggressive vacuum of table "appdbname2.appschemaname.applications": index scans: 1
pages: 0 removed, 6091646 remain, 0 skipped due to pins, 6086395 skipped frozen
tuples: 2344 removed, 35295654 remain, 0 are dead but not yet removable, oldest xmin: 213412878
buffer usage: 251980554 hits, 14462331 misses, 18844 dirtied
avg read rate: 12.018 MB/s, avg write rate: 0.016 MB/s
system usage: CPU: user: 7734.43 s, system: 178.98 s, elapsed: 9401.36 s

Here again we see that there are 5251 blocks that need to be cleaned (6091646 - 6086395), buffer usage is 266461729 blocks or ~ 2 TB and processing time is 2.5h+.

>> It's possible that VACUUM had to wait a long time for a cleanup lock
on one individual heap page here
If such a scenario is possible, it makes sense to add information about the blocking waiting time to the output. Something like:
     system usage: CPU: user: 7734.43 s, system: 178.98 s, lock_wait: 1234.56 s, elapsed: 9401.36 s


Mikhail

On Sat, 18 Feb 2023 at 05:35, Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Feb 16, 2023 at 5:40 PM Mikhail Balayan <mv.balayan@gmail.com> wrote:
>> >> Do you have any non-btree indexes on the table? Can you show us the details of the
>> >> table, including all of its indexes? In other words, can you show "\d applications" output from psql?
>
> Only btree indexes. Please find the full table schema below:

It's possible that VACUUM had to wait a long time for a cleanup lock
on one individual heap page here, which could have added a long delay.
But...that doesn't seem particularly likely.

Can you run amcheck's bt_index_check() routine against some of the
indexes you've shown? There is perhaps some chance that index
corruption exists and causes VACUUM to take a very long time to delete
index pages. This is pretty much a wild guess, though.

--
Peter Geoghegan

pgsql-general by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: A performance issue in ROW_NUMBER() OVER(ORDER BY NULL) [27 times slow than OVER()] V14.5
Next
From: David Rowley
Date:
Subject: Re: A performance issue in ROW_NUMBER() OVER(ORDER BY NULL) [27 times slow than OVER()] V14.5