Re: autovacuum: change priority of the vacuumed tables - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: autovacuum: change priority of the vacuumed tables |
Date | |
Msg-id | CAD21AoC9fwtneY00pRTBasFbyDPA=gcv-31pZhTqkPHrg9VA6w@mail.gmail.com |
In response to | Re: autovacuum: change priority of the vacuumed tables (Grigory Smolkin <g.smolkin@postgrespro.ru>) |
Responses |
Re: autovacuum: change priority of the vacuumed tables
|
List | pgsql-hackers |
On Thu, Feb 15, 2018 at 10:16 PM, Grigory Smolkin <g.smolkin@postgrespro.ru> wrote:
> On 02/15/2018 09:28 AM, Masahiko Sawada wrote:
>
>> Hi,
>>
>> On Thu, Feb 8, 2018 at 11:01 PM, Ildus Kurbangaliev
>> <i.kurbangaliev@postgrespro.ru> wrote:
>>>
>>> Hi,
>>>
>>> Attached patch adds 'autovacuum_table_priority' to the current list of
>>> automatic vacuuming settings. It's used in sorting of vacuumed tables in
>>> autovacuum worker before actual vacuum.
>>>
>>> The idea is to give possibility to the users to prioritize their tables
>>> in autovacuum process.
>>>
>> Hmm, I couldn't understand the benefit of this patch. Would you
>> elaborate it a little more?
>>
>> Multiple autovacuum worker can work on one database. So even if a
>> table that you want to vacuum first is the back of the list and there
>> other worker would pick up it. If the vacuuming the table gets delayed
>> due to some big tables are in front of that table I think you can deal
>> with it by increasing the number of autovacuum workers.
>>
>> Regards,
>>
>> --
>> Masahiko Sawada
>> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> NTT Open Source Software Center
>>
>
> Database can contain thousands of tables and often updates/deletes
> concentrate mostly in only a handful of tables.
> Going through thousands of less bloated tables can take ages.
> Currently autovacuum know nothing about prioritizing it`s work with respect
> to user`s understanding of his data and application.

Understood. I have a question; please imagine the following case.

Suppose that there are 1000 tables in a database, and one of them
(table-A) has the highest priority while the other 999 tables all have
the same priority. Most of the tables (say 800), including table-A,
need to be vacuumed at some point, so with your patch an AV worker
lists those 800 tables with table-A at the head of the list. Table-A
gets vacuumed first, but this AV worker then has to vacuum the other
799 tables even if table-A requires vacuuming again in the meantime.
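To make the case above concrete, here is a rough sketch of it. This is
not PostgreSQL code and not the patch; the names (Table, priority,
build_worklist) are made up for illustration, and it assumes only that
a worker builds its table list once at startup and sorts it by
priority.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    priority: int  # hypothetical setting: larger value = vacuum earlier


def build_worklist(tables):
    """Snapshot taken once when the worker starts, highest priority first."""
    return sorted(tables, key=lambda t: -t.priority)


def tables_vacuumed_before_revisit(worklist, target):
    """Count the tables the worker still has to process after `target`
    before it can build a fresh list that might include `target` again."""
    names = [t.name for t in worklist]
    return len(names) - names.index(target) - 1


# 800 tables need vacuuming; table_a has the highest priority,
# the other 799 share the same lower priority.
tables = [Table(f"t{i}", 0) for i in range(799)] + [Table("table_a", 10)]
worklist = build_worklist(tables)

print(worklist[0].name)                                     # table_a
print(tables_vacuumed_before_revisit(worklist, "table_a"))  # 799
```

The priority only decides the order within one snapshot of the list;
it does not bring the worker back to table-A any sooner.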
If another AV worker launches while table-A is being vacuumed, the new
AV worker would include table-A in its list but would not process it,
because a concurrent AV worker is already processing it. So it would
vacuum other tables instead. Similarly, this AV worker cannot get a
new table list until it finishes vacuuming all the other tables. (Note
that it might skip some tables if they have already been vacuumed by
another AV worker.) On the other hand, if a new AV worker launches
after table-A has been vacuumed and requires vacuuming again, the new
AV worker puts table-A at the head of its list. It processes table-A
first but, again, it has to vacuum the other tables before getting a
new table list that might include table-A.

Is this the expected behavior? I'd rather expect postgres to vacuum
the highest-priority table before other lower-priority tables whenever
it requires vacuuming, but it wouldn't.

> Also It`s would be great to sort tables according to dead/live tuple ratio
> and relfrozenxid.

Yeah, for anti-wraparound vacuum on the database, it would be a good
idea to sort the list by relfrozenxid, as discussed on another
thread[1].

[1] https://www.postgresql.org/message-id/CA%2BTgmobT3m%3D%2BdU5HF3VGVqiZ2O%2Bv6P5wN1Gj%2BPrq%2Bhj7dAm9AQ%40mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center