Re: autovacuum: change priority of the vacuumed tables - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: autovacuum: change priority of the vacuumed tables |
Date | |
Msg-id | CAD21AoCHQhsga9GkJWqMqcsmG4eKbcKSncqronWRNxXATphFaQ@mail.gmail.com |
In response to | Re: autovacuum: change priority of the vacuumed tables (Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>) |
Responses | Re: autovacuum: change priority of the vacuumed tables (Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>) |
List | pgsql-hackers |
On Fri, Feb 16, 2018 at 7:50 PM, Ildus Kurbangaliev
<i.kurbangaliev@postgrespro.ru> wrote:
> On Fri, 16 Feb 2018 17:42:34 +0900
> Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
>> On Thu, Feb 15, 2018 at 10:16 PM, Grigory Smolkin
>> <g.smolkin@postgrespro.ru> wrote:
>> > On 02/15/2018 09:28 AM, Masahiko Sawada wrote:
>> >
>> >> Hi,
>> >>
>> >> On Thu, Feb 8, 2018 at 11:01 PM, Ildus Kurbangaliev
>> >> <i.kurbangaliev@postgrespro.ru> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> Attached patch adds 'autovacuum_table_priority' to the current
>> >>> list of automatic vacuuming settings. It's used in sorting of
>> >>> vacuumed tables in autovacuum worker before actual vacuum.
>> >>>
>> >>> The idea is to give possibility to the users to prioritize their
>> >>> tables in autovacuum process.
>> >>>
>> >> Hmm, I couldn't understand the benefit of this patch. Would you
>> >> elaborate on it a little more?
>> >>
>> >> Multiple autovacuum workers can work on one database. So even if a
>> >> table that you want to vacuum first is at the back of the list,
>> >> another worker would pick it up. If vacuuming the table gets
>> >> delayed because some big tables are in front of it, I think you
>> >> can deal with it by increasing the number of autovacuum workers.
>> >>
>> >> Regards,
>> >>
>> >> --
>> >> Masahiko Sawada
>> >> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> >> NTT Open Source Software Center
>> >>
>> >
>> > Database can contain thousands of tables and often updates/deletes
>> > concentrate mostly in only a handful of tables.
>> > Going through thousands of less bloated tables can take ages.
>> > Currently autovacuum knows nothing about prioritizing its work with
>> > respect to user's understanding of his data and application.
>>
>> Understood. I have a question; please imagine the following case.
>>
>> Suppose that there are 1000 tables in a database, and one of them
>> (table-A) has the highest priority while the other 999 tables have
>> the same priority. Almost all tables (say 800 tables), including
>> table-A, need to get vacuumed at some point, so with your patch an
>> AV worker lists 800 tables and table-A will be at the head of the
>> list. Table-A will get vacuumed first, but this AV worker has to
>> vacuum the other 799 tables even if table-A requires vacuuming
>> again later.
>>
>> If another AV worker launches while table-A is being vacuumed, the
>> new AV worker would include table-A but would not process it because
>> a concurrent AV worker is processing it. So it would vacuum other
>> tables instead. Similarly, this AV worker cannot get a new table
>> list until it finishes vacuuming all other tables. (Note that it
>> might skip some tables if they are already vacuumed by another AV
>> worker.) On the other hand, if another new AV worker launches after
>> table-A got vacuumed and requires vacuuming again, the new AV worker
>> puts table-A at the head of its list. It processes table-A first
>> but, again, it has to vacuum the other tables before getting a new
>> table list next time that might include table-A.
>>
>> Is this the expected behavior? I'd rather expect postgres to vacuum it
>> before other lower priority tables whenever the table having the
>> highest priority requires vacuuming, but it wouldn't.
>
> Yes, this is the expected behavior. The patch is the way to give the
> user at least some control of the sorting, later it could be extended
> with something more sophisticated.
>

Since users don't know that each AV worker processes tables from its
own table list, which differs from the lists other workers hold, I
think this parameter would be hard for them to understand. I'd say
users would expect a high-priority table to get vacuumed at any time.

I think what you want to solve is vacuuming some tables preferentially
when many tables require vacuuming. Right? If so, I think prioritizing
tables only within the list would not solve the fundamental issue. In
the example, table-A will still need to wait for the other 799 tables
to get vacuumed, and it will keep bloating while they are vacuumed. To
deal with that, I think we need something like a per-database queue in
shared memory to control the order of tables waiting for vacuuming,
and we need to use it with a smart algorithm.

Thoughts?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
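
To make the table-A example concrete, here is a toy simulation in Python. It is only a sketch, not PostgreSQL code: the one-tick vacuum cost, the `REBLOAT_AFTER` interval, the priority values, and both function names are assumptions made up for illustration. It compares a single worker that builds a sorted private list once with a single worker that pops from a shared priority queue before each vacuum.

```python
import heapq

N_TABLES = 800       # tables needing vacuum, as in the example above
REBLOAT_AFTER = 10   # hypothetical: table-A needs vacuuming again 10 ticks later
HIGH, LOW = 1, 0     # hypothetical priority values


def simulate_private_list():
    """Worker sorts its own list once and must finish it before rebuilding.

    Returns how many ticks table-A waits after it needs vacuuming again.
    """
    # table-A is sorted to the head; the other 799 tables follow.
    todo = ["table-A"] + [f"t{i}" for i in range(1, N_TABLES)]
    clock = 0
    finished_a = None
    for table in todo:
        clock += 1  # assume every vacuum takes exactly one tick
        if table == "table-A":
            finished_a = clock
    due = finished_a + REBLOAT_AFTER
    # The list is only rebuilt at `clock`, so table-A waits until then.
    return clock - due


def simulate_shared_queue():
    """Worker pops the highest-priority entry from a shared queue each time.

    Returns the longest wait table-A experiences after becoming due again.
    """
    queue = [(-HIGH, 0, "table-A")]
    queue += [(-LOW, i, f"t{i}") for i in range(1, N_TABLES)]
    heapq.heapify(queue)
    clock, seq = 0, N_TABLES
    due = None        # tick at which table-A needs vacuuming again
    queued_at = None  # tick at which table-A was put back on the queue
    worst_wait = 0
    while queue:
        if due is not None and clock >= due:
            heapq.heappush(queue, (-HIGH, seq, "table-A"))
            seq += 1
            queued_at, due = due, None
        start = clock
        _, _, table = heapq.heappop(queue)
        clock += 1
        if table == "table-A":
            if queued_at is not None:
                worst_wait = max(worst_wait, start - queued_at)
                queued_at = None
            due = clock + REBLOAT_AFTER
    return worst_wait


if __name__ == "__main__":
    print("private list wait:", simulate_private_list())
    print("shared queue wait:", simulate_shared_queue())
```

Under these assumptions the private-list worker leaves table-A waiting for nearly the whole pass over the other tables, while the shared-queue worker picks it up as soon as it becomes due. A real design would also have to handle multiple workers, locking, and tables whose vacuums take very different amounts of time.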