Re: autovacuum: change priority of the vacuumed tables - Mailing list pgsql-hackers
From | Ildus Kurbangaliev |
---|---|
Subject | Re: autovacuum: change priority of the vacuumed tables |
Msg-id | 20180219173855.05bd313c@wp.localdomain |
In response to | Re: autovacuum: change priority of the vacuumed tables (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses | Re: autovacuum: change priority of the vacuumed tables |
List | pgsql-hackers |
On Fri, 16 Feb 2018 21:48:14 +0900
Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> On Fri, Feb 16, 2018 at 7:50 PM, Ildus Kurbangaliev
> <i.kurbangaliev@postgrespro.ru> wrote:
> > On Fri, 16 Feb 2018 17:42:34 +0900
> > Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> >> On Thu, Feb 15, 2018 at 10:16 PM, Grigory Smolkin
> >> <g.smolkin@postgrespro.ru> wrote:
> >> > On 02/15/2018 09:28 AM, Masahiko Sawada wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> On Thu, Feb 8, 2018 at 11:01 PM, Ildus Kurbangaliev
> >> >> <i.kurbangaliev@postgrespro.ru> wrote:
> >> >>>
> >> >>> Hi,
> >> >>>
> >> >>> Attached patch adds 'autovacuum_table_priority' to the current
> >> >>> list of automatic vacuuming settings. It's used in sorting of
> >> >>> vacuumed tables in autovacuum worker before actual vacuum.
> >> >>>
> >> >>> The idea is to give possibility to the users to prioritize
> >> >>> their tables in autovacuum process.
> >> >>>
> >> >> Hmm, I couldn't understand the benefit of this patch. Would you
> >> >> elaborate it a little more?
> >> >>
> >> >> Multiple autovacuum worker can work on one database. So even if
> >> >> a table that you want to vacuum first is the back of the list
> >> >> and there other worker would pick up it. If the vacuuming the
> >> >> table gets delayed due to some big tables are in front of that
> >> >> table I think you can deal with it by increasing the number of
> >> >> autovacuum workers.
> >> >>
> >> >> Regards,
> >> >>
> >> >> --
> >> >> Masahiko Sawada
> >> >> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> >> >> NTT Open Source Software Center
> >> >>
> >> >
> >> > Database can contain thousands of tables and often
> >> > updates/deletes concentrate mostly in only a handful of tables.
> >> > Going through thousands of less bloated tables can take ages.
> >> > Currently autovacuum know nothing about prioritizing it`s work
> >> > with respect to user`s understanding of his data and
> >> > application.
> >>
> >> Understood. I have a question; please imagine the following case.
> >>
> >> Suppose that there are 1000 tables in a database, and one table of
> >> them (table-A) has the highest priority while other 999 tables have
> >> same priority. Almost tables (say 800 tables) including table-A
> >> need to get vacuumed at some point, so with your patch an AV
> >> worker listed 800 tables and table-A will be at the head of the
> >> list. Table-A will get vacuumed first but this AV worker has to
> >> vacuum other 799 tables even if table-A requires vacuum later
> >> again.
> >>
> >> If an another AV worker launches during table-A being vacuumed, the
> >> new AV worker would include table-A but would not process it
> >> because concurrent AV worker is processing it. So it would vacuum
> >> other tables instead. Similarly, this AV worker can not get the
> >> new table list until finish to vacuum all other tables. (Note that
> >> it might skip some tables if they are already vacuumed by other AV
> >> worker.) On the other hand, if another new AV worker launches
> >> after table-A got vacuumed and requires vacuuming again, the new
> >> AV worker puts the table-A at the head of list. It processes
> >> table-A first but, again, it has to vacuum other tables before
> >> getting new table list next time that might include table-A.
> >>
> >> Is this the expected behavior? I'd rather expect postgres to
> >> vacuum it before other lower priority tables whenever the table
> >> having the highest priority requires vacuuming, but it wouldn't.
> >
> > Yes, this is the expected behavior. The patch is the way to give the
> > user at least some control of the sorting, later it could be
> > extended with something more sophisticated.
> >
>
> Since user doesn't know that each AV worker processes tables based on
> its table list that is different from lists that other worker has, I
> think it's hard for user to understand this parameter. I'd say that
> user would expect that high priority table can get vacuumed any time.

Yes, very good point. It could be strange for the user in cases like
that.

>
> I think what you want to solve is to vacuum some tables preferentially
> if there are many tables requiring vacuuming. Right? If so, I think
> the prioritizing table only in the list would not solve the
> fundamental issue. In the example, table-A will still need to wait for
> other 799 tables to get vacuumed. Table-A will be bloating during
> vacuuming other tables. To deal with it, I think we need something
> queue on the shmem per database in order to control the order of
> tables waiting for vacuuming and need to use it with a smart
> algorithm. Thoughts?

Agree, it would require some shared queue for the autovacuum workers if
we want to prioritize the table across all of them. I will look into
this, and maybe will come up with something.

Masahiko, are you working on this too or just interested with the idea?

--
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
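To make the difference discussed above concrete, here is a toy Python model (not PostgreSQL code, and not taken from the patch). It compares a single worker that sorts its own table list once per pass with a single shared priority queue. The function names `per_worker_list` and `shared_queue`, the table names, and the assumption that table A needs vacuuming again immediately after its first vacuum are all hypothetical and chosen only for illustration.

```python
import heapq


def per_worker_list(priority, redirty_once):
    """Model of the per-worker list: each pass snapshots the tables that
    need vacuuming, sorts them by priority, and processes the whole list."""
    needs_vacuum = set(priority)
    redirty = set(redirty_once)
    order = []
    while needs_vacuum:
        # Snapshot taken once per pass; higher priority first, name breaks ties.
        snapshot = sorted(needs_vacuum, key=lambda t: (-priority[t], t))
        needs_vacuum.clear()
        for table in snapshot:
            order.append(table)
            if table in redirty:
                redirty.remove(table)
                # The table needs vacuuming again, but only the next pass sees it.
                needs_vacuum.add(table)
    return order


def shared_queue(priority, redirty_once):
    """Model of a shared queue: the next table is always the highest-priority
    table that currently needs vacuuming."""
    redirty = set(redirty_once)
    heap = [(-p, t) for t, p in priority.items()]
    heapq.heapify(heap)
    order = []
    while heap:
        _, table = heapq.heappop(heap)
        order.append(table)
        if table in redirty:
            redirty.remove(table)
            # Re-enqueued immediately, so it competes with the remaining tables.
            heapq.heappush(heap, (-priority[table], table))
    return order


# Hypothetical data: A is the high-priority table, the others share priority 0.
PRIORITY = {"A": 10, "t1": 0, "t2": 0, "t3": 0}

print(per_worker_list(PRIORITY, {"A"}))  # A waits behind t1, t2, t3
print(shared_queue(PRIORITY, {"A"}))     # A is vacuumed again right away
```

In this model the per-worker list revisits A only after every other listed table has been processed, which is the behavior Masahiko describes, while the shared queue picks A up again as soon as it requires vacuuming. A real implementation would also have to handle concurrent workers, locking, and shared memory sizing, none of which is modeled here.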