Re: maintenance_work_mem used by Vacuum - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: maintenance_work_mem used by Vacuum |
Date | |
Msg-id | CAA4eK1K7ZzKJ5TitbSoo4BvMSdM_koJH=ac2JpYBF2vepmG01A@mail.gmail.com Whole thread Raw |
In response to | Re: maintenance_work_mem used by Vacuum (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: maintenance_work_mem used by Vacuum
Re: maintenance_work_mem used by Vacuum |
List | pgsql-hackers |
On Wed, Oct 9, 2019 at 2:00 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Oct 9, 2019 at 10:22 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Tue, Oct 8, 2019 at 2:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Tue, Oct 8, 2019 at 1:48 AM Peter Geoghegan <pg@bowt.ie> wrote: > > > > > > > > ISTM that the use of maintenance_work_mem wasn't given that much > > > > thought originally. > > > > > > > > > > One idea to something better could be to check, if there is a GIN > > > index on a table, then use 1/4 (25% or whatever) of > > > maintenance_work_mem for GIN indexes and 3/4 (75%) of > > > maintenance_work_mem for collection dead tuples. > > > > > > > I felt that it would not be easy for users to tune > > maintenance_work_mem which controls more than one things. If this is > > an index AM(GIN) specific issue we might rather want to control the > > memory limit of pending list cleanup by a separate GUC parameter like > > gin_pending_list_limit, say gin_pending_list_work_mem. Sure, by having another work_mem parameter for the Gin indexes which controls when we need to flush the pending list will make life easier as a programmer. I think if we have a specific parameter for this purpose, then we can even think of using the same for a clean up during insert operation as well. However, I am not sure how easy it would be for users? Basically, now they need to remember another parameter and for which there is no easy way to know what should be the value. I think one has to check gin_metapage_info->n_pending_pages and then based on that they can configure the value for this parameter to get the maximum benefit possible. Can we think of using work_mem for this? Basically, we use work_mem during insert operation, so why not use it during vacuum operation for this purpose? Another idea could be to try to divide the maintenance_work_mem smartly if we know the value of pending_pages for each Gin index, but I think for that we need to either read the metapage of maybe use some sort of stats which can be used by vacuum. We need to somehow divide it based on the amount of memory required for a number of dead tuples in heap and memory required by tuples in the pending list. I am not sure how feasible is this approach. About difficulty for users tuning one or two parameters for vacuum, I think if they can compute what could be the values for Guc's separately, then why can't they add up and set it as one value. Having said that, I am not denying that having a separate parameter gives better control, and for this specific case using separate parameter can allow us to use it both during vacuum and insert operations. > > And we can > > either set the (the memory for GIN pending list cleanup / # of GIN > > indexes) to the parallel workers. > > > IMHO if we do that then we will loose the meaning of having > maintenance_work_mem right? Then user can not control that how much > memory the autovacuum worker will use. > I am not sure how different it is from the current situation? Basically, now it can use up to 2 * maintenance_work_mem memory and if we do what Sawada-San is proposing, then it will be maintenance_work_mem + gin_*_work_mem. Do you have some other alternative idea in mind or you think the current situation is better than anything else we can do in this area? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: