Re: maintenance_work_mem used by Vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: maintenance_work_mem used by Vacuum
Date
Msg-id CAD21AoBQDWLRJew6iCztmhmA4bVhtq7jgp1FeKVnX07_JdoT-g@mail.gmail.com
In response to Re: maintenance_work_mem used by Vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: maintenance_work_mem used by Vacuum
List pgsql-hackers
On Thu, Oct 10, 2019 at 6:38 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 10, 2019 at 2:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Oct 10, 2019 at 3:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Oct 10, 2019 at 9:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Oct 9, 2019 at 7:12 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > > >
> > > > > I think the current situation is not good but if we try to cap it to
> > > > > maintenance_work_mem + gin_*_work_mem then also I don't think it will
> > > > > make the situation much better.  However, I think the idea you
> > > > > proposed up-thread[1] is better.  At least the  maintenance_work_mem
> > > > > will be the top limit what the auto vacuum worker can use.
> > > > >
> > > >
> > > > I'm concerned that there are other index AMs that could consume more
> > > > memory like GIN. In principle we can vacuum third party index AMs and
> > > > will be able to even parallel vacuum them. I  expect that
> > > > maintenance_work_mem is the top limit of memory usage of maintenance
> > > > command but actually it's hard to set the limit to memory usage of
> > > > bulkdelete and cleanup by the core. So I thought that since GIN is the
> > > > one of the index AM it can have a new parameter to make its job
> > > > faster. If we have that parameter it might not make the current
> > > > situation much better but user will be able to set a lower value to
> > > > that parameter to not use the memory much while keeping the number of
> > > > index vacuums.
> > > >
> > >
> > > I can understand your concern why dividing maintenance_work_mem for
> > > vacuuming heap and cleaning up the index might be tricky especially
> > > because of third party indexes, but introducing new Guc isn't free
> > > either.  I think that should be the last resort and we need buy-in
> > > from more people for that.  Did you consider using work_mem for this?
> >
> > Yeah, that seems to work too. But I wonder if it could be a similar
> > story to gin_pending_list_limit. I mean that previously we used
> > work_mem as the maximum size of the GIN pending list. But we concluded
> > that it was not appropriate to control both by one GUC, so we
> > introduced gin_pending_list_limit and the storage parameter at commit
> > 263865a4
> >
>
> It seems you want to say about commit id
> a1b395b6a26ae80cde17fdfd2def8d351872f399.

Yeah thanks.

>  I wonder why they have not
> changed it to gin_pending_list_limit (at that time
> pending_list_cleanup_size) in that commit itself?  I think if we want
> to use gin_pending_list_limit in this context then we can replace both
> work_mem and maintenance_work_mem with gin_pending_list_limit.

Hmm, as far as I can see from that discussion, no one mentioned
maintenance_work_mem. Perhaps it was simply overlooked? I didn't know
about it either.

I also think we can replace at least the work_mem used for cleanup of
the pending list with gin_pending_list_limit. Consider the following
comment in ginfast.c,

/*
 * Force pending list cleanup when it becomes too long. And,
 * ginInsertCleanup could take significant amount of time, so we prefer to
 * call it when it can do all the work in a single collection cycle. In
 * non-vacuum mode, it shouldn't require maintenance_work_mem, so fire it
 * while pending list is still small enough to fit into
 * gin_pending_list_limit.
 *
 * ginInsertCleanup() should not be called inside our CRIT_SECTION.
 */
cleanupSize = GinGetPendingListCleanupSize(index);
if (metadata->nPendingPages * GIN_PAGE_FREESIZE > cleanupSize * 1024L)
    needCleanup = true;

ISTM the gin_pending_list_limit in the above comment corresponds to
the following code in ginfast.c,

/*
 * We are called from regular insert and if we see concurrent cleanup
 * just exit in hope that concurrent process will clean up pending
 * list.
 */
if (!ConditionalLockPage(index, GIN_METAPAGE_BLKNO, ExclusiveLock))
    return;
workMemory = work_mem;

If work_mem is smaller than gin_pending_list_limit, the pending list
cleanup would go against the intention of the above comment, which
prefers to do all the work in a single collection cycle while the
pending list is still small enough to fit into gin_pending_list_limit.
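
To make the mismatch concrete, here is a rough standalone sketch in C. It
is not PostgreSQL code: the function names and SKETCH_PAGE_FREESIZE are
hypothetical stand-ins, and the cycle count is a simplified model that
assumes the accumulated entries need about as much memory as the pending
data itself, which is only an approximation of what ginInsertCleanup()
does.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for GIN_PAGE_FREESIZE. The real value is derived
 * from BLCKSZ and the page header sizes; 8000 is only a placeholder.
 */
#define SKETCH_PAGE_FREESIZE 8000L

/*
 * Mirrors the trigger test quoted above: cleanup is requested once the
 * pending list, measured in bytes, exceeds the limit given in kB.
 */
bool
sketch_needs_cleanup(long n_pending_pages, long pending_list_limit_kb)
{
    return n_pending_pages * SKETCH_PAGE_FREESIZE >
           pending_list_limit_kb * 1024L;
}

/*
 * Simplified model (an assumption, not the real algorithm): the number of
 * collection cycles is the pending data size divided by the memory budget,
 * rounded up. Both arguments are in kB.
 */
long
sketch_collection_cycles(long pending_kb, long budget_kb)
{
    assert(pending_kb >= 0);
    assert(budget_kb > 0);
    return (pending_kb + budget_kb - 1) / budget_kb;
}
```

Under this model, a pending list of 4096 kB cleaned with a 4096 kB budget
takes one cycle, while the same list cleaned with a 1024 kB budget takes
four. That is the case where work_mem is set lower than
gin_pending_list_limit.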

Regards,

--
Masahiko Sawada


