Re: pg_stat_statements issue with parallel maintenance (Was Re: WALusage calculation patch) - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: pg_stat_statements issue with parallel maintenance (Was Re: WALusage calculation patch)
Date
Msg-id CAFiTN-t_mOzS9XLdu7R_3vFb7B+Kw8rzNDQ=cub7f6TTtdPffQ@mail.gmail.com
Whole thread Raw
In response to Re: pg_stat_statements issue with parallel maintenance (Was Re: WALusage calculation patch)  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: pg_stat_statements issue with parallel maintenance (Was Re: WALusage calculation patch)  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, Mar 31, 2020 at 12:20 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Mar 31, 2020 at 10:44 AM Masahiko Sawada
> <masahiko.sawada@2ndquadrant.com> wrote:
> >
> > On Tue, 31 Mar 2020 at 12:58, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Mar 30, 2020 at 12:31 PM Masahiko Sawada
> > > <masahiko.sawada@2ndquadrant.com> wrote:
> > > >
> > > > The patch for vacuum conflicts with recent changes in vacuum. So I've
> > > > attached rebased one.
> > > >
> > >
> > > + /*
> > > + * Next, accumulate buffer usage.  (This must wait for the workers to
> > > + * finish, or we might get incomplete data.)
> > > + */
> > > + for (i = 0; i < nworkers; i++)
> > > + InstrAccumParallelQuery(&lps->buffer_usage[i]);
> > > +
> > >
> > > This should be done for launched workers aka
> > > lps->pcxt->nworkers_launched.  I think a similar problem exists in
> > > create index related patch.
> >
> > You're right. Fixed in the new patches.
> >
> > On Mon, 30 Mar 2020 at 17:00, Julien Rouhaud <rjuju123@gmail.com> wrote:
> > >
> > > Just minor nitpicking:
> > >
> > > +   int         i;
> > >
> > >     Assert(!IsParallelWorker());
> > >     Assert(ParallelVacuumIsActive(lps));
> > > @@ -2166,6 +2172,13 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
> > >     /* Wait for all vacuum workers to finish */
> > >     WaitForParallelWorkersToFinish(lps->pcxt);
> > >
> > > +   /*
> > > +    * Next, accumulate buffer usage.  (This must wait for the workers to
> > > +    * finish, or we might get incomplete data.)
> > > +    */
> > > +   for (i = 0; i < nworkers; i++)
> > > +       InstrAccumParallelQuery(&lps->buffer_usage[i]);
> > >
> > > We now allow declaring a variable in those loops, so it may be better to avoid
> > > declaring i outside the for scope?
> >
> > We can do that but I was not sure if it's good since other codes
> > around there don't use that. So I'd like to leave it for committers.
> > It's a trivial change.
>
> I have reviewed the patch and the patch looks fine to me.
>
> One minor comment
> /+ /* Points to buffer usage are in DSM */
> + BufferUsage *buffer_usage;
> +
> /buffer usage are in DSM / buffer usage area in DSM
>

While testing I have found one issue.  Basically, during a parallel
vacuum, it was showing more number of
shared_blk_hits+shared_blks_read.  After, some investigation, I found
that during the cleanup phase nworkers are -1, and because of this we
didn't try to launch worker but "lps->pcxt->nworkers_launched" had the
old launched worker count and shared memory also had old buffer read
data which was never updated as we did not try to launch the worker.

diff --git a/src/backend/access/heap/vacuumlazy.c
b/src/backend/access/heap/vacuumlazy.c
index b97b678..5dfaf4d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2150,7 +2150,8 @@ lazy_parallel_vacuum_indexes(Relation *Irel,
IndexBulkDeleteResult **stats,
         * Next, accumulate buffer usage.  (This must wait for the workers to
         * finish, or we might get incomplete data.)
         */
-       for (i = 0; i < lps->pcxt->nworkers_launched; i++)
+       nworkers = Min(nworkers, lps->pcxt->nworkers_launched);
+       for (i = 0; i < nworkers; i++)
                InstrAccumParallelQuery(&lps->buffer_usage[i]);

It worked after the above fix.



pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: WAL usage calculation patch
Next
From: Ashutosh Bapat
Date:
Subject: Re: A bug when use get_bit() function for a long bytea string