Re: Track the amount of time waiting due to cost_delay - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Track the amount of time waiting due to cost_delay
Date
Msg-id CAFiTN-tF4n1QSUGeoH2hgkGmnk9FaN1z1aWo5Grf_7br=BKyLg@mail.gmail.com
In response to Re: Track the amount of time waiting due to cost_delay  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
List pgsql-hackers
On Mon, Dec 9, 2024 at 6:55 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Mon, Dec 09, 2024 at 05:18:30PM +0530, Dilip Kumar wrote:
> > On Mon, Dec 9, 2024 at 2:51 PM Masahiro Ikeda <ikedamsh@oss.nttdata.com> wrote:
> > >
> > > On 2024-12-06 18:31, Bertrand Drouvot wrote:
> > > > Hi,
> > > >
> > > > On Thu, Dec 05, 2024 at 10:43:51AM +0000, Bertrand Drouvot wrote:
> > > >> Yeah, people would likely use this new field to monitor long running
> > > >> vacuum.
> > > >> Long enough that this error should be acceptable. Do you agree?
> > > >
> > > > OTOH, adding the 100% accuracy looks as simple as v9-0002 attached
> > > > (0001 is
> > > > same as for v8), so I think we should provide it.
> > >
> > This idea looks good to me.
>
> Thanks for looking at it!
>
> > 1.
> > +       Total amount of time spent in milliseconds waiting due to
> > <xref linkend="guc-vacuum-cost-delay"/>
> > +       or <xref linkend="guc-autovacuum-vacuum-cost-delay"/>. In case
> > of parallel
> > +       vacuum the reported time is across all the workers and the leader. The
> > +       workers update the column no more frequently than once per second, so it
> > +       could show slightly old values.
> > +      </para></entry>
> >
> > I think this wait time is influenced by the cost limit GUC as well as
> > the cost delay GUCs: since we are counting the total wait time, it very
> > much depends on how frequently we wait, and that frequency is driven
> > entirely by the cost limit. Isn't it?
>
> Yeah. I think we could change the wording that way:
>
> s/waiting due to/waiting during/
>
> Does that make sense? I don't think we need to mention cost limit here.

Yeah, that should be fine.

> > 2.
> > + if (IsParallelWorker())
> > + {
> > +     instr_time time_since_last_report;
> > +
> > +     INSTR_TIME_SET_ZERO(time_since_last_report);
> > +     INSTR_TIME_ACCUM_DIFF(time_since_last_report, delay_end,
> > +                           last_report_time);
> > +     nap_time_since_last_report += INSTR_TIME_GET_MILLISEC(delayed_time);
> > +
> > +     if (INSTR_TIME_GET_MILLISEC(time_since_last_report) > WORKER_REPORT_DELAY_INTERVAL)
> > +     {
> > +         pgstat_progress_parallel_incr_param(PROGRESS_VACUUM_TIME_DELAYED,
> > +                                             nap_time_since_last_report);
> > +         nap_time_since_last_report = 0;
> > +         last_report_time = delay_end;
> > +     }
> > + }
> >
> > Does it make sense to track "nap_time_since_last_report" in a shared
> > variable instead of a local one in each worker, and report whenever the
> > shared variable crosses WORKER_REPORT_DELAY_INTERVAL? That would avoid
> > individual reporting from different workers, and we already compute the
> > cost balance in a shared variable, i.e. VacuumSharedCostBalance. Or do
> > you think it would complicate the code?
> >
>
> I'm not sure I follow. nap_time_since_last_report is the time a worker had to
> wait since its last report. There is no direct relationship with
> WORKER_REPORT_DELAY_INTERVAL (which is compared to time_since_last_report and
> not nap_time_since_last_report).

I mean, currently we track "time_since_last_report" and accumulate the
delayed time in "nap_time_since_last_report" in each worker. So my
question was: does it make sense to do this in a shared variable across
workers, so that we can reduce the number of reports to the leader?


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


