Re: Track the amount of time waiting due to cost_delay - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Track the amount of time waiting due to cost_delay
Msg-id CAD21AoBAXD05prx2VPgi989UOiZmhzFV+PUB=pa86zhObjTCxA@mail.gmail.com
In response to Re: Track the amount of time waiting due to cost_delay  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
List pgsql-hackers
On Mon, Jun 24, 2024 at 7:50 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Sat, Jun 22, 2024 at 12:48:33PM +0000, Bertrand Drouvot wrote:
> > 1. the time spent vacuuming indexes has been longer on master because with v2, the
> > leader has been interrupted 342605 times while waiting, thus making v2 "faster".
> >
> > 2. the leader being interrupted while waiting is also already happening on master
> > due to the pgstat_progress_parallel_incr_param() calls in
> > parallel_vacuum_process_one_index() (that have been added in
> > 46ebdfe164). It has been the case "only" 36 times during my test case.
> >
> > I think that 2. is less of a concern but I think that 1. is something that needs
> > to be addressed because the leader process is not honouring its cost delay wait
> > time in a noticeable way (at least during my test case).
> >
> > I did not think of a proposal yet, just sharing my investigation as to why
> > v2 has been faster than master during the vacuuming indexes phase.

Thank you for benchmarking and analyzing the results! I agree with
your analysis, and I was surprised that the more often workers go to
sleep, the more often the leader wakes up.

>
> I think that a reasonable approach is to make the reporting from the parallel
> workers to the leader less aggressive (that is, less frequent).
>
> Please find attached v3, that:
>
> - ensures that there is at least 1 second between 2 reports, per parallel worker,
> to the leader.
>
> - ensures that the reported delayed time is still correct (keep track of the
> delayed time between 2 reports).
>
> - does not add any extra pg_clock_gettime_ns() calls (as compared to v2).
>

Sounds good to me. I think it's better to keep the logic for
throttling the reporting of the delay message simple. It's an
important consideration, but running parallel vacuum with cost delays
is less likely to be common in practice.
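
To make the idea concrete, here is a minimal sketch of how such
throttling could look. It is not taken from the v3 patch: the struct,
the function name, and the constant are all hypothetical, and now_ns
is assumed to be the end-of-sleep timestamp the caller already has, so
that no extra clock read is needed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: at least 1 second between two reports. */
#define REPORT_INTERVAL_NS INT64_C(1000000000)

/* Hypothetical per-worker state; not the data structure used by the patch. */
typedef struct DelayReporter
{
    int64_t pending_delay_ns;   /* time slept since the last report */
    int64_t last_report_ns;     /* timestamp of the last report */
} DelayReporter;

/*
 * Record one cost-delay sleep of slept_ns that ended at now_ns.
 *
 * The slept time is always accumulated, so nothing is lost between two
 * reports. Returns true, and stores the accumulated delay in *to_report,
 * only when at least REPORT_INTERVAL_NS has passed since the last report.
 */
static bool
delay_reporter_add(DelayReporter *r, int64_t now_ns, int64_t slept_ns,
                   int64_t *to_report)
{
    r->pending_delay_ns += slept_ns;

    if (now_ns - r->last_report_ns < REPORT_INTERVAL_NS)
        return false;           /* too soon: keep accumulating */

    *to_report = r->pending_delay_ns;
    r->pending_delay_ns = 0;
    r->last_report_ns = now_ns;
    return true;
}
```

With something like this, a worker would message the leader only when
the function returns true, so the leader is woken at most about once
per second per worker while the total reported delay stays the same.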

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
