Hi Andres,
On Mon, 14 Jun 2021 15:01:14 -0700
Andres Freund <andres@anarazel.de> wrote:
> On 2021-06-14 23:20:47 +0200, Jehan-Guillaume de Rorthais wrote:
> > > On 2021-06-14 16:10:32 +0200, Jehan-Guillaume de Rorthais wrote:
> > > > In the patch in attachment, I tried to fix this by using kind of an
> > > > internal hook for pgstat_report_wait_start and pgstat_report_wait_end.
> > > > This allows to "instrument" wait events only when required, on the fly,
> > > > dynamically.
> > >
> > > That's *far worse*. You're adding an indirect function call. Which
> > > requires loading a global variable and then a far call to a different
> > > function. You're changing a path that's ~2 instructions with minimal
> > > dependencies (and no branches (i.e. fully out of order executable) to
> > > something on the order of ~15 instructions with plenty dependencies and
> > > at least two branches (call, ret).
> >
> > Oh, I didn't realized it would affect all queries, even when
> > log_statement_stats was off. Thank you for your explanation.
>
> Maybe I just am misunderstanding what you were doing? As far as I can
> tell your patch changed pgstat_report_wait_start() to be an indirect
> function call - right?
Exact.
I didn't realized this indirection would be so costy on every single calls,
after the variable assignation itself.
> Then yes, this adds overhead to everything.
I understand now, thank you for the explanation.
For my own curiosity and study, I'll remove this indirection and bench my patch
anyway.
> You *could* add a pgstat_report_wait_(start|end)_with_time() or such and
> only use that in places that won't have a high frequency. But I just
> don't quite see the use-case for that.
Well, it could be useful if we decide to only track a subset of wait event.
In my scenario, I originally wanted to only track ClientWrite, but then I
realized this might be too specific and tried to generalize.
There are probably some other way to deal with this issue. Eg.:
* do NOT include the time lost waiting for the frontend side in the query
execution time
* expose the frontend part of the query time in log_min_duration_stmt,
auto_explain, pg_stat_statements, in the same fashion we currently do with
planning and execution time
* having some wait-even sampling mechanism in core or as easy and hot-loadable
than auto_explain
Thoughts?
Thanks again!
Regards,