Re: [HACKERS] pg_stat_activity.waiting_start - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: [HACKERS] pg_stat_activity.waiting_start
Date
Msg-id CAASwCXeEd5mqtq15gww5jQWREA+np3zxFgm5SpTJBkp_3eW5vg@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] pg_stat_activity.waiting_start  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] pg_stat_activity.waiting_start  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Dec 25, 2016 at 8:01 PM, Andres Freund <andres@anarazel.de> wrote:
> On December 25, 2016 1:21:43 AM GMT+01:00, Joel Jacobson <joel@trustly.com> wrote:
>
>>Is it really a typical real-life scenario that processes can be
>>waiting extremely often for extremely short periods of time,
>>where the timing overhead would be significant?
>
> Yes. Consider WAL insertion, procarray or other similar contended locks.

Ah, I see, then I understand it has to be blazingly fast.

Maybe a good tradeoff then would be to let "wait_start" represent the
very first time the txn started waiting?
That way gettimeofday() would only be called once per txn, and the
value would be remembered, but not exposed when the txn is not
waiting.
If the txn is waiting/not waiting multiple times during it's
life-time, the same "wait_start" value would be exposed when it's
waiting, and NULL when it's not. Sounds good?

As long as the documentation is clear on "wait_start" meaning when the
first wait started in the txn, I think that's useful enough to improve
the situation, as one could then ask a query like "select all
processes that have possibly been waiting for at least 5 seconds",
which you cannot do today.

The best you can do today is ask a query like "select all processes
that are waiting and have been running for at least 5 seconds", but
during those 5 seconds they have been running, they might only have
been waiting for the very last few milliseconds, which might not be a
problem at all. If instead knowing they were waiting 5 seconds ago,
and are still waiting, but might have had periods in between where
they were not waiting, I would say that is close enough to what I as a
user want to know, and can use that information for automatic
decision-making on e.g. if I want to terminate other blocking
processes.



pgsql-hackers by date:

Previous
From: Craig Ringer
Date:
Subject: Re: [HACKERS] Cluster wide option to control symbol case folding
Next
From: Jaime Casanova
Date:
Subject: Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)