Re: RFC: replace pg_stat_activity.waiting with something more descriptive - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date
Msg-id CAA4eK1LghaZOOT8RRBjgVSqka1xjup_NTbKm2SWpM5yBN66fPw@mail.gmail.com
Whole thread Raw
In response to Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com>)
List pgsql-hackers
On Thu, Jun 25, 2015 at 6:10 PM, Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com> wrote:
>
> On Thu, Jun 25, 2015 at 1:49 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> Personally I think, that tracking waits is a not a good idea for
> >> pg_stat_activity (at least in that straight-forward manner).
> >
> > As mentioned in the initial mail by Robert, that sometimes system becomes
> > slow (either due to contention on various kinds of locks or due to I/O or
> > due
> > to some other such reasons) that such kind of handy information via some
> > view is quite useful.  Recently while working on one of the
> > performance/scalability
> > projects, I need to use gdb to attach to different processes to see what
> > they
> > are doing (of course one can use perf or some other utilities as well) and I
> > found most of them were trying to wait on some LW locks, now having such
> > an information available via view could be really useful, because sometimes
> > at customer sites, we can't use gdb or perf to see what's going on.
>
> Yes, I understand such a use-case. But I hardly see if suggested
> design can help for such cases.
>
> Basically, a DBA has two reasons to take a look on waits:
>
> 1. Long response time for particular query (or some type of queries).
> In that case it is good to know how much time we spend on waiting for
> particular resources we need to get query results
> 2. Overall bad performance of a database. We know, that something goes
> wrong and consumes resources, we need to identify which backend, which
> query causes the most of waits.
>
> In both cases we need a) some historical data rather than simple
> snapshot b) some approach how to aggregate it  because the will be
> certainly a lot of events
>

I think this thread's proposal will help for cases, when user/DBA wants to
see where currently database is spending most time (during waits).

I understand that there is a use of historical information which can
be helpful for the kind of cases which you have explained above.  

>
> I think it is more than possible to mix both approaches. My proof of
> concept now is only about LWLocks - yours and Robert's is more
> general, and certainly some wait event classification will be needed
> for both approaches and its much better to implement one rather than
> two different.
>
> And at least, I will be interesting in reviewing your approach.
>

Okay, I am planning to spend time on this patch in coming few days
and when that's ready, may be we can see if that could be useful
for what you are planning to do.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Should we back-patch SSL renegotiation fixes?
Next
From: Peter Eisentraut
Date:
Subject: Re: RFC: replace pg_stat_activity.waiting with something more descriptive