Re: RFC: replace pg_stat_activity.waiting with something more descriptive - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date
Msg-id CAA4eK1Lps_FOVoO0pHSHokvVhGcUeeE2rsT5BewzjRncfp9O-w@mail.gmail.com
Whole thread Raw
In response to Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com>)
Responses Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com>)
List pgsql-hackers
On Thu, Jun 25, 2015 at 4:28 PM, Ilya Kosmodemiansky <ilya.kosmodemiansky@postgresql-consulting.com> wrote:
>
> On Thu, Jun 25, 2015 at 12:37 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > 2. Add 2 new columns to pg_stat_activity
> >     waiting_resource - true for waits other heavy wait locks, false
> >                                otherwise
> >     wait_event - description code for the wait event
> >
> > 3. Add new view 'pg_stat_wait_event' with following info:
> > pid   - process id of this backend
> > waiting - true for any form of wait, false otherwise
> > wait_event_type - Heavy Weight Lock, Light Weight Lock, I/O wait, etc
> > wait_event - Lock (Relation), Lock (Relation Extension), etc
>
> Personally I think, that tracking waits is a not a good idea for
> pg_stat_activity (at least in that straight-forward manner).

As mentioned in the initial mail by Robert, that sometimes system becomes
slow (either due to contention on various kinds of locks or due to I/O or due
to some other such reasons) that such kind of handy information via some
view is quite useful.  Recently while working on one of the performance/scalability
projects, I need to use gdb to attach to different processes to see what they
are doing (of course one can use perf or some other utilities as well) and I
found most of them were trying to wait on some LW locks, now having such
an information available via view could be really useful, because sometimes
at customer sites, we can't use gdb or perf to see what's going on.

>  One
> process can wait for lots of things between 2 sampling of
> pg_stat_activity and that sampling can be pretty useless.
>

Yeah, that's right and I am not sure if we should bother about such scenario's
as the system is generally fine in such situations, however there are other
cases where we can find most of the backends are waiting on one or other
thing.

> My approach (about which Ive had a talk mentioned by Jim and which I
> hope to finalize and submit within a few days) is a bit different and
> I believe is more useful:
>
> 1.  Some sort of histogram of top waits within entire database by pid.
> That will be an approximate one, because  I hardly believe there is a
> possibility to make a precise one without significant overhead.
>
> 2. Some cyclic buffer  of more precise wait statistic inside each
> worker. Sampling may be turned on if we see some issues in histogram
> (1) and want to have some more details.
>

I think this is some what different kind of utility which can give us
aggregated information and I think this will address different kind of
usecase and will have somewhat more complex design and it doesn't
look impossible to use part of what will be developed as part of this
proposal.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Next
From: Andres Freund
Date:
Subject: Re: Should we back-patch SSL renegotiation fixes?