Re: Waits monitoring - Mailing list pgsql-hackers

From Ildus Kurbangaliev
Subject Re: Waits monitoring
Date
Msg-id 20150907143308.54ba3839@iw
Whole thread Raw
In response to Re: Waits monitoring  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Waits monitoring  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, 7 Sep 2015 08:58:15 +0530
Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Sun, Sep 6, 2015 at 5:58 PM, Andres Freund <andres@anarazel.de>
> wrote:
> >
> > On 2015-09-04 23:44:21 +0100, Simon Riggs wrote:
> > > I see the need for both current wait information and for
> > > cumulative historical detail.
> > >
> > > I'm willing to wait before reviewing this, but not for more than
> > > 1 more
> CF.
> > >
> > > Andres, please decide whether we should punt to next CF now,
> > > based upon other developments. Thanks
> >
> > I think we can do some of the work concurrently - the whole lwlock
> > infrastructure piece is rather independent and what currently most
> > of the arguments are about. I agree that the actual interface will
> > need to be coordinated.
> >
> > Ildus, could you please review Amit & Robert's patch?
> >
> 
> Are you talking about patch where I have fixed few issues in Robert's
> patch [1] or the original patch [3] written by me.
> 
> IIUC, this is somewhat different than what Ildus is doing in his
> latest patch [2].
> 
> Sorry, but I think there is some confusion about that patch [1]
> development. Let me try to summarize what I think has happened and
> why I feel there is some confusion.  Initially Robert has proposed
> the idea of having a column in pg_stat_activity for wait_event on
> hackers and then I wrote an initial patch so that we can discuss the
> same in a more meaningful way and wanted to extend that patch based
> on consensus and what any other patch like Waits monitoring would
> need from it.  In-between Ildus has proposed
> Waits monitoring patch and also started hacking the other
> pg_stat_activity patch and that was the starting point of confusion.
> Now I think that the current
> situation is there's one patch [1] of Robert (with some fixes by
> myself) for standardising
> LWLock stuff, so that we can build pg_stat_activity patch on top of
> it and then
> a patch [2] from Ildus for doing something similar but I think it
> hasn't used Robert's
> patch.
> 
> I think the intention of having multiple people develop same patch is
> to get the work done faster, but I think here it is causing confusion
> and I feel that
> is one reason the patch is getting dragged as well.  Let me know your
> thoughts
> about what is the best way to proceed?
> 
> [1] -
> http://www.postgresql.org/message-id/CAA4eK1KdeC1Tm5ya9gkV85Vtn4qqsRxzKJrU-tu70G_tL1xkFA@mail.gmail.com
> [2] -
> http://www.postgresql.org/message-id/3F71DA37-A17B-4961-9908-016E6323E612@postgrespro.ru
> [3] -
> http://www.postgresql.org/message-id/CAA4eK1Kt2e6XhViGisR5o1aC9NfO0j2wTb8N0ggD1_JkLdeKdQ@mail.gmail.com
> 
> P.S. - This mail is not to point anything wrong with any particular
> individual,
> rather about the development of a particular patch getting haphazard
> because of some confusion.  I am not sure that this is the right
> thread to write about
> it, but as it has been asked here to review the patch in other related
> thread,
> so I have mentioned my thoughts on the same.

About [1] and [2]. They are slightly conflicting, but if [1] is
going to be committed I can easily use it in [2]. And it will simplify
my patch, so I have no objections here. And the same time [3] can be
significantly simplified and improved on top of [1] and [2].

For example this code from [3]:

static void
LWLockReportStat(LWLock *lock)
{int            lockid;const char    *tranche_name;
tranche_name = T_NAME(lock);
if (strcmp(tranche_name, "main") == 0){    lockid = T_ID(lock);
    if (lockid < NUM_INDIVIDUAL_LWLOCKS)        pgstat_report_wait_event(LWLockTypeToWaitEvent(lockid));    else if
(lockid>= BUFFER_MAPPING_LWLOCK_OFFSET &&             lockid <  LOCK_MANAGER_LWLOCK_OFFSET)
pgstat_report_wait_event(WaitEvent_BufferMappingLock);   else if (lockid >= LOCK_MANAGER_LWLOCK_OFFSET &&
lockid<    PREDICATELOCK_MANAGER_LWLOCK_OFFSET)    pgstat_report_wait_event(WaitEvent_LockManagerLock);    else if
(lockid>=    PREDICATELOCK_MANAGER_LWLOCK_OFFSET&& lockid <    NUM_FIXED_LWLOCKS)
pgstat_report_wait_event(WaitEvent_PredicateManagerLock);   else pgstat_report_wait_event(WaitEvent_OtherLWLock); }else
if(strcmp(tranche_name, "WALInsertLocks") == 0)    pgstat_report_wait_event(WaitEvent_WALInsertLocks);else if
(strcmp(tranche_name,"ReplicationOrigins") == 0)    pgstat_report_wait_event(WaitEvent_ReplicationOrigins);else
pgstat_report_wait_event(WaitEvent_OtherTrancheLWLock);
}

can be changed to something like:

#define LWLOCK_WAIT_ID(lock) \(lock->tranche == 0? T_ID(lock) : lock->tranche +NUM_INDIVIDUAL_LWLOCKS)

static void
LWLockReportStat(LWLock *lock)
{int offset = <pgstat offset for lwlocks>;pgstat_report_wait_event(offset + LWLOCK_WAIT_ID(lock));
}

So it will almost not affect to performance of LWLockAcquire.

----
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
<http://www.postgrespro.com/> The Russian Postgres Company



pgsql-hackers by date:

Previous
From: Oleg Bartunov
Date:
Subject: Fwd: [Snowball-discuss] New website
Next
From: Magnus Hagander
Date:
Subject: Re: Fwd: [Snowball-discuss] New website