Re: RFC: replace pg_stat_activity.waiting with something more descriptive - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date
Msg-id CAPpHfdt373MTh0NvOaTbExWCBbzBBoFKnHGonKPfLp9=fpZEXQ@mail.gmail.com
In response to Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Sep 14, 2015 at 2:12 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Sep 14, 2015 at 2:25 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Sat, Sep 12, 2015 at 2:05 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 6, 2015 at 3:31 PM, Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru> wrote:
>
> On 08/05/2015 09:33 PM, Robert Haas wrote:
>>
>>
>> You're missing the point.  Those multi-byte fields have additional
>> synchronization requirements, as I explained in some detail in my
>> previous email. You can't just wave that away.
>
> I see that now. Thank you for the point.
>
> I've looked deeper and I found PgBackendStatus to be not a suitable
> place for keeping information about low level waits. Really, PgBackendStatus
> is used to track high level information about backend. This is why auxiliary
> processes don't have PgBackendStatus, because they don't have such information
> to expose. But when we come to the low level wait events then auxiliary
> processes are as useful for monitoring as backends are. WAL writer,
> checkpointer, bgwriter etc. use LWLocks as well. It is unclear
> why they can't be monitored.
>

I think the chances of background processes getting stuck on an LWLock are
much lower than for backends, as they do their work periodically.  As an example,
WALWriter will take WALWriteLock to write the WAL, but there will never
be much contention for WALWriter. With synchronous_commit = on, the
backends themselves write the WAL, so WALWriter won't do much in that
case; with synchronous_commit = off, backends won't write the WAL, so
WALWriter won't face any contention unless some buffers have to be written
by bgwriter or checkpoint for which WAL is not flushed, which I don't think
would lead to any contention.
 
Hmm, synchronous_commit is a per-session variable: some transactions could run with synchronous_commit = on and others with synchronous_commit = off. This is a very popular feature of PostgreSQL: it achieves better performance by making non-critical transactions asynchronous while leaving critical transactions synchronous. Thus, contention for WALWriteLock between backends and WALWriter could be real.


I think it is difficult to say that this can lead to contention, given the
periodic nature of WALWriter, but I don't deny that there is a chance for
background processes to have contention.

We can't know in advance whether there will be contention. This is why we need monitoring.
  
I am not denying the fact that there could be some contention in rare
scenarios for background processes, but I think tracking them is not as
important as tracking the LWLocks for backends.

I would be more careful about calling some of these scenarios rare. As DBMS developers we should do our best to avoid contention for LWLocks: any contention, not only between backends and background processes.  One may assume that high LWLock contention is a rare scenario in general. Given that we're having this discussion, we don't think so, though.
You claim that there couldn't be contention for WALWriteLock between backends and WALWriter. This is unclear to me: I think there could be.

I think there are more things that background processes could wait on
than LWLocks, and I think they are important to track, but that could be done
separately from tracking them for pg_stat_activity.  For example, we have a
pg_stat_bgwriter view; can't we think of tracking bgwriter/checkpointer wait
information in that view, and similarly track other background processes in other
views if a related view exists, or create a new one to track all background processes?
 
Nobody opposes tracking wait events for backends and tracking them for background processes. I think we need to track both in order to provide the full picture to the DBA.


Sure, that is good to do, but can't we do it separately in another patch?
I think in this patch let's just work on wait_events for backends.

Yes, but I think we should have a design for tracking wait events for every process before implementing this only for backends.
 
Also, as we are planning to track the wait_event information in pg_stat_activity
along with other backend information, it will not make sense to include
information about background processes in this variable, as pg_stat_activity
just displays information about backend processes.

I don't object to tracking only backend information in pg_stat_activity. I think we should also have some other way of tracking wait events for background processes. We should think it through before extending pg_stat_activity, to avoid design issues later.


I think we can discuss if you see any specific problems or want specific
things to be clarified, but sorting out the complete design of waits monitoring
before this patch could extend the scope of this patch beyond what is needed.

I think we need to sort out at least some part of this design: where to store current event information for every process, not only backends. Otherwise, we can't be sure we're moving towards waits monitoring and not backwards.
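
To make the storage question concrete, here is a minimal sketch of one possible layout, not the design of any actual patch. It assumes one slot per process of any kind (backend or auxiliary) and packs the current wait event into a single 32-bit word, so that a reader never needs the extra synchronization that multi-byte fields require. All names here (ProcWaitSlot, ProcKind, wait_start, wait_end, wait_read, the wait class constants, MAX_PROCS) are hypothetical and do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical process kinds; not PostgreSQL's actual enumeration. */
typedef enum
{
    PROC_BACKEND,
    PROC_WAL_WRITER,
    PROC_CHECKPOINTER,
    PROC_BGWRITER
} ProcKind;

/* Hypothetical wait classes; 0 means "not waiting". */
enum { WAIT_NONE = 0, WAIT_LWLOCK = 1, WAIT_HEAVY_LOCK = 2, WAIT_IO = 3 };

#define MAX_PROCS 8             /* assumed fixed number of slots */

/*
 * One slot per process, regardless of kind. The wait event is a single
 * 32-bit word: wait class in the high 8 bits, event id in the low 24 bits.
 */
typedef struct
{
    ProcKind         kind;
    _Atomic uint32_t wait_event;
} ProcWaitSlot;

static ProcWaitSlot slots[MAX_PROCS];   /* zero-initialized: nobody waits */

static uint32_t
pack_wait(uint32_t wait_class, uint32_t event_id)
{
    return (wait_class << 24) | (event_id & 0x00FFFFFFu);
}

static uint32_t
wait_class_of(uint32_t packed)
{
    return packed >> 24;
}

/* Called by the owning process just before it starts to wait. */
static void
wait_start(int slot, uint32_t wait_class, uint32_t event_id)
{
    assert(slot >= 0 && slot < MAX_PROCS);
    atomic_store_explicit(&slots[slot].wait_event,
                          pack_wait(wait_class, event_id),
                          memory_order_relaxed);
}

/* Called by the owning process when the wait is over. */
static void
wait_end(int slot)
{
    assert(slot >= 0 && slot < MAX_PROCS);
    atomic_store_explicit(&slots[slot].wait_event, WAIT_NONE,
                          memory_order_relaxed);
}

/* Called by any monitoring process; one atomic load, never a torn value. */
static uint32_t
wait_read(int slot)
{
    assert(slot >= 0 && slot < MAX_PROCS);
    return atomic_load_explicit(&slots[slot].wait_event,
                                memory_order_relaxed);
}
```

In this sketch a WAL writer occupying slot 1 would call wait_start(1, WAIT_LWLOCK, id) before blocking on a lock and wait_end(1) afterwards, and a monitoring view could call wait_read() on every slot, whether it belongs to a backend or to a background process.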

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
