Re: RFC: replace pg_stat_activity.waiting with something more descriptive - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Date
Msg-id CAHGQGwFjQ3pmv8Yeknxtz4G=ntZRqP6NHwrqkcSGpRQufaboJA@mail.gmail.com
In response to Re: RFC: replace pg_stat_activity.waiting with something more descriptive  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: RFC: replace pg_stat_activity.waiting with something more descriptive
List pgsql-hackers
On Fri, Jun 26, 2015 at 12:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Jun 25, 2015 at 9:23 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On 6/22/15 1:37 PM, Robert Haas wrote:
>>> Currently, the only time we report a process as waiting is when it is
>>> waiting for a heavyweight lock.  I'd like to make that somewhat more
>>> fine-grained, by reporting the type of heavyweight lock it's awaiting
>>> (relation, relation extension, transaction, etc.).  Also, I'd like to
>>> report when we're waiting for a lwlock, and report either the specific
>>> fixed lwlock for which we are waiting, or else the type of lock (lock
>>> manager lock, buffer content lock, etc.) for locks of which there is
>>> more than one.  I'm less sure about this next part, but I think we
>>> might also want to report ourselves as waiting when we are doing an OS
>>> read or an OS write, because it's pretty common for people to think
>>> that a PostgreSQL bug is to blame when in fact it's the operating
>>> system that isn't servicing our I/O requests very quickly.
>>
>> Could that also cover waiting on network?
>
> Possibly.  My approach requires that the number of wait states be kept
> relatively small, ideally fitting in a single byte.  And it also
> requires that we insert pgstat_report_waiting() calls around the thing
> that is notionally blocking.  So, if there are a small number of
> places in the code where we do network I/O, we could stick those calls
> around those places, and this would work just fine.  But if a foreign
> data wrapper, or any other piece of code, does network I/O - or any
> other blocking operation - without calling pgstat_report_waiting(), we
> just won't know about it.
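
The pattern described in the quoted text (a wait state small enough for one byte, set just before a blocking call and cleared right after) might look roughly like the sketch below. Every name in it (WaitState, report_waiting, call_with_wait_report, fake_network_read) is hypothetical and stands in for whatever the real patch would use; it is not the actual pgstat_report_waiting() interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wait states; few enough to fit in a single byte. */
typedef enum WaitState
{
    WAIT_NONE = 0,
    WAIT_HEAVYWEIGHT_LOCK,
    WAIT_LWLOCK,
    WAIT_OS_READ,
    WAIT_OS_WRITE,
    WAIT_NETWORK
} WaitState;

/* Stand-in for the per-backend status slot a monitoring view would read. */
static volatile uint8_t current_wait_state = WAIT_NONE;

/* Hypothetical reporting call: publish the state we are about to wait in. */
static void
report_waiting(WaitState state)
{
    current_wait_state = (uint8_t) state;
}

typedef long (*blocking_fn) (void *arg);

/*
 * Bracket a blocking operation with reporting calls.  Code that blocks
 * without going through a wrapper like this stays invisible, which is
 * the limitation noted in the quoted text.
 */
static long
call_with_wait_report(WaitState state, blocking_fn fn, void *arg)
{
    long        result;

    assert(fn != NULL);
    report_waiting(state);
    result = fn(arg);
    report_waiting(WAIT_NONE);
    return result;
}

/* Fake blocking call: records the wait state that was visible while it ran. */
static long
fake_network_read(void *arg)
{
    *(uint8_t *) arg = current_wait_state;
    return 42;
}
```

Calling call_with_wait_report(WAIT_NETWORK, fake_network_read, &seen) would publish WAIT_NETWORK for the duration of the call and reset the slot to WAIT_NONE afterwards.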

Itagaki-san's very similar proposal and patch would probably be useful
when considering which wait events to track.
http://www.postgresql.org/message-id/20090309125146.913C.52131E4D@oss.ntt.co.jp

According to his patch, the wait events he was thinking of adding were:

+ typedef enum PgCondition
+ {
+     PGCOND_UNUSED                = 0,        /* unused */
+
+     /* 10000 - CPU */
+     PGCOND_CPU                    = 10000,    /* generic cpu operations */
+     /* 11000 - CPU:PARSE */
+     PGCOND_CPU_PARSE            = 11000,    /* pg_parse_query */
+     PGCOND_CPU_PARSE_ANALYZE    = 11100,    /* parse_analyze */
+     /* 12000 - CPU:REWRITE */
+     PGCOND_CPU_REWRITE            = 12000,    /* pg_rewrite_query */
+     /* 13000 - CPU:PLAN */
+     PGCOND_CPU_PLAN                = 13000,    /* pg_plan_query */
+     /* 14000 - CPU:EXECUTE */
+     PGCOND_CPU_EXECUTE            = 14000,    /* PortalRun or PortalRunMulti */
+     PGCOND_CPU_TRIGGER            = 14100,    /* ExecCallTriggerFunc */
+     PGCOND_CPU_SORT                = 14200,    /* (generic sort operation) */
+     PGCOND_CPU_SORT_HEAP        = 14210,    /* tuplesort_begin_heap */
+     PGCOND_CPU_SORT_INDEX        = 14220,    /* tuplesort_begin_index_btree */
+     PGCOND_CPU_SORT_DATUM        = 14230,    /* tuplesort_begin_datum */
+     /* 15000 - CPU:UTILITY */
+     PGCOND_CPU_UTILITY            = 15000,    /* ProcessUtility */
+     PGCOND_CPU_COMMIT            = 15100,    /* CommitTransaction */
+     PGCOND_CPU_ROLLBACK            = 15200,    /* AbortTransaction */
+     /* 16000 - CPU:TEXT */
+     PGCOND_CPU_TEXT                = 16000,    /* (generic text operation) */
+     PGCOND_CPU_DECODE            = 16100,    /* pg_client_to_server */
+     PGCOND_CPU_ENCODE            = 16200,    /* pg_server_to_client */
+     PGCOND_CPU_LIKE                = 16310,    /* GenericMatchText */
+     PGCOND_CPU_ILIKE            = 16320,    /* Generic_Text_IC_like */
+     PGCOND_CPU_RE                = 16400,    /* (generic regexp operation) */
+     PGCOND_CPU_RE_COMPILE        = 16410,    /* RE_compile_and_cache */
+     PGCOND_CPU_RE_EXECUTE        = 16420,    /* RE_execute */
+
+     /* 20000 - NETWORK */
+     PGCOND_NETWORK                = 20000,    /* (generic network operation) */
+     PGCOND_NETWORK_RECV            = 21000,    /* secure_read */
+     PGCOND_NETWORK_SEND            = 22000,    /* secure_write */
+
+     /* 30000 - IDLE (should be larger than network to distinguish idle or recv) */
+     PGCOND_IDLE                    = 30000,    /* <IDLE> */
+     PGCOND_IDLE_IN_TRANSACTION    = 31000,    /* <IDLE> in transaction */
+     PGCOND_IDLE_SLEEP            = 32000,    /* pg_usleep */
+
+     /* 40000 - XLOG */
+     PGCOND_XLOG                    = 40000,    /* (generic xlog operation) */
+     PGCOND_XLOG_CRC                = 41000,    /* crc calculation in XLogInsert */
+     PGCOND_XLOG_INSERT            = 42000,    /* insert in XLogInsert */
+     PGCOND_XLOG_OPEN            = 43000,    /* XLogFileOpen */
+     PGCOND_XLOG_CLOSE            = 44000,    /* XLogFileClose */
+     PGCOND_XLOG_WRITE            = 45000,    /* write in XLogWrite */
+     PGCOND_XLOG_FLUSH            = 46000,    /* issue_xlog_fsync */
+
+     /* 50000 - DATA */
+     PGCOND_DATA                    = 50000,    /* (generic data operation) */
+     PGCOND_DATA_CREATE            = 51000,    /* smgrcreate */
+     PGCOND_DATA_OPEN            = 52000,    /* smgropen */
+     PGCOND_DATA_CLOSE            = 53000,    /* smgrclose */
+     PGCOND_DATA_STAT            = 54000,    /* smgrnblocks */
+     PGCOND_DATA_READ            = 55000,    /* smgrread */
+     PGCOND_DATA_PREFETCH        = 56000,    /* smgrprefetch */
+     PGCOND_DATA_WRITE            = 57000,    /* smgrwrite */
+     PGCOND_DATA_EXTEND            = 58000,    /* smgrextend */
+
+     /* 60000 - TEMP */
+     PGCOND_TEMP                    = 60000,    /* (generic temp file operation) */
+     PGCOND_TEMP_READ            = 61000,    /* BufFileRead */
+     PGCOND_TEMP_WRITE            = 62000,    /* BufFileWrite */
+
+     /* 70000 - LOCK */
+     PGCOND_LOCK                    = 70000,    /* waiting on a lmgr lock */
+     /* 70001-70999 is reserved for lmgr locks */
+
+     /* 80000 - LWLOCK */
+     PGCOND_LWLOCK                = 80000,    /* waiting on a generic lwlock */
+     /* 80001-80999 is reserved for named lwlocks */
+     PGCOND_LWLOCK_BUFMAPPING    = 81000,    /* BufMappingLock(s) */
+     PGCOND_LWLOCK_LOCKMGR        = 82000,    /* LockMgrLock(s) */
+     PGCOND_LWLOCK_PAGE            = 83000,    /* BufferDesc.content_lock */
+     PGCOND_LWLOCK_IO            = 84000,    /* BufferDesc.io_in_progress_lock */
+
+     /* 90000 - SPINLOCK */
+     PGCOND_SPINLOCK                = 90000        /* timeout in s_lock */
+ } PgCondition;
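
The numbering in the enum above looks hierarchical: the ten-thousands digit selects the top-level class (CPU, NETWORK, ...) and the thousands digit selects a subclass such as CPU:EXECUTE. Assuming that reading is right, a consumer could recover the class with integer arithmetic, as in the sketch below. The helper names (pgcond_class, pgcond_subclass, pgcond_class_name) are hypothetical and are not part of the patch; only the four enum values are copied from it.

```c
#include <assert.h>

/* A few values copied from the enum above, enough to exercise the helpers. */
enum
{
    PGCOND_CPU_SORT_HEAP = 14210,
    PGCOND_NETWORK_RECV = 21000,
    PGCOND_LWLOCK_PAGE = 83000,
    PGCOND_SPINLOCK = 90000
};

/* Hypothetical helper: top-level class, e.g. 14210 -> 10000 (CPU). */
static int
pgcond_class(int cond)
{
    assert(cond >= 0);
    return (cond / 10000) * 10000;
}

/* Hypothetical helper: subclass, e.g. 14210 -> 14000 (CPU:EXECUTE). */
static int
pgcond_subclass(int cond)
{
    assert(cond >= 0);
    return (cond / 1000) * 1000;
}

/* Hypothetical helper: class label taken from the enum's section comments. */
static const char *
pgcond_class_name(int cond)
{
    switch (pgcond_class(cond))
    {
        case 0:     return "UNUSED";
        case 10000: return "CPU";
        case 20000: return "NETWORK";
        case 30000: return "IDLE";
        case 40000: return "XLOG";
        case 50000: return "DATA";
        case 60000: return "TEMP";
        case 70000: return "LOCK";
        case 80000: return "LWLOCK";
        case 90000: return "SPINLOCK";
        default:    return "UNKNOWN";
    }
}
```

Under this reading, the ranges reserved for lmgr locks (70001-70999) and named lwlocks (80001-80999) still map to the LOCK and LWLOCK classes without any per-lock table.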

Regards,

-- 
Fujii Masao


