Thread: pg_stat_lwlock wait time view
There was some earlier discussion of adding a pg_stat_lwlock view in [1]. The main objection I observed to that patch was that it showed LWLock information to users who may not understand what these locks are used for.

Currently, LWLock information is already available to users as part of the wait_event information in pg_stat_activity, and the LWLocks used in PostgreSQL are listed in the documentation along with their purpose.

So I feel it may be worth adding this view to find out the wait times of the LWLocks. This information can be useful for finding bottlenecks around LWLocks in production environments. However, adding the timing calculations may cause a performance problem. Is there any need to write this stats information to file, given that it only provides wait time information?

Based on the performance impact of the additional timing calculations, we can decide the view's default behavior. Are there any objections to the concept?

[1] - https://www.postgresql.org/message-id/4FE9A6F5.2080405@uptime.jp

Regards,
Hari Babu
Fujitsu Australia
On Wed, Aug 24, 2016 at 4:23 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> There was some earlier discussion of adding a pg_stat_lwlock view in [1].
> The main objection I observed to that patch was that it showed LWLock
> information to users who may not understand what these locks are used for.
>
> Currently, LWLock information is already available to users as part of the
> wait_event information in pg_stat_activity, and the LWLocks used in
> PostgreSQL are listed in the documentation along with their purpose.
>
> So I feel it may be worth adding this view to find out the wait times of
> the LWLocks. This information can be useful for finding bottlenecks around
> LWLocks in production environments. However, adding the timing calculations
> may cause a performance problem. Is there any need to write this stats
> information to file, given that it only provides wait time information?
>
> Based on the performance impact of the additional timing calculations,
> we can decide the view's default behavior. Are there any objections to
> the concept?

There have been some other recent threads on extending the wait event stuff. If you haven't already read those, you should, because the issues are closely related.

I think that timing LWLock waits will be quite expensive. I believe that what the Postgres Pro folks want to do is add up the wait times or maybe keep a history of waits (though maybe I'm wrong about that), but showing them in pg_stat_activity is another idea. That's likely to add some synchronization overhead which might be even greater in this case than for a feature that just publishes accumulated times, but maybe it's all a drop in the bucket compared to the cost of calling gettimeofday() in the first place.

Personally, my preferred solution is still to have a background worker that samples the published wait events and rolls up statistics, but I'm not sure I've convinced anyone else. It could report the number of seconds since it detected a wait event other than the current one, which is not precisely the same thing as tracking the length of the current wait but it's pretty close. I don't know for sure what's best here - I think some experimentation and dialog is needed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
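As a rough illustration of the sampling approach described above, a background worker loop could periodically read each backend's published wait_event_info and count samples per event, along the lines of the sketch below. This is only a sketch against 9.6-era internals: the struct and function names and the sampling interval are invented for illustration, and locking, latch waits, shutdown handling, and publication of the results are all omitted.

/*
 * Sketch of a sampling background worker: periodically read each
 * backend's published wait_event_info and roll up sample counts.
 */
#include "postgres.h"
#include "pgstat.h"				/* pgstat_get_wait_event() for display names */
#include "storage/proc.h"		/* ProcGlobal, PGPROC */
#include "utils/hsearch.h"

#define SAMPLE_INTERVAL_MS	10	/* invented sampling interval */

typedef struct WaitSampleEntry
{
	uint32		wait_event_info;	/* hash key */
	uint64		samples;			/* number of times this event was seen */
} WaitSampleEntry;

/* wait_samples is assumed created with hash_create(..., HASH_ELEM | HASH_BLOBS) */
static void
sample_wait_events(HTAB *wait_samples)
{
	for (;;)
	{
		uint32		i;

		for (i = 0; i < ProcGlobal->allProcCount; i++)
		{
			/* Unsynchronized read; sampling tolerates slightly stale values. */
			uint32		wei = ProcGlobal->allProcs[i].wait_event_info;

			if (wei != 0)
			{
				bool		found;
				WaitSampleEntry *entry;

				entry = hash_search(wait_samples, &wei, HASH_ENTER, &found);
				if (!found)
					entry->samples = 0;
				entry->samples++;
			}
		}

		pg_usleep(SAMPLE_INTERVAL_MS * 1000L);
	}
}

Multiplying a per-event sample count by the sampling interval gives an approximate accumulated wait time without the backends doing any timing work themselves, which is the trade-off being discussed in this thread.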
From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Haribabu Kommi
> adding the timing calculations may cause a performance problem. Is there
> any need to write this stats information to file, given that it only
> provides wait time information?

Yes, saving the workload diagnosis information would be nice, like Oracle AWR. I mean not the current accumulated total, but a series of snapshots taken periodically. It would enable:

* Daily monitoring across database restarts. e.g. The response times of applications degraded after applying a patch. What's the difference between before and after the patch application?

* Hints for troubleshooting a crash failure. e.g. Excessive memory use by PostgreSQL crashed the OS. What was the workload like just before the crash?

The point of discussion may be whether PostgreSQL itself should provide the feature to accumulate performance diagnosis information on persistent storage. pg_statsinfo would be able to take that on, but it wouldn't be convenient or efficient.

Regards
Takayuki Tsunakawa
On Thu, Aug 25, 2016 at 6:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Aug 24, 2016 at 4:23 AM, Haribabu Kommi
> <kommi.haribabu@gmail.com> wrote:
>>
>> Based on the performance impact of the additional timing calculations,
>> we can decide the view's default behavior. Are there any objections to
>> the concept?
>
> There have been some other recent threads on extending the wait event
> stuff. If you haven't already read those, you should, because the
> issues are closely related. I think that timing LWLock waits will be
> quite expensive. I believe that what the Postgres Pro folks want to
> do is add up the wait times or maybe keep a history of waits (though
> maybe I'm wrong about that), but showing them in pg_stat_activity is
> another idea. That's likely to add some synchronization overhead
> which might be even greater in this case than for a feature that just
> publishes accumulated times, but maybe it's all a drop in the bucket
> compared to the cost of calling gettimeofday() in the first place.

Yes, I agree this is an issue for cases where the wait time is smaller than the cost of the logic added to calculate it. Even if we use clock_gettime() with CLOCK_REALTIME_COARSE there will be some overhead; that clock is roughly 8 times faster than gettimeofday() but less accurate. Maybe we can use clock_gettime() instead of gettimeofday() here, since we may not need a fine-grained value.

> Personally, my preferred solution is still to have a background worker
> that samples the published wait events and rolls up statistics, but
> I'm not sure I've convinced anyone else. It could report the number
> of seconds since it detected a wait event other than the current one,
> which is not precisely the same thing as tracking the length of the
> current wait but it's pretty close. I don't know for sure what's best
> here - I think some experimentation and dialog is needed.

Yes, using a background worker can reduce the load of doing all the wait time calculations in the main backend. I can give this a try by implementing both the direct calculation approach and a background worker (maybe the stats collector) that derives the wait time from stat messages sent by the backend at wait start and wait end.

I am not sure how accurate the data gathered by a background worker can be without receiving any signal or message from the main backend.

Regards,
Hari Babu
Fujitsu Australia
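A minimal standalone micro-benchmark along these lines can compare the two call costs directly; a sketch follows. CLOCK_REALTIME_COARSE is Linux-specific, the iteration count is arbitrary, and actual numbers will vary by platform and clock source.

#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

#define CALLS 10000000L

int
main(void)
{
	struct timespec ts, t0, t1;
	struct timeval tv;
	long		i;
	double		ns;

	/* Per-call cost of gettimeofday() */
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < CALLS; i++)
		gettimeofday(&tv, NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	ns = ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec)) / CALLS;
	printf("gettimeofday():                 %.1f ns/call\n", ns);

	/* Per-call cost of the coarse realtime clock */
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < CALLS; i++)
		clock_gettime(CLOCK_REALTIME_COARSE, &ts);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	ns = ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec)) / CALLS;
	printf("clock_gettime(REALTIME_COARSE): %.1f ns/call\n", ns);

	return 0;
}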
2016-08-25 13:46 GMT+09:00 Haribabu Kommi <kommi.haribabu@gmail.com>:
> On Thu, Aug 25, 2016 at 6:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Aug 24, 2016 at 4:23 AM, Haribabu Kommi
>> <kommi.haribabu@gmail.com> wrote:
>>>
>>> Based on the performance impact of the additional timing calculations,
>>> we can decide the view's default behavior. Are there any objections to
>>> the concept?
>>
>> There have been some other recent threads on extending the wait event
>> stuff. If you haven't already read those, you should, because the
>> issues are closely related. I think that timing LWLock waits will be
>> quite expensive. I believe that what the Postgres Pro folks want to
>> do is add up the wait times or maybe keep a history of waits (though
>> maybe I'm wrong about that), but showing them in pg_stat_activity is
>> another idea. That's likely to add some synchronization overhead
>> which might be even greater in this case than for a feature that just
>> publishes accumulated times, but maybe it's all a drop in the bucket
>> compared to the cost of calling gettimeofday() in the first place.
>
> Yes, I agree this is an issue for cases where the wait time is smaller
> than the cost of the logic added to calculate it. Even if we use
> clock_gettime() with CLOCK_REALTIME_COARSE there will be some overhead;
> that clock is roughly 8 times faster than gettimeofday() but less
> accurate. Maybe we can use clock_gettime() instead of gettimeofday()
> here, since we may not need a fine-grained value.

Is there any other option (other than gettimeofday()) to measure elapsed time with lower overhead? I've heard about the RDTSC feature (a hardware counter) supported by recent processors, and have found a few articles [1] [2] on its lower overhead compared to gettimeofday().

[1] http://stackoverflow.com/questions/15623343/using-cpu-counters-versus-gettimeofday
[2] http://stackoverflow.com/questions/6498972/faster-equivalent-of-gettimeofday

I'm not sure how much we could benefit from it yet, because I'm not familiar with this facility and of course I don't have any numbers. In addition, I guess it would bring some portability issues. But I'm still curious to know more about it. Does anyone have experience with it?

>> Personally, my preferred solution is still to have a background worker
>> that samples the published wait events and rolls up statistics, but
>> I'm not sure I've convinced anyone else. It could report the number
>> of seconds since it detected a wait event other than the current one,
>> which is not precisely the same thing as tracking the length of the
>> current wait but it's pretty close. I don't know for sure what's best
>> here - I think some experimentation and dialog is needed.
>
> Yes, using a background worker can reduce the load of doing all the wait
> time calculations in the main backend. I can give this a try by
> implementing both the direct calculation approach and a background worker
> (maybe the stats collector) that derives the wait time from stat messages
> sent by the backend at wait start and wait end.
>
> I am not sure how accurate the data gathered by a background worker can
> be without receiving any signal or message from the main backend.

This looks like an accuracy-performance trade-off, so I think the use cases matter here in getting a better design. I guess that's the reason why Ilya is looking for feature requests from DBAs in another thread [3].
[3] https://www.postgresql.org/message-id/CAG95seUAQVj09KzLwU%2Bz1B-GqdMqerzEkPFR3hn0q88XzMq-PA@mail.gmail.com

Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
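For reference, reading the TSC with the compiler intrinsic looks roughly like the sketch below. This is only an illustration of the articles above, not proposed code: it is x86-only, assumes an invariant TSC, and converting ticks to time requires calibrating against a real clock, which is part of the portability concern mentioned above.

#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <x86intrin.h>			/* __rdtsc() on GCC/clang */

int
main(void)
{
	uint64_t	start,
				end;
	double		ticks_per_usec;

	/* Rough calibration: count TSC ticks across a known sleep interval. */
	start = __rdtsc();
	usleep(100000);				/* 100 ms */
	end = __rdtsc();
	ticks_per_usec = (double) (end - start) / 100000.0;

	/* Time a section of interest in ticks, then convert to microseconds. */
	start = __rdtsc();
	/* ... code being measured, e.g. the body of an LWLock wait ... */
	end = __rdtsc();
	printf("elapsed: ~%.2f usec\n", (double) (end - start) / ticks_per_usec);

	return 0;
}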
[ Re-sending to Hackers as the earlier mail failed to deliver to the Hackers mailing list ]
On Mon, Jan 9, 2017 at 4:13 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
On Thu, Aug 25, 2016 at 2:46 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> On Thu, Aug 25, 2016 at 6:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> Personally, my preferred solution is still to have a background worker
>> that samples the published wait events and rolls up statistics, but
>> I'm not sure I've convinced anyone else. It could report the number
>> of seconds since it detected a wait event other than the current one,
>> which is not precisely the same thing as tracking the length of the
>> current wait but it's pretty close. I don't know for sure what's best
>> here - I think some experimentation and dialog is needed.
>
> Yes, using a background worker can reduce the load of doing all the wait
> time calculations in the main backend. I can give this a try by
> implementing both the direct calculation approach and a background worker
> (maybe the stats collector) that derives the wait time from stat messages
> sent by the backend at wait start and wait end.
>
> I am not sure how accurate the data gathered by a background worker can
> be without receiving any signal or message from the main backend.
Apologies to come back to an old thread.
I tried using the "stats collector" process as the background worker that
calculates the wait times for LWLocks, instead of adding another background
worker, as a proof of concept.
I created two hash tables: one to store the "LWLock stats" and another to
store the "backend information", keyed by PID.
Whenever a backend starts waiting for an LWLock, it sends a message to the
stats collector with its PID and the wait_event_info of the lock. When the
stats collector receives the message, it adds an entry for that backend to
the hash table, recording the start time. Once the backend finishes waiting
for the lock, it sends another message to the stats collector, which looks
up the backend's entry in the hash table, computes the wait time, and adds
it to the corresponding LWLock entry in the other hash table.
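A rough sketch of what that collector-side bookkeeping could look like with the backend's dynahash API is shown below; the message and table names (LWLockWaitMsg, lwlock_wait_backends, lwlock_wait_stats) are invented for illustration and this is not the patch itself.

/*
 * Sketch of collector-side handling of hypothetical LWLock wait messages.
 * Assumes #include "postgres.h", "utils/hsearch.h", "utils/timestamp.h",
 * and that both hash tables were created with
 * hash_create(..., HASH_ELEM | HASH_BLOBS).
 */

/* Hypothetical message sent by a backend at wait start and wait end. */
typedef struct LWLockWaitMsg
{
	int			pid;				/* backend PID */
	uint32		wait_event_info;	/* identifies the LWLock wait event */
	bool		wait_start;			/* true = wait began, false = wait ended */
} LWLockWaitMsg;

/* Entry in the per-backend hash table, keyed by PID. */
typedef struct BackendWaitEntry
{
	int			pid;
	uint32		wait_event_info;
	TimestampTz start_time;			/* when the collector saw the wait begin */
} BackendWaitEntry;

/* Entry in the per-LWLock stats hash table, keyed by wait_event_info. */
typedef struct LWLockStatEntry
{
	uint32		wait_event_info;
	uint64		wait_count;
	uint64		total_wait_us;		/* accumulated wait time in microseconds */
} LWLockStatEntry;

static HTAB *lwlock_wait_backends;	/* BackendWaitEntry, keyed by pid */
static HTAB *lwlock_wait_stats;		/* LWLockStatEntry, keyed by wait_event_info */

static void
collector_handle_lwlock_msg(const LWLockWaitMsg *msg)
{
	bool		found;

	if (msg->wait_start)
	{
		/* Wait began: remember when we saw it start. */
		BackendWaitEntry *be = hash_search(lwlock_wait_backends,
										   &msg->pid, HASH_ENTER, &found);

		be->wait_event_info = msg->wait_event_info;
		be->start_time = GetCurrentTimestamp();
	}
	else
	{
		/* Wait ended: compute elapsed time and roll it up per LWLock. */
		BackendWaitEntry *be = hash_search(lwlock_wait_backends,
										   &msg->pid, HASH_FIND, &found);

		if (found)
		{
			long		secs;
			int			usecs;
			LWLockStatEntry *st;

			TimestampDifference(be->start_time, GetCurrentTimestamp(),
								&secs, &usecs);

			st = hash_search(lwlock_wait_stats, &be->wait_event_info,
							 HASH_ENTER, &found);
			if (!found)
			{
				st->wait_count = 0;
				st->total_wait_us = 0;
			}
			st->wait_count++;
			st->total_wait_us += (uint64) secs * 1000000 + usecs;

			hash_search(lwlock_wait_backends, &msg->pid, HASH_REMOVE, NULL);
		}
	}
}

Note that in this scheme the measured wait time also includes the message latency between backend and collector, which is inherent to computing the timestamps on the collector side.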
The LWLock wait stats are stored in the stats file for persistence.
There is currently no stats reset logic.
This patch can help in creating a view to display the wait times of all wait
events, as discussed in [1].
Comments?
Regards,
Hari Babu
Fujitsu Australia
On Mon, Jan 9, 2017 at 12:13 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> Whenever a backend starts waiting for an LWLock, it sends a message to the
> stats collector with its PID and the wait_event_info of the lock. When the
> stats collector receives the message, it adds an entry for that backend to
> the hash table, recording the start time. Once the backend finishes waiting
> for the lock, it sends another message to the stats collector, which looks
> up the backend's entry in the hash table, computes the wait time, and adds
> it to the corresponding LWLock entry in the other hash table.

I will be extremely surprised if this doesn't have a severe negative impact on performance when LWLock contention is high (e.g. a pgbench read-only test using a scale factor that fits in the OS cache but not shared_buffers).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company