RE: [Proposal] Add accumulated statistics for wait event - Mailing list pgsql-hackers

From Yotsunaga, Naoki
Subject RE: [Proposal] Add accumulated statistics for wait event
Date
Msg-id 8E9126CB6CE2CD42962059AB0FBF7B0DC5F6E8@g01jpexmbkw23
Whole thread Raw
In response to RE: [Proposal] Add accumulated statistics for wait event  (Phil Florent <philflorent@hotmail.com>)
List pgsql-hackers

On Mon, Oct 29, 2018 at 1:52 AM, Phil Florent wrote:

 

Hi, thank you for comments.

 

>Yes you will be able to solve bottlenecks with sampling. In interactive mode, a 1s interval is probably too large. I use 0s1 - 0s01 with my tool and it is normally OK.

 

With the tool you are using, can you sample at intervals shorter than 1 second?

If you can, you can get enough sampling number and you can also acquire short events.

 

>Since grafana is now able to connect directly to a postgresql source, I use it to display the information collected from pg_stat_activity and psutil ( e.g https://pgphil.ovh/traqueur_dashboard_02.php page is written in french but panels are in english)

 

It is wonderful to visualize.

Especially for beginners like me.

 

 

>Other DB have accumulated statistics but you can notice that sampling is also their most modern method.

>E.g Oracle DB : 20 years ago you already had tools like "utlbstat/utlestat" . Then you had "statspack". Those tools were based on accumulated statistics and the reports were based on differences between 2 points. It was useful to solve major problems but it was limited and not precise enough in many cases.

 

>The preferred feature to identify bottlenecks in the Oracle world is now ASH (active session history). It can help with major problems, specific problems AND it can identify short blockages.

>Too bad it is licensed as an option of their Enterprise Edition but similar tools exist and they are also based on sampling of the activity.

 

>With the "official" ASH, sampling and archiving are done internally and you have a circular memory zone dedicated to the feature. Hence the overhead is lower but that's all.

 

>The most advanced interactive tool is called "snapper" and it is also based on sampling.

 

Thanks. I will check it.

 

The current bottleneck survey method, from sampling, I can know the number (ratio) of waiting events.

Then, investigate from those with a high number of times (ratio).

Do you agree with this recognition?

 

---------------------------------------

Naoki, Yotsunaga.

pgsql-hackers by date:

Previous
From: Vincent Mirian
Date:
Subject: Hooks to Modify Execution Flow and Query Planner
Next
From: Andrew Gierth
Date:
Subject: Re: PostgreSQL Limits and lack of documentation about them.