Re: [Proposal] Add accumulated statistics for wait event - Mailing list pgsql-hackers
From: Jehan-Guillaume de Rorthais
Subject: Re: [Proposal] Add accumulated statistics for wait event
Date:
Msg-id: 20210605005344.4a751fe5@firost
In response to: Re: [Proposal] Add accumulated statistics for wait event (Daniel Gustafsson <daniel@yesql.se>)
Responses: Re: [Proposal] Add accumulated statistics for wait event
List: pgsql-hackers
Hi All,

I have faced a few times a situation where a long-running query actually
includes the time the backend spends waiting for the frontend to fetch all
the rows (see [1] for details). See the sample code fe-time.c and its
comments in attachment to reproduce this behavior. There is no simple way
today to pinpoint the problem in production without advanced interactive
auditing and/or system tools. After studying the problem, I believe it boils
down to tracking the ClientWrite wait event, so I ended up on this thread.

You might catch such mismatching times thanks to auto_explain as well. Using
the fe-time.c demo code with the following command:

  PGDATABASE=postgres PGHOST=::1 time ./fe-time 100

the frontend time is 10s and the reported query time is 3228.631ms, but the
last row was produced after 20.672ms:

  LOG:  duration: 3228.631 ms  plan:
        Query Text: SELECT * FROM pgbench_accounts
        Seq Scan on pgbench_accounts (time=0.005..20.672 rows=100000 loops=1)

(Note that in contrast with localhost, through the unix socket, the query
time reported by the backend is always really close to 10s.)
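Since the attachment is not inlined here, below is only a minimal sketch of
the kind of slow-consuming client that triggers this behavior, not the actual
fe-time.c: it uses libpq's single-row mode, the query is the one from the
example above, and the 1000-row batch and 100ms sleep are arbitrary
illustration values.

/*
 * slow-fetch.c -- minimal sketch (not the attached fe-time.c) of a libpq
 * client that consumes rows slower than the backend produces them, so the
 * backend's reported query time mostly consists of ClientWrite waits.
 * The batch size and sleep below are arbitrary illustration values.
 *
 * Build (include/lib paths may need pg_config): gcc slow-fetch.c -o slow-fetch -lpq
 */
#include <stdio.h>
#include <unistd.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn   *conn = PQconnectdb("");   /* rely on PG* environment variables */
    PGresult *res;
    long      rows = 0;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* Ask libpq to hand rows over one by one instead of buffering them all. */
    if (!PQsendQuery(conn, "SELECT * FROM pgbench_accounts") ||
        !PQsetSingleRowMode(conn))
    {
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        return 1;
    }

    while ((res = PQgetResult(conn)) != NULL)
    {
        /*
         * Simulate an application doing expensive per-row work: pause every
         * 1000 rows.  The backend keeps trying to push rows, fills its send
         * buffer, and accumulates ClientWrite time that ends up in the
         * reported duration.
         */
        if (PQresultStatus(res) == PGRES_SINGLE_TUPLE && ++rows % 1000 == 0)
            usleep(100 * 1000);         /* 100 ms */
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}

Run against a pgbench-initialized database, the scan itself finishes in
milliseconds while the reported duration should mostly track the client-side
sleeps, as in the auto_explain output above.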
I rebased the existing patch (see attachment) to look at ClientWrite for this
exact query:

# SELECT wait_event, calls, times FROM pg_stat_get_waitaccum(NULL)
  WHERE wait_event = 'ClientWrite';

  wait_event  | calls |  times
 -------------+-------+---------
  ClientWrite |     4 | 3132266

The "times" column is expressed in µs in the patch, so 3132.266ms of the
total 3228.631ms is spent sending the result to the frontend. I'm not sure
where the missing 75ms are.

The pg_wait_sampling extension might help, but it requires restarting the
production instance to install it before it can be enabled. Whether the
solution is sampling or cumulative, an in-core, hot-switchable solution would
be much more convenient. Anyway, looking at pg_wait_sampling, we have a clear
suspect as well for a later run of the query:

# SELECT event, count FROM pg_wait_sampling_profile
  WHERE queryid = 4045741516911800313;

     event    | count
 -------------+-------
  ClientWrite |   309

The default profile period of pg_wait_sampling being 10ms, we can roughly
estimate the ClientWrite time around 3090ms. Note that this is close enough
because we know the 3132266µs were accumulated over only 4 large wait events.

Finishing below.

On Mon, 3 Aug 2020 00:00:40 +0200
Daniel Gustafsson <daniel@yesql.se> wrote:

> > On 31 Jul 2020, at 07:23, imai.yoshikazu@fujitsu.com wrote:
> >
> >> This patch fails to apply to HEAD, please submit a rebased version. I've
> >> marked this as Waiting on Author.

Please find in attachment a rebase of both patches. I did some small editing
on the way. I didn't bench them.

I'm not sure this patch is the best approach, though. Take it as a motivation
to keep up with this discussion. As I wrote, whether the solution is sampling
or cumulative, an in-core, hot-switchable solution would be much more
convenient. The fact is that this patch was already available and ready to
keep the discussion going.

Collecting and summing all wait events from all backends in the same place
forbids tracking wait events precisely for a specific backend, especially on
a busy system where the numbers can quickly be buried by all the other
activity around. I wonder if wait events should only be accumulated on the
backend side, making it possible to enable/disable them on the fly and to
collect reports, e.g. in the logs or in some output. Most of the code from
these patches could be recycled in a simpler patch implementing this.

Thoughts?

> > Sorry for my absence. Unfortunately I couldn't have time to work on this
> > patch in this cf. I believe I will be back in next cf, work on this patch
> > and also review other patches.
>
> No worries, it happens. Since the thread has stalled and there is no updated
> patch I've marked this entry Returned with Feedback. Please feel free to
> re-open a new CF entry if you return to this patch.

I volunteer to be a reviewer on this patch. Imai-san, do you agree to add it
as a new CF entry?

Regards,

[1]: The last time I had such a situation was a few weeks ago. A customer was
reporting a query being randomly slow: below 100ms most of the time, but
sometimes hitting 28s. Long story short, the number of rows was the same
(10-15k), but the result set size was 9x bigger (1MB vs 9MB). As the same
query was running fine from psql, I suspected the frontend was somehow
saturated. Tcpdump helped me compute that the throughput fell to 256kB/s
after the first 2MB of data transfer, with a very narrow TCP window. I
explained to the customer that their app probably wasn't pulling the rows
fast enough and that some buffers were probably saturated on the frontend
side, waiting for the app and slowing down the whole transfer. The devs fixed
the problem by moving two field transformations (unaccent) out of their
row-fetching loop.
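The fix at the end of [1] can be illustrated in libpq terms; this is only an
assumed transposition (the customer's actual stack and code are unknown, and
expensive_transformation is a hypothetical stand-in): once the whole result
set is drained before the expensive per-row work, that work no longer delays
the reads the backend is blocked on.

/*
 * Sketch of the remediation pattern from [1], transposed to libpq (assumed,
 * not the customer's actual code): PQexec() transfers the complete result
 * into client memory before returning, so the slow per-row processing below
 * no longer keeps the backend waiting on ClientWrite.
 */
#include <stdio.h>
#include <libpq-fe.h>

static void
expensive_transformation(const char *value)
{
    (void) value;               /* stand-in for the unaccent-like work in [1] */
}

int
main(void)
{
    PGconn   *conn = PQconnectdb("");
    PGresult *res;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* The whole result set is fetched before PQexec() returns ... */
    res = PQexec(conn, "SELECT * FROM pgbench_accounts");
    if (PQresultStatus(res) == PGRES_TUPLES_OK)
    {
        /* ... so this loop runs entirely after the transfer is over. */
        for (int i = 0; i < PQntuples(res); i++)
            expensive_transformation(PQgetvalue(res, i, 0));
    }
    else
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));

    PQclear(res);
    PQfinish(conn);
    return 0;
}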