Re: Fix pg_stat_get_backend_wait_event() for aux processes - Mailing list pgsql-hackers
| From | Sami Imseih |
|---|---|
| Subject | Re: Fix pg_stat_get_backend_wait_event() for aux processes |
| Date | |
| Msg-id | CAA5RZ0sPYBVqiU3Fxsh5Y8P3r59zikXKW5i16hcruf-utaLQPw@mail.gmail.com |
| In response to | Re: Fix pg_stat_get_backend_wait_event() for aux processes (Andres Freund <andres@anarazel.de>) |
| List | pgsql-hackers |
> > > There is also a discussion [0] about wait event/activity field
> > > inconsistency with pg_stat_activity, with a repro in [1].
> >
> > The repro I was referring to in [1] is actually
> > https://www.postgresql.org/message-id/ab1c0a7d-e789-5ef5-1180-42708ac6fe2d%40postgrespro.ru
>
> That is inherent. The wait event is updated in an unsynchronized fashion. As
> noted in that thread.
>
> Making it synchronized (via st_changecount) would make wait event overhead
> vastly higher.

Correct, and I don't think we should pay the overhead of ensuring strict
synchronization. The patch that Heikki is proposing [0] will update
wait_event_info bypassing st_changecount:

```
+++ b/src/include/utils/backend_status.h
@@ -174,6 +174,15 @@ typedef struct PgBackendStatus
 	/* plan identifier, optionally computed using planner_hook */
 	int64		st_plan_id;
+
+	/*
+	 * Proc's wait information. This *not* protected by the changecount
+	 * mechanism, because reading and writing an uint32 is assumed to atomic.
+	 * This is updated very frequently, so we want to keep the overhead as
+	 * small as possible.
+	 */
+	uint32		st_wait_event_info;
+
```

Which is the same assumption we are already making:

```
static inline void
pgstat_report_wait_start(uint32 wait_event_info)
{
	/*
	 * Since this is a four-byte field which is always read and written as
	 * four-bytes, updates are atomic.
	 */
	*(volatile uint32 *) my_wait_event_info = wait_event_info;
}
```

So without st_changecount, wait events could still be out of sync with other
backend status fields (i.e. state), but at least my testing shows better
results with [0] applied.
"Better results" here means no samples with "state = active, wait_event =
ClientRead and backend_type = client backend", using the attached repro.sh:

## without patch

```
# of samples with state = active, wait_event = ClientRead and backend_type = client backend
238

# of samples with state = active, by wait_event
11774	transactionid
 4054	tuple
  489	CPU
  444	LockManager
  343	WALWrite
  238	ClientRead
   69	WalSync
   15	BufferExclusive
    2	VacuumDelay
    2	BufferShared
    1	DataFileWrite
```

## with patch [0]

```
# of samples with state = active, wait_event = ClientRead and backend_type = client backend
0

# of samples with state = active, by wait_event
12516	transactionid
 3188	tuple
  504	WALWrite
  414	CPU
  411	LockManager
   86	WalSync
    7	BufferExclusive
    4	BufferShared
    2	VacuumDelay
```

Also, it's worth noting that we already bypass st_changecount for VACUUM and
ANALYZE delay time. This was discussed in [2].

commands/analyze.c:

```
		if (track_cost_delay_timing)
		{
			/*
			 * We bypass the changecount mechanism because this value is
			 * only updated by the calling process.
			 */
			appendStringInfo(&buf, _("delay time: %.3f ms\n"),
							 (double) MyBEEntry->st_progress_param[PROGRESS_ANALYZE_DELAY_TIME] / 1000000.0);
		}
```

vacuumlazy.c:

```
			if (track_cost_delay_timing)
			{
				/*
				 * We bypass the changecount mechanism because this value is
				 * only updated by the calling process. We also rely on the
				 * above call to pgstat_progress_end_command() to not clear
				 * the st_progress_param array.
				 */
				appendStringInfo(&buf, _("delay time: %.3f ms\n"),
								 (double) MyBEEntry->st_progress_param[PROGRESS_VACUUM_DELAY_TIME] / 1000000.0);
			}
```

[0] https://www.postgresql.org/message-id/459e78c0-927f-4347-86df-ca431567c95a%40iki.fi
[1] https://www.postgresql.org/message-id/ab1c0a7d-e789-5ef5-1180-42708ac6fe2d%40postgrespro.ru
[2] https://www.postgresql.org/message-id/Z6-tiXbLwHYyDeNy%40nathan

--
Sami Imseih
Amazon Web Services (AWS)