Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) |
Date | |
Msg-id | CAAKRu_adqe9YfjkWewrZm8wPhXcXZ6DuFO3jcVCyf9GvvaqWtw@mail.gmail.com Whole thread Raw |
In response to | Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
|
List | pgsql-hackers |
On Wed, Sep 8, 2021 at 9:28 PM Melanie Plageman <melanieplageman@gmail.com> wrote: > > On Fri, Aug 13, 2021 at 3:08 AM Andres Freund <andres@anarazel.de> wrote: > > > > Hi, > > > > On 2021-08-11 16:11:34 -0400, Melanie Plageman wrote: > > > On Tue, Aug 3, 2021 at 2:13 PM Andres Freund <andres@anarazel.de> wrote: > > > > > Also, I'm unsure how writing the buffer action stats out in > > > > > pgstat_write_statsfiles() will work, since I think that backends can > > > > > update their buffer action stats after we would have already persisted > > > > > the data from the BufferActionStatsArray -- causing us to lose those > > > > > updates. > > > > > > > > I was thinking it'd work differently. Whenever a connection ends, it reports > > > > its data up to pgstats.c (otherwise we'd loose those stats). By the time > > > > shutdown happens, they all need to have already have reported their stats - so > > > > we don't need to do anything to get the data to pgstats.c during shutdown > > > > time. > > > > > > > > > > When you say "whenever a connection ends", what part of the code are you > > > referring to specifically? > > > > pgstat_beshutdown_hook() > > > > > > > Also, when you say "shutdown", do you mean a backend shutting down or > > > all backends shutting down (including postmaster) -- like pg_ctl stop? > > > > Admittedly our language is very imprecise around this :(. What I meant > > is that backends would report their own stats up to the stats collector > > when the connection ends (in pgstat_beshutdown_hook()). That means that > > when the whole server (pgstat and then postmaster, potentially via > > pg_ctl stop) shuts down, all the per-connection stats have already been > > reported up to pgstat. > > > > So, I realized that the patch has a problem. I added the code to send > buffer actions stats to the stats collector > (pgstat_send_buffer_actions()) to pgstat_report_stat() and this isn't > getting called when all types of backends exit. > > I originally thought to add pgstat_send_buffer_actions() to > pgstat_beshutdown_hook() (as suggested), but, this is called after > pgstat_shutdown_hook(), so, we aren't able to send stats to the stats > collector at that time. (pgstat_shutdown_hook() sets pgstat_is_shutdown > to true and then in pgstat_beshutdown_hook() (called after), if we call > pgstat_send_buffer_actions(), it calls pgstat_send() which calls > pgstat_assert_is_up() which trips when pgstat_is_shutdown is true.) > > After calling pgstat_send_buffer_actions() from pgstat_report_stat(), it > seems to miss checkpointer stats entirely. I did find that if I > sprinkled pgstat_send_buffer_actions() around in the various places that > pgstat_send_checkpointer() is called, I could get checkpointer stats > (see attached patch, capture_checkpointer_buffer_actions.patch), but, > that seems a little bit haphazard since pgstat_send_buffer_actions() is > supposed to capture stats for all backend types. Is there somewhere else > I can call it that is exercised by all backend types before > pgstat_shutdown_hook() is called but after they would have finished any > relevant buffer actions? > I realized that putting these additional calls in checkpointer code and not clearing out PgBackendStatus counters for buffer actions results in a lot of duplicate stats. I was wondering if pgstat_send_buffer_actions() is needed, however, in HandleCheckpointerInterrupts() before the proc_exit(). It does seem like additional calls to pgstat_send_buffer_actions() shouldn't be needed since most processes register pgstat_shutdown_hook(). However, since MyDatabaseId isn't valid for the auxiliary processes, even though the pgstat_shutdown_hook() is registered from BaseInit(), pgstat_report_stat() never gets called for them, so their stats aren't persisted using the current method. It seems like the best solution to persisting all processes' stats would be to have all processes register pgstat_shutdown_hook() and to still call pgstat_report_stat() even if MyDatabaseId is not valid if the process is not a regular backend (I assume that it is only a problem that MyDatabaseId is InvalidOid for backends that have had it set to a valid oid at some point). For the stats that rely on database OID, perhaps those can be reported based on whether or not MyDatabaseId is valid from within pgstat_report_stat(). I also realized that I am not collecting stats from live auxiliary processes in pg_stat_get_buffer_actions(). I need to change the loop to for (i = 0; i <= MaxBackends + NUM_AUXPROCTYPES; i++) to actually get stats from live auxiliary processes when querying the view. On an unrelated note, I am planning to remove buffers_clean and buffers_checkpoint from the pg_stat_bgwriter view since those are also redundant. When I was removing them, I noticed that buffers_checkpoint and buffers_clean count buffers as having been written even when FlushBuffer() "does nothing" because someone else wrote out the dirty buffer before the bgwriter or checkpointer had a chance to do it. This seems like it would result in an incorrect count. Am I missing something? - Melanie
pgsql-hackers by date: