Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) |
Date | |
Msg-id | CAAKRu_aaq33UnG4TXq3S-OSXGWj1QGf0sU+ECH4tNwGFNERkZA@mail.gmail.com Whole thread Raw |
In response to | Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) (Justin Pryzby <pryzby@telsasoft.com>) |
Responses |
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) |
List | pgsql-hackers |
Thanks again! I really appreciate the thorough review. I have combined responses to all three of your emails below. Let me know if it is more confusing to do it this way. On Wed, Dec 1, 2021 at 6:59 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Wed, Dec 01, 2021 at 05:00:14PM -0500, Melanie Plageman wrote: > > > Also: > > > src/include/miscadmin.h:#define BACKEND_NUM_TYPES (B_LOGGER + 1) > > > > > > I think it's wrong to say NUM_TYPES = B_LOGGER + 1 (which would suggest using > > > lessthan-or-equal instead of lessthan as you are). > > > > > > Since the valid backend types start at 1 , the "count" of backend types is > > > currently B_LOGGER (13) - not 14. I think you should remove the "+1" here. > > > Then NROWS (if it continued to exist at all) wouldn't need to subtract one. > > > > I think what I currently have is technically correct because I start at > > 1 when I am using it as a loop condition. I do waste a spot in the > > arrays I allocate with BACKEND_NUM_TYPES size. > > > > I was hesitant to make the value of BACKEND_NUM_TYPES == B_LOGGER > > because it seems kind of weird to have it have the same value as the > > B_LOGGER enum. > > I don't mean to say that the code is misbehaving - I mean "num_x" means "the > number of x's" - how many there are. Since the first, valid backend type is 1, > and they're numbered consecutively and without duplicates, then "the number of > backend types" is the same as the value of the last one (B_LOGGER). It's > confusing if there's a macro called BACKEND_NUM_TYPES which is greater than the > number of backend types. > > Most loops say for (int i=0; i<NUM; ++i) > If it's 1-based, they say for (int i=1; i<=NUM; ++i) > You have two different loops like: > > + for (int i = 0; i < BACKEND_NUM_TYPES - 1 ; i++) > + for (int backend_type = 1; backend_type < BACKEND_NUM_TYPES; backend_type++) > > Both of these iterate over the correct number of backend types, but they both > *look* wrong, which isn't desirable. I've changed this and added comments wherever I could to make it clear. Whenever the parameter was of type BackendType, I tried to use the correct (not adjusted by subtracting 1) number and wherever the type was int and being used as an index into the array, I used the adjusted value and added the idx suffix to make it clear that the number does not reflect the actual BackendType: On Wed, Dec 1, 2021 at 10:31 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Wed, Dec 01, 2021 at 04:59:44PM -0500, Melanie Plageman wrote: > > Thanks for the review! > > > > On Wed, Nov 24, 2021 at 8:16 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > There's extraneous blank lines in these functions: > > > +pgstat_recv_resetsharedcounter > > I didn't see one here > > => The extra blank line is after the RESET_BUFFERS memset. Fixed. > > + * Reset the global, bgwriter and checkpointer statistics for the > > + * cluster. > > The first comma in this comment was introduced in 1bc8e7b09, and seems to be > extraneous, since bgwriter and checkpointer are both global. With the comma, > it looks like it should be memsetting 3 things. Fixed. > > + /* Don't count dead backends. They should already be counted */ > > Maybe this comment should say ".. they'll be added below" Fixed. > > + row[COLUMN_BACKEND_TYPE] = backend_type_desc; > > + row[COLUMN_IO_PATH] = CStringGetTextDatum(GetIOPathDesc(io_path)); > > + row[COLUMN_ALLOCS] += io_ops->allocs - resets->allocs; > > + row[COLUMN_EXTENDS] += io_ops->extends - resets->extends; > > + row[COLUMN_FSYNCS] += io_ops->fsyncs - resets->fsyncs; > > + row[COLUMN_WRITES] += io_ops->writes - resets->writes; > > + row[COLUMN_RESET_TIME] = reset_time; > > It'd be clearer if RESET_TIME were set adjacent to BACKEND_TYPE and IO_PATH. If you mean just in the order here (not in the column order in the view), then I have changed it as you recommended. > This message needs to be updated: > errhint("Target must be \"archiver\", \"bgwriter\", or \"wal\"."))) Done. > When I query the view, I see reset times as: 1999-12-31 18:00:00-06. > I guess it should be initialized like this one: > globalStats.bgwriter.stat_reset_timestamp = ts Done. > The cfbot shows failures now (I thought it was passing with the previous patch, > but I suppose I'm wrong.) > > It looks like running recovery during single user mode hits this assertion. > TRAP: FailedAssertion("beentry", File: "../../../../src/include/utils/backend_status.h", Line: 359, PID: 3499) > Yes, thank you for catching this. I have moved up pgstat_beinit and pgstat_bestart so that single user mode process will also have PgBackendStatus. I also have to guard against sending these stats to the collector since there is no room for B_INVALID backendtype in the array of IO Op values. With this change `make check-world` passes on my machine. On Wed, Dec 1, 2021 at 11:06 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Wed, Dec 01, 2021 at 04:59:44PM -0500, Melanie Plageman wrote: > > Thanks for the review! > > > > On Wed, Nov 24, 2021 at 8:16 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > > You wrote beentry++ at the start of two loops, but I think that's wrong; it > > > should be at the end, as in the rest of the file (or as a loop increment). > > > BackendStatusArray[0] is actually used (even though its backend has > > > backendId==1, not 0). "MyBEEntry = &BackendStatusArray[MyBackendId - 1];" > > > > I've fixed this in v16 which I will attach to the next email in the thread. > > I just noticed that since beentry++ is now at the end of the loop, it's being > missed when you "continue": > > + if (beentry->st_procpid == 0) > + continue; Fixed. > Also, I saw that pgindent messed up and added spaces after pointers in function > declarations, due to new typedefs not in typedefs.list: > > -pgstat_send_buffers_reset(PgStat_MsgResetsharedcounter *msg) > +pgstat_send_buffers_reset(PgStat_MsgResetsharedcounter * msg) > > -static inline void pg_atomic_inc_counter(pg_atomic_uint64 *counter) > +static inline void > +pg_atomic_inc_counter(pg_atomic_uint64 * counter) Fixed. -- Melanie
Attachment
- v17-0007-small-comment-correction.patch
- v17-0006-Remove-superfluous-bgwriter-stats.patch
- v17-0005-Add-system-view-tracking-IO-ops-per-backend-type.patch
- v17-0003-Send-IO-operations-to-stats-collector.patch
- v17-0004-Add-buffers-to-pgstat_reset_shared_counters.patch
- v17-0002-Add-IO-operation-counters-to-PgBackendStatus.patch
- v17-0001-Read-only-atomic-backend-write-function.patch
pgsql-hackers by date: