Re: pg_stat_io not tracking smgrwriteback() is confusing - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: pg_stat_io not tracking smgrwriteback() is confusing |
Date | |
Msg-id | 20230504164402.5ek3qr3mdaglou4m@awork3.anarazel.de Whole thread Raw |
In response to | Re: pg_stat_io not tracking smgrwriteback() is confusing (Melanie Plageman <melanieplageman@gmail.com>) |
List | pgsql-hackers |
Hi, On 2023-04-27 11:36:49 -0400, Melanie Plageman wrote: > > > /* and finally tell the kernel to write the data to storage */ > > > reln = smgropen(currlocator, InvalidBackendId); > > > smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks); > > Yes, as it is currently, IssuePendingWritebacks() is only used for shared > buffers. My rationale for including IOObject is that localbuf.c calls > smgr* functions and there isn't anything stopping it from calling > smgrwriteback() or using WritebackContexts (AFAICT). I think it's extremely unlikely that we'll ever do that, because it's very common to have temp tables that are bigger than temp_buffers. We basically hope that the kernel can do good caching for us there. > > Or I actually think we might not even need to pass around the io_* > > parameters and could just pass immediate values to the > > pgstat_count_io_op_time call. If we ever start using shared buffers > > for thing other than relation files (for example SLRU?), we'll have to > > consider the target individually for each buffer block. That being > > said, I'm fine with how it is either. > > In IssuePendingWritebacks() we don't actually know which IOContext we > are issuing writebacks for when we call pgstat_count_io_op_time() (we do > issue pending writebacks for other IOContexts than IOCONTEXT_NORMAL). I > agree IOObject is not strictly necessary right now. I've kept IOObject a > member of WritebackContext for the reasons I mention above, however, I > am open to removing it if it adds confusion. I don't think it's really worth adding struct members for potential future safety. We can just add them later if we end up needing them. > From 7cdd6fc78ed82180a705ab9667714f80d08c5f7d Mon Sep 17 00:00:00 2001 > From: Melanie Plageman <melanieplageman@gmail.com> > Date: Mon, 24 Apr 2023 18:21:54 -0400 > Subject: [PATCH v4] Add writeback to pg_stat_io > > 28e626bde00 added the notion of IOOps but neglected to include > writeback. With the addition of IO timing to pg_stat_io in ac8d53dae5, > the omission of writeback caused some confusion. Checkpointer write > timing in pg_stat_io often differed greatly from the write timing > written to the log. To fix this, add IOOp IOOP_WRITEBACK and track > writebacks and writeback timing in pg_stat_io. For the future: It'd be good to note that catversion needs to be increased. Off-topic: I've been wondering about computing a "catversion hash" based on all the files going into initdb. At least to have some tooling to detect missing catversion increases... > index 99f7f95c39..27b6f1a0a0 100644 > --- a/doc/src/sgml/monitoring.sgml > +++ b/doc/src/sgml/monitoring.sgml > @@ -3867,6 +3867,32 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i > </entry> > </row> > > + <row> > + <entry role="catalog_table_entry"> > + <para role="column_definition"> > + <structfield>writebacks</structfield> <type>bigint</type> > + </para> > + <para> > + Number of units of size <varname>op_bytes</varname> which the backend > + requested the kernel write out to permanent storage. > + </para> > + </entry> > + </row> I think the reference to "backend" here is somewhat misplaced - it could be checkpointer or bgwriter as well. We don't reference the backend in other comparable columns of pgsio either... > diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c > index 0057443f0c..a7182fe95a 100644 > --- a/src/backend/storage/buffer/buf_init.c > +++ b/src/backend/storage/buffer/buf_init.c > @@ -145,9 +145,15 @@ InitBufferPool(void) > /* Init other shared buffer-management stuff */ > StrategyInitialize(!foundDescs); > > - /* Initialize per-backend file flush context */ > - WritebackContextInit(&BackendWritebackContext, > - &backend_flush_after); > + /* > + * Initialize per-backend file flush context. IOContext is initialized to > + * IOCONTEXT_NORMAL because this is the most common context. IOObject is > + * initialized to IOOBJECT_RELATION because writeback is currently only > + * requested for permanent relations in shared buffers. The backend can > + * overwrite these as appropriate. > + */ > + WritebackContextInit(&BackendWritebackContext, IOOBJECT_RELATION, > + IOCONTEXT_NORMAL, &backend_flush_after); > } > This seems somewhat icky. > /* > diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c > index 1fa689052e..116910cdfe 100644 > --- a/src/backend/storage/buffer/bufmgr.c > +++ b/src/backend/storage/buffer/bufmgr.c > @@ -1685,6 +1685,8 @@ again: > FlushBuffer(buf_hdr, NULL, IOOBJECT_RELATION, io_context); > LWLockRelease(content_lock); > > + BackendWritebackContext.io_object = IOOBJECT_RELATION; > + BackendWritebackContext.io_context = io_context; > ScheduleBufferTagForWriteback(&BackendWritebackContext, > &buf_hdr->tag); > } What about passing the io_context to ScheduleBufferTagForWriteback instead? > --- a/src/test/regress/sql/stats.sql > +++ b/src/test/regress/sql/stats.sql Hm. Could we add a test for this? While it's not implemented everywhere, we still issue the smgrwriteback() afaics. The default for the _flush_after GUCs changes, but backend_flush_after is USERSET, so we could just change it for a single command. Greetings, Andres Freund
pgsql-hackers by date: