From: Andres Freund
Subject: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Msg-id: 20200126202203.55lv2u63ptkunnt4@alap3.anarazel.de
In response to: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?) (Magnus Hagander <magnus@hagander.net>)
List: pgsql-hackers

Hi,

On 2020-01-26 16:20:03 +0100, Magnus Hagander wrote:
> On Sun, Jan 26, 2020 at 1:44 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2020-01-25 15:43:41 +0100, Magnus Hagander wrote:
> > > On Fri, Jan 24, 2020 at 8:52 PM Andres Freund <andres@anarazel.de> wrote:
> > > > Lastly, I don't understand the point of sending fixed-size stats,
> > > > like the stuff underlying pg_stat_bgwriter, through pgstats IPC. While
> > > > I don't like its architecture, we obviously need something like pgstat
> > > > to handle variable amounts of stats (database, table level etc.
> > > > stats). But that doesn't apply at all to these types of global stats.
> > >
> > > That part has annoyed me as well a few times. +1 for just moving that
> > > into global shared memory. Given that we don't really care about
> > > things being in sync between those different counters *or* if we lose
> > > a bit of data (which the stats collector is designed to tolerate), we
> > > could even do that without a lock?
> >
> > I don't think we'd quite want to do it without any (single counter)
> > synchronization - high-concurrency setups would be pretty likely to
> > lose values that way. I suspect the best would be to have a struct in
> > shared memory that contains the potential counters for each potential
> > process, and then sum them up when actually wanting the concrete
> > value. That way we avoid unnecessary contention, in contrast to having a
> > single shared memory value for each (which would just ping-pong between
> > different sockets and store buffers).  There are a few details, like how
> > exactly to implement resetting the counters, but ...
> 
> Right. Each process gets to do its own writes, but still in shared
> memory. But do you need to lock them when reading them (for the
> summary)? That's the part where I figured you could just read and
> summarize them, and accept the possible loss.

Oh, yeah, I'd not lock for that. On nearly all machines, aligned 64-bit
integers can be read/written without a danger of torn values, and I
don't think we need perfect cross-counter accuracy. To deal with the few
platforms without 64-bit "single-copy atomicity", we can just use
pg_atomic_read/write_u64. These days (e8fdbd58fe) they automatically
fall back to using locked operations on those platforms.  So I don't
think there's actually a danger of loss.
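To make that concrete, something along the lines of the rough sketch
below is what I have in mind. All of the names (BackendIOStats,
count_backend_buffer_write(), sum_buffers_backend()) are made up for
illustration, and the shared-memory allocation/initialization is
omitted:

#include "postgres.h"
#include "port/atomics.h"

typedef struct BackendIOStats
{
    /* written only by the owning backend; aligned 64-bit values */
    pg_atomic_uint64 buffers_backend;
    pg_atomic_uint64 buffers_backend_fsync;
} BackendIOStats;

/* one slot per potential process, allocated in shared memory at startup */
static BackendIOStats *BackendIOStatsArray;

/*
 * The owning backend bumps its own slot.  Only this backend ever writes
 * the slot, so a plain read-increment-write is fine; the pg_atomic_*
 * accessors just guarantee single-copy atomicity on the few platforms
 * that lack it natively.
 */
static inline void
count_backend_buffer_write(int my_slot)
{
    pg_atomic_uint64 *ctr = &BackendIOStatsArray[my_slot].buffers_backend;

    pg_atomic_write_u64(ctr, pg_atomic_read_u64(ctr) + 1);
}

/*
 * A reader sums all slots without taking a lock.  The result can be
 * slightly stale, but individual values are never torn.
 */
static uint64
sum_buffers_backend(int num_slots)
{
    uint64      total = 0;

    for (int i = 0; i < num_slots; i++)
        total += pg_atomic_read_u64(&BackendIOStatsArray[i].buffers_backend);

    return total;
}

The important property is that each backend only ever writes its own
slot, so the hot path never touches a contended cacheline; only a
reader walks the whole array.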

Obviously we could also use atomic ops to increment the values, but I'd
rather not add all those atomic operations, even if they'd be on
uncontended cachelines. Atomic increments would allow us to reset the
backend values more easily, by just swapping in a 0, which we can't do
if the backend increments non-atomically. But I think we could instead
just have one global "bias" value to implement resets (by subtracting it
from the summarized value, and storing the current sum when resetting).
Or use the new global barrier to trigger a reset. Or something similar.
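A sketch of that "bias" bookkeeping, reusing the hypothetical
sum_buffers_backend() helper from above (again, made-up names, not a
worked-out patch):

/*
 * Single global value, subtracted from the raw sum when reporting.  Like
 * the per-backend slots it would live in shared memory; init omitted.
 */
static pg_atomic_uint64 *buffers_backend_bias;

/* what a view like pg_stat_bgwriter would report */
static uint64
reported_buffers_backend(int num_slots)
{
    return sum_buffers_backend(num_slots) -
        pg_atomic_read_u64(buffers_backend_bias);
}

/* a reset just records the current raw sum as the new bias */
static void
reset_buffers_backend(int num_slots)
{
    pg_atomic_write_u64(buffers_backend_bias,
                        sum_buffers_backend(num_slots));
}

That keeps the per-backend slots writable only by their owners, even
across a reset.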

Greetings,

Andres Freund


