Thread: bgwriter stats

bgwriter stats

From
Magnus Hagander
Date:
I want to be able to pull some stats out of the bgwriter to be able to
track things. One thing is the total number of buffers written out.
Other things are the "number of checkpoints" and such.

Anyway. Attached patch adds this to the bgwriter shared memory. Is it
safe to do this, and then just have a regular function running in a
normal backend pulling out the value and returning it to the user,
without locking? Given that only the bgwriter can write to it?

Patch of course entirely incomplete, just illustrating the main approach
I've been thinking of taking. Just want to get this question asked and
answered before I go ahead and code more...

//Magnus

Index: src/backend/postmaster/bgwriter.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/postmaster/bgwriter.c,v
retrieving revision 1.36
diff -c -r1.36 bgwriter.c
*** src/backend/postmaster/bgwriter.c    17 Jan 2007 16:25:01 -0000    1.36
--- src/backend/postmaster/bgwriter.c    19 Mar 2007 19:38:27 -0000
***************
*** 119,124 ****
--- 119,125 ----

      int            num_requests;    /* current # of requests */
      int            max_requests;    /* allocated array size */
+     int64        buffers_written; /* number of buffers written */
      BgWriterRequest requests[1];    /* VARIABLE LENGTH ARRAY */
  } BgWriterShmemStruct;

***************
*** 427,433 ****
              last_checkpoint_time = now;
          }
          else
!             BgBufferSync();

          /*
           * Check for archive_timeout, if so, switch xlog files.  First we do a
--- 428,434 ----
              last_checkpoint_time = now;
          }
          else
!             BgWriterShmem->buffers_written += BgBufferSync();

          /*
           * Check for archive_timeout, if so, switch xlog files.  First we do a
Index: src/backend/storage/buffer/bufmgr.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v
retrieving revision 1.215
diff -c -r1.215 bufmgr.c
*** src/backend/storage/buffer/bufmgr.c    1 Feb 2007 19:10:27 -0000    1.215
--- src/backend/storage/buffer/bufmgr.c    19 Mar 2007 19:38:27 -0000
***************
*** 986,998 ****
   *
   * This is called periodically by the background writer process.
   */
! void
  BgBufferSync(void)
  {
      static int    buf_id1 = 0;
      int            buf_id2;
      int            num_to_scan;
      int            num_written;

      /* Make sure we can handle the pin inside SyncOneBuffer */
      ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
--- 986,999 ----
   *
   * This is called periodically by the background writer process.
   */
! int
  BgBufferSync(void)
  {
      static int    buf_id1 = 0;
      int            buf_id2;
      int            num_to_scan;
      int            num_written;
+     int            total_written = 0;

      /* Make sure we can handle the pin inside SyncOneBuffer */
      ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
***************
*** 1030,1035 ****
--- 1031,1037 ----
                      break;
              }
          }
+         total_written += num_written;
      }

      /*
***************
*** 1053,1059 ****
--- 1055,1064 ----
              if (++buf_id2 >= NBuffers)
                  buf_id2 = 0;
          }
+         total_written += num_written;
      }
+
+     return total_written;
  }

  /*
Index: src/include/storage/bufmgr.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/storage/bufmgr.h,v
retrieving revision 1.102
diff -c -r1.102 bufmgr.h
*** src/include/storage/bufmgr.h    5 Jan 2007 22:19:57 -0000    1.102
--- src/include/storage/bufmgr.h    19 Mar 2007 19:38:28 -0000
***************
*** 151,157 ****

  extern void BufmgrCommit(void);
  extern void BufferSync(void);
! extern void BgBufferSync(void);

  extern void AtProcExit_LocalBuffers(void);

--- 151,157 ----

  extern void BufmgrCommit(void);
  extern void BufferSync(void);
! extern int BgBufferSync(void);

  extern void AtProcExit_LocalBuffers(void);


Re: bgwriter stats

From
Neil Conway
Date:
Magnus Hagander wrote:
> Anyway. Attached patch adds this to the bgwriter shared memory. Is it
> safe to do this, and then just have a regular function running in a
> normal backend pulling out the value and returning it to the user,
> without locking?
If the variable is an int64, I don't believe so: the architecture might
not implement atomic read/writes of int64 values.
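
To make the concern concrete, here is a small single-threaded simulation of a torn read on a machine that moves a 64-bit value as two 32-bit halves. The `Split64` type and the `torn_read` helper are hypothetical names invented for this illustration; nothing like them exists in the patch, and the interleaving is forced by hand rather than produced by real concurrency.

```c
#include <stdint.h>

/* A 64-bit value as a 32-bit machine would store it: two separate words. */
typedef struct
{
    uint32_t lo;
    uint32_t hi;
} Split64;

static Split64
split64(uint64_t v)
{
    Split64 s;

    s.lo = (uint32_t) (v & 0xFFFFFFFFu);
    s.hi = (uint32_t) (v >> 32);
    return s;
}

static uint64_t
join64(uint32_t lo, uint32_t hi)
{
    return ((uint64_t) hi << 32) | lo;
}

/*
 * Simulate the unlucky interleaving: the reader loads the low word,
 * the writer then stores the whole new value, and only afterwards
 * does the reader load the high word.
 */
static uint64_t
torn_read(uint64_t old_value, uint64_t new_value)
{
    Split64  mem = split64(old_value);
    uint32_t lo = mem.lo;           /* reader: first half, old value */
    uint32_t hi;

    mem = split64(new_value);       /* writer: complete update */
    hi = mem.hi;                    /* reader: second half, new value */
    return join64(lo, hi);
}
```

When the counter steps from 0xFFFFFFFF to 0x100000000, this interleaving returns 0x1FFFFFFFF, a value the counter never held.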

-Neil


Re: bgwriter stats

From
Magnus Hagander
Date:
Neil Conway wrote:
> Magnus Hagander wrote:
>> Anyway. Attached patch adds this to the bgwriter shared memory. Is it
>> safe to do this, and then just have a regular function running in a
>> normal backend pulling out the value and returning it to the user,
>> without locking?
> If the variable is an int64, I don't believe so: the architecture might
> not implement atomic read/writes of int64 values.

Ok. But it should be safe if it's int32?

Actually, since it's just statistics data, it wouldn't be a problem that
it's not atomic, I think. If we're really unlucky, we'll get the wrong
value once. But most systems that poll such statistics can deal with
that, in my experience.

Then again, a normal int shouldn't be a problem either.

//Magnus

Re: bgwriter stats

From
Neil Conway
Date:
Magnus Hagander wrote:
> Ok. But it should be safe if it's int32?
>
You should probably use sig_atomic_t, to be safe. Although I believe
that read/writes to "int" are atomic on most platforms, in any case.

> Actually, since it's just statistics data, it wouldn't be a problem that
> it's not atomic, I think. If we're really unlucky, we'll get the wrong
> value once.
>
I don't think that's the right attitude to take, at all. Why not just
use a lock? It's not like the overhead will be noticeable.

Alternatively, you can get a consistent read from an int64 variable
using a sig_atomic_t counter, with a little thought. Off the top of my
head, something like the following should work: have the writer
increment the sig_atomic_t counter, adjust the int64 stats value, and
then increment the sig_atomic_t again. Have the reader save a local copy
of the sig_atomic_t counter aside, then read from the int64 counter, and
then recheck the sig_atomic_t counter. Repeat until the local pre-read
and post-read snapshots of the sig_atomic_t counter are identical.
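
A minimal sketch of the scheme just described, assuming a single writer. The `StatsSlot` struct and the `stats_add`/`stats_read` names are hypothetical and do not come from the patch. Two details go beyond the description above and are assumptions of this sketch: the reader also retries while the counter is odd (otherwise it could see the same odd value before and after a half-finished update), and the counter is masked to 7 bits so it stays within the range every `sig_atomic_t` is guaranteed to hold. The sketch contains no memory barriers, which real multiprocessor hardware would most likely need.

```c
#include <signal.h>
#include <stdint.h>

/* Hypothetical shared-memory slot; names are illustrative only. */
typedef struct
{
    volatile sig_atomic_t change_count;     /* odd while an update is in progress */
    volatile int64_t      buffers_written;  /* the value being protected */
} StatsSlot;

/* Writer side: must only ever be called by one process. */
static void
stats_add(StatsSlot *slot, int64_t delta)
{
    /* Mask to 0..127; 128 is even, so parity survives the wrap. */
    slot->change_count = (slot->change_count + 1) & 0x7F;   /* now odd */
    slot->buffers_written += delta;
    slot->change_count = (slot->change_count + 1) & 0x7F;   /* even again */
}

/* Reader side: retry until the counter is even and unchanged. */
static int64_t
stats_read(const StatsSlot *slot)
{
    sig_atomic_t before;
    sig_atomic_t after;
    int64_t      value;

    do
    {
        before = slot->change_count;
        value = slot->buffers_written;
        after = slot->change_count;
    } while (before != after || (before & 1) != 0);

    return value;
}
```

With a counter this narrow, a reader stalled for exactly 64 complete updates would see the same counter value again and accept a possibly torn read; a wider counter makes that less likely but does not remove it.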

-Neil


Re: bgwriter stats

From
Magnus Hagander
Date:
Neil Conway wrote:
> Magnus Hagander wrote:
>> Ok. But it should be safe if it's int32?
>>
> You should probably use sig_atomic_t, to be safe. Although I believe
> that read/writes to "int" are atomic on most platforms, in any case.

Ok. That's an easy enough change.


>> Actually, since it's just statistics data, it wouldn't be a problem that
>> it's not atomic, I think. If we're really unlucky, we'll get the wrong
>> value once.
>>
> I don't think that's the right attitude to take, at all. Why not just
> use a lock? It's not like the overhead will be noticeable.

Probably, but none of the other code appears to take a lock out on it :)


> Alternatively, you can get a consistent read from an int64 variable
> using a sig_atomic_t counter, with a little thought. Off the top of my
> head, something like the following should work: have the writer
> increment the sig_atomic_t counter, adjust the int64 stats value, and
> then increment the sig_atomic_t again. Have the reader save a local copy
> of the sig_atomic_t counter aside, then read from the int64 counter, and
> then recheck the sig_atomic_t counter. Repeat until the local pre-read
> and post-read snapshots of the sig_atomic_t counter are identical.

Thinking more about it, I think that's unnecessary. 32 bits is quite
enough - if you're graphing it (for example), those tools deal with
wraps already. They're usually made to deal with things like the number
of bytes on a router interface, which certainly exceeds 32 bits a lot
faster than our counter would.
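
As an illustration of how a polling tool can tolerate such wraps, here is a minimal sketch assuming the counter is exposed as an unsigned 32-bit value and wraps at most once between two polls. The function name `counter_delta` is hypothetical.

```c
#include <stdint.h>

/*
 * Number of events between two polls of a 32-bit counter.
 * Unsigned subtraction is defined modulo 2^32, so the result is
 * still correct when the counter wrapped once between the polls.
 */
static uint32_t
counter_delta(uint32_t previous, uint32_t current)
{
    return (uint32_t) (current - previous);
}
```

For example, a previous sample of 4294967290 followed by a current sample of 5 yields a delta of 11. If the counter can wrap more than once between polls, the delta is ambiguous and no arithmetic on the reader's side can recover it.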

But I'll take note of that for some time when I actually *need* a 64-bit
value.

//Magnus

Re: bgwriter stats

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> I want to be able to pull some stats out of the bgwriter to be able to
> track things. One thing is the total number of buffers written out.
> Other things are the "number of checkpoints" and such.

> Anyway. Attached patch adds this to the bgwriter shared memory. Is it
> safe to do this, and then just have a regular function running in a
> normal backend pulling out the value and returning it to the user,
> without locking? Given that only the bgwriter can write to it?

This seems quite a bizarre way to do things.  Why wouldn't you implement
this functionality by shipping messages to the stats collector?

            regards, tom lane

Re: bgwriter stats

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Neil Conway wrote:
>> I don't think that's the right attitude to take, at all. Why not just
>> use a lock? It's not like the overhead will be noticeable.

> Probably, but none of the other code appears to take a lock out on it :)

Huh?  It doesn't use a lock for touching the checkpoint counters, but
that's OK because they're sig_atomic_t.

            regards, tom lane

Re: bgwriter stats

From
Darcy Buskermolen
Date:
On Monday 19 March 2007 15:32, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > I want to be able to pull some stats out of the bgwriter to be able to
> > track things. One thing is the total number of buffers written out.
> > Other things are the "number of checkpoints" and such.
> >
> > Anyway. Attached patch adds this to the bgwriter shared memory. Is it
> > safe to do this, and then just have a regular function running in a
> > normal backend pulling out the value and returning it to the user,
> > without locking? Given that only the bgwriter can write to it?
>
> This seems quite a bizarre way to do things.  Why wouldn't you implement
> this functionality by shipping messages to the stats collector?

I'm with Tom on this one. All of our current stats are done via the stats
collector, and we should continue that way.  While we are on the subject of
stats, does anybody else feel there is merit in having block-level writes
tracked on a relation-by-relation basis?

>
>             regards, tom lane

Re: bgwriter stats

From
Neil Conway
Date:
Tom Lane wrote:
> This seems quite a bizarre way to do things.  Why wouldn't you implement
> this functionality by shipping messages to the stats collector?
>

Would that have any benefits over the shmem approach?

-Neil


Re: bgwriter stats

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> Tom Lane wrote:
>> This seems quite a bizarre way to do things.  Why wouldn't you implement
>> this functionality by shipping messages to the stats collector?

> Would that have any benefits over the shmem approach?

Well, for one thing, it would fit naturally into the existing stats
structure instead of being a wart on the side.  The problem of atomic
access to an int64 would go away, yet we'd still be able to keep a
running int64 total of the reports.  You wouldn't lose the total over a
shutdown/restart.  The value would obey the transactional-snapshot rules
we've established for stats output, making it safe to try to correlate
it with other stats.  Probably a few other things I'm not thinking of...
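
A rough sketch of the reporting pattern described above, assuming the sender counts privately and ships small deltas while a single collector keeps the 64-bit total. All names here (`BgWriterStatsMsg`, `CollectorTotals`, `count_buffers_written`, `make_report`, `collector_apply`) are hypothetical and are not the stats collector's actual API; message transport and persistence across restarts are left out.

```c
#include <stdint.h>

/* Hypothetical message: buffers written since the previous report. */
typedef struct
{
    int32_t buffers_written;
} BgWriterStatsMsg;

/* Hypothetical collector-side state, touched only by the collector. */
typedef struct
{
    int64_t buffers_written;
} CollectorTotals;

/* Sender's private counter; no other process ever reads it. */
static int32_t pending_buffers_written = 0;

/* Sender: note work done since the last report. */
static void
count_buffers_written(int32_t n)
{
    pending_buffers_written += n;
}

/* Sender: build a report from the pending delta and reset it. */
static BgWriterStatsMsg
make_report(void)
{
    BgWriterStatsMsg msg;

    msg.buffers_written = pending_buffers_written;
    pending_buffers_written = 0;
    return msg;
}

/* Collector: fold one report into the running 64-bit total. */
static void
collector_apply(CollectorTotals *totals, const BgWriterStatsMsg *msg)
{
    totals->buffers_written += msg->buffers_written;
}
```

Because only one process ever touches the 64-bit total, the atomicity question does not arise in this arrangement; readers would obtain the value through the collector's normal output path rather than from the sender's memory.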

            regards, tom lane

Re: bgwriter stats

From
Tom Lane
Date:
"Magnus Hagander" <magnus@hagander.net> writes:
>> This seems quite a bizarre way to do things.  Why wouldn't you implement
>> this functionality by shipping messages to the stats collector?

> Would you suggest doing the same with the checkpoint counter, that's
> already in shared mem? I want to expose that number as well.

The shared-mem checkpoint counters serve an entirely different purpose:
they're there to let backends detect when their requested checkpoint has
been completed.  They're not intended to count checkpoints over any
long term (remember that sig_atomic_t need only be 8 bits wide).

If you want to track stats about how many checkpoints have been done,
I think that's a job for the stats collector.

            regards, tom lane