Re: Performance monitor signal handler - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: Performance monitor signal handler
Date
Msg-id 200103191804.NAA20163@jupiter.jw.home
In response to Re: Performance monitor signal handler  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Performance monitor signal handler
List pgsql-hackers
Bruce Momjian wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > Only shared memory gives us near-zero cost for write/read.  99% of
> > > backends will not be using stats, so it has to be cheap.
> >
> > Not with a circular buffer it's not cheap, because you need interlocking
> > on writes.  Your claim that you can get away without that is simply
> > false.  You won't just get lost messages, you'll get corrupted messages.
>
> How do I get corrupt messages if they are all five bytes?  If I write
> five bytes, and another does the same, I guess the assembler could
> intersperse the writes so the oid gets to be a corrupt value.  Any cheap
> way around this, perhaps by skipping/clearing the write on a collision?
>
> >
> > > The collector program can read the shared memory stats and keep hashed
> > > values of accumulated stats.  It uses the "Loops" variable to know if it
> > > has read the current information in the buffer.
> >
> > And how does it sleep until the counter has been advanced?  Seems to me
> > it has to busy-wait (bad) or sleep (worse; if the minimum sleep delay
> > is 10 ms then it's guaranteed to miss a lot of data under load).
>
> I figured it could just wake up every few seconds and check.  It will
> remember the loop counter and current pointer, and read any new
> information.  I was thinking of a 20k buffer, which could cover about 4k
> events.
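
   Before I get to my real question, let me make sure I read the
   mechanism right. Roughly the below is what I picture from the
   above -- only a sketch I typed up for this mail, every name in it
   is made up, and it uses C11-style atomics as a stand-in for
   whatever TAS/spinlock the backend would really need for the
   interlocking being argued about:

/*
 * Sketch only -- nothing from the tree.  Fixed-size messages in a
 * shared memory ring; writers reserve a slot by bumping a counter,
 * and the collector polls the same counter to see how far behind it
 * is (and whether it has been lapped).
 */
#include <stdatomic.h>
#include <stdint.h>

#define STAT_RING_SLOTS 4096            /* ~20k buffer / ~5-byte messages */

typedef struct StatMsg
{
    uint32_t    objid;                  /* Oid of the table or index */
    uint8_t     event;                  /* 1-byte event code */
} StatMsg;

typedef struct StatRing
{
    atomic_uint next;                   /* ever-increasing slot counter */
    StatMsg     slot[STAT_RING_SLOTS];
} StatRing;

/* Writer side: reserve a slot atomically, then fill it in. */
static void
stat_ring_write(StatRing *ring, uint32_t objid, uint8_t event)
{
    unsigned    n = atomic_fetch_add(&ring->next, 1);
    StatMsg    *m = &ring->slot[n % STAT_RING_SLOTS];

    /*
     * No two backends get the same slot, so one message's bytes can't
     * be interleaved with another's.  The reader can still catch a
     * slot between the reservation and the stores below, and a slow
     * writer can be lapped -- a real implementation needs a per-slot
     * valid flag or lock on top of this.
     */
    m->objid = objid;
    m->event = event;
}

/* Collector side: wake up every few seconds and consume what's new. */
static unsigned
stat_ring_read(StatRing *ring, unsigned last_seen,
               void (*account) (const StatMsg *))
{
    unsigned    now = atomic_load(&ring->next);

    if (now - last_seen > STAT_RING_SLOTS)
        last_seen = now - STAT_RING_SLOTS;      /* lapped: events were lost */

    while (last_seen != now)
        account(&ring->slot[last_seen++ % STAT_RING_SLOTS]);

    return last_seen;
}
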
   Here I wonder what your EVENT is. With an Oid as identifier and a
   1-byte event code (even if it'd be another 32-bit value), how many
   messages do you want to generate to get these statistics (roughly
   the event set I sketch right after this list):

   -   Number of sequential scans done per table.
   -   Number of tuples returned via sequential scans per table.
   -   Number of buffer cache lookups done through sequential scans
       per table.
   -   Number of buffer cache hits for sequential scans per table.
   -   Number of tuples inserted per table.
   -   Number of tuples updated per table.
   -   Number of tuples deleted per table.
   -   Number of index scans done per index.
   -   Number of index tuples returned per index.
   -   Number of buffer cache lookups done due to scans per index.
   -   Number of buffer cache hits per index.
   -   Number of valid heap tuples returned via index scan per index.
   -   Number of buffer cache lookups done for heap fetches via index
       scan per index.
   -   Number of buffer cache hits for heap fetches via index scan
       per index.
   -   Number of buffer cache lookups not accountable for any of the
       above.
   -   Number of buffer cache hits not accountable for any of the
       above.
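
   To make that concrete (sketch again, names invented on the spot),
   the 1-byte event code in the StatMsg above would have to cover
   roughly this set:

/* One possible encoding of the per-table/per-index events listed
 * above.  Illustration only; none of these names exist anywhere. */
typedef enum StatEvent
{
    /* per table */
    EV_SEQSCAN,                     /* sequential scan started */
    EV_SEQSCAN_TUPLE,               /* tuple returned by a seq scan */
    EV_SEQSCAN_BLK_LOOKUP,          /* buffer cache lookup in a seq scan */
    EV_SEQSCAN_BLK_HIT,             /* buffer cache hit in a seq scan */
    EV_TUPLE_INSERT,
    EV_TUPLE_UPDATE,
    EV_TUPLE_DELETE,

    /* per index */
    EV_IDXSCAN,                     /* index scan started */
    EV_IDXSCAN_TUPLE,               /* index tuple returned */
    EV_IDXSCAN_BLK_LOOKUP,          /* buffer cache lookup in an index scan */
    EV_IDXSCAN_BLK_HIT,             /* buffer cache hit in an index scan */
    EV_IDX_HEAPFETCH_TUPLE,         /* valid heap tuple fetched via index */
    EV_IDX_HEAPFETCH_BLK_LOOKUP,    /* buffer lookup for such a heap fetch */
    EV_IDX_HEAPFETCH_BLK_HIT,       /* buffer hit for such a heap fetch */

    /* not accountable to any table or index scan */
    EV_OTHER_BLK_LOOKUP,
    EV_OTHER_BLK_HIT
} StatEvent;

   Sixteen events per Oid, many of them fired once per tuple or per
   buffer lookup -- that's the message volume I'm worried about.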
 
   What I see is that there's a difference in what we two want to see
   in the statistics. You're talking about looking at the actual query
   string and such. That's information useful for someone actually
   looking at a server, to see what a particular backend is doing. On
   my notebook a parallel regression test (containing >4,000 queries)
   passes by in under 1:30; that's more than 40 queries per second, so
   that doesn't tell me much.
   What I'm after is to collect the above data over a week or so and
   then generate a report to identify the hot spots of the schema:
   which tables/indices cause the most disk I/O, and what's the
   average percentage of tuples returned in scans (not from the query
   as a whole, I mean from the single scan inside of the joins).
   That's the information I need to know where to look for possibly
   better qualifications, useless indices that aren't worth
   maintaining, and the like.
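
   The collector side of that is cheap: a hash table of per-Oid
   counters plus a couple of derived ratios for the report. Roughly
   (once more a made-up sketch, not code from anywhere):

#include <stdint.h>

/* Accumulated by the collector only; backends never touch this. */
typedef struct TableStats
{
    uint32_t    objid;              /* table Oid, hash key */
    uint64_t    seqscans;           /* EV_SEQSCAN */
    uint64_t    seqscan_tuples;     /* EV_SEQSCAN_TUPLE */
    uint64_t    blk_lookups;        /* EV_SEQSCAN_BLK_LOOKUP */
    uint64_t    blk_hits;           /* EV_SEQSCAN_BLK_HIT */
    double      reltuples;          /* snapshot of pg_class.reltuples */
} TableStats;

/* What percentage of the table does an average scan actually return? */
static double
avg_scan_selectivity(const TableStats *ts)
{
    if (ts->seqscans == 0 || ts->reltuples <= 0)
        return 0.0;
    return 100.0 * ((double) ts->seqscan_tuples / (double) ts->seqscans)
                 / ts->reltuples;
}

/* A low hit ratio over a week points at the tables causing real disk I/O. */
static double
cache_hit_ratio(const TableStats *ts)
{
    if (ts->blk_lookups == 0)
        return 0.0;
    return 100.0 * (double) ts->blk_hits / (double) ts->blk_lookups;
}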


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



