Re: Access statistics - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: Access statistics
Date
Msg-id 200106011331.f51DV5P02078@jupiter.us.greatbridge.com
Whole thread Raw
In response to Re: Access statistics  (Peter Eisentraut <peter_e@gmx.net>)
Responses Re: Access statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Peter Eisentraut wrote:
> Jan Wieck writes:
>
> >     In  the  next  couple of hours (at least tomorrow) I would be
> >     ready to commit the backend changes  for  table-/index-access
> >     statistics and current backend activity views.
> >
> >     Should  I  apply  the patches or provide a separate patch for
> >     review first?
>
> Maybe you could describe what it's going to do and how it's going to work.
   Real  programmers  don't comment - if it was hard to write it   should be hard to read :-)
   So outing myself not beeing a *real programmer*, this is what   I have so far:
   * On startup the postmaster creates an INET domain UDP socket     and bind(2)'s it to localhost:0, meaning  the
kernel will     assign  a  yet unused, unprivileged port that could be seen     with getsockaddr(2).
 
     It then starts two background processes  of  which  one  is     simply a wraparound buffer doing recvfrom(2) on
thesocket,     checking that the source address of the received packets is     the  sockets  own  address (!) and
forwardingapproved ones     over a pipe to the second one, discribed later.
 
   * Backends call some collector functions  at  various  places     now  (these  will  finally  be macros), that count
uptable     scans, tuples returned by scans,  buffer  fetches/hits  and     the like. At the beginning of a statement
thebackends send     a message telling the first couple of hundred bytes of  the     querystring  and  after  the
statementis done (just before     getting ready for the next command) they send the collected     access numbers.
 
     Tables,  indexes etc. in these statistics are identified by     OID, so the data doesn't tell much so far.
   * The second  background  process  fired  by  the  postmaster     collects  these  numbers  into hashtables and as
longas it     receives messages, it'll write out a summary file every 500     or  so  milliseconds,  telling a snapshot
ofcurrent stats.     Using select(2) with timeouts ensures that a complete  idle     DB  instance not to waste a single
CPUcycle or IO to write     these snapshots.
 
     On startup it tries to read the last snapshot file  in,  so     the collected statistics survive a postmaster
restart.
     Vacuum  reads  the  file too and sends bulk delete messages     for  objects  that  are  gone.  So  the  stats
don't grow     infinitely.
 
   * A  bunch  of  new  builtin  functions  gain  access  to the     snapshot file. At first call  of  one  of  these
functions    during  a  transaction,  the  backend will read the current     file and return the numbers from in memory
then.
     Based on these functions a couple of views can  tell  these     collected  stats.  Information  from  the
databasessystem     catalog is of course required to identify  the  objects  in     the  stats,  but  I think those
informationsshould only be     visible to someone who identified herself  as  a  valid  DB     user anyway.
 
     The  visibility  of  querystrings  (this  info is available     cross DB) is restricted to DB superusers.
   There has been discussion already about  using  an  INET  vs.   UNIX  UDP  socket for the communication. At least
forLinux I   found INET to be the most effective way of communication. And   for  security  concerns: If someone else
thanroot can really   send packets to that  socket  that  show  up  with  a  source   address  of  127.0.0.1:n,  where
n is a portnumber actually   occupied by your own socket, be sure you'll have more  severe   problems than modified
accessstatistics.
 
   The views should be considered examples. The final naming and   layout is subject for discussion.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com



pgsql-hackers by date:

Previous
From: Michael Samuel
Date:
Subject: Re: Re: Support for %TYPE in CREATE FUNCTION
Next
From: Tom Lane
Date:
Subject: Re: ERROR: cache lookup for proc 43030134 failed