Thread: Access statistics

Access statistics

From
Jan Wieck
Date:
One more feature for discussion :-)
   In  the  next  couple of hours (at least tomorrow) I would be   ready to commit the backend changes  for
table-/index-access  statistics and current backend activity views.
 
   Should  I  apply  the patches or provide a separate patch for   review first?


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com



Re: Access statistics

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
>     In  the  next  couple of hours (at least tomorrow) I would be
>     ready to commit the backend changes  for  table-/index-access
>     statistics and current backend activity views.
>     Should  I  apply  the patches or provide a separate patch for
>     review first?

Considering that you've not offered any detailed information about
what you plan to do (AFAIR), a patch for review first would be polite ...
        regards, tom lane


Re: Access statistics

From
Bruce Momjian
Date:
> One more feature for discussion :-)
> 
>     In  the  next  couple of hours (at least tomorrow) I would be
>     ready to commit the backend changes  for  table-/index-access
>     statistics and current backend activity views.
> 
>     Should  I  apply  the patches or provide a separate patch for
>     review first?

I like doing a cvs diff -c and throwing the patch to PATCHES just before
commit.  That way, people can see may changes easier and find problems. 
Of course, if I am at all unsure, I post to patches and wait 2 days.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Access statistics

From
Jan Wieck
Date:
Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
> >     In  the  next  couple of hours (at least tomorrow) I would be
> >     ready to commit the backend changes  for  table-/index-access
> >     statistics and current backend activity views.
> >     Should  I  apply  the patches or provide a separate patch for
> >     review first?
>
> Considering that you've not offered any detailed information about
> what you plan to do (AFAIR), a patch for review first would be polite ...
   We  had that discussion a couple of weeks ago down to if it's   better to use UNIX or INET domain UDP sockets. But I
expected  it  not  to  be detailed enough for our current quality level   :-)
 
   It's incomplete anyway since the per  database  configuration   in  pg_database  is  missing and some other details
Ineed to   tidy up. So I'll wrap up a patch tomorrow.
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com



Re: Access statistics

From
Peter Eisentraut
Date:
Jan Wieck writes:

>     In  the  next  couple of hours (at least tomorrow) I would be
>     ready to commit the backend changes  for  table-/index-access
>     statistics and current backend activity views.
>
>     Should  I  apply  the patches or provide a separate patch for
>     review first?

Maybe you could describe what it's going to do and how it's going to work.

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter



Re: Access statistics

From
Jan Wieck
Date:
Peter Eisentraut wrote:
> Jan Wieck writes:
>
> >     In  the  next  couple of hours (at least tomorrow) I would be
> >     ready to commit the backend changes  for  table-/index-access
> >     statistics and current backend activity views.
> >
> >     Should  I  apply  the patches or provide a separate patch for
> >     review first?
>
> Maybe you could describe what it's going to do and how it's going to work.
   Real  programmers  don't comment - if it was hard to write it   should be hard to read :-)
   So outing myself not beeing a *real programmer*, this is what   I have so far:
   * On startup the postmaster creates an INET domain UDP socket     and bind(2)'s it to localhost:0, meaning  the
kernel will     assign  a  yet unused, unprivileged port that could be seen     with getsockaddr(2).
 
     It then starts two background processes  of  which  one  is     simply a wraparound buffer doing recvfrom(2) on
thesocket,     checking that the source address of the received packets is     the  sockets  own  address (!) and
forwardingapproved ones     over a pipe to the second one, discribed later.
 
   * Backends call some collector functions  at  various  places     now  (these  will  finally  be macros), that count
uptable     scans, tuples returned by scans,  buffer  fetches/hits  and     the like. At the beginning of a statement
thebackends send     a message telling the first couple of hundred bytes of  the     querystring  and  after  the
statementis done (just before     getting ready for the next command) they send the collected     access numbers.
 
     Tables,  indexes etc. in these statistics are identified by     OID, so the data doesn't tell much so far.
   * The second  background  process  fired  by  the  postmaster     collects  these  numbers  into hashtables and as
longas it     receives messages, it'll write out a summary file every 500     or  so  milliseconds,  telling a snapshot
ofcurrent stats.     Using select(2) with timeouts ensures that a complete  idle     DB  instance not to waste a single
CPUcycle or IO to write     these snapshots.
 
     On startup it tries to read the last snapshot file  in,  so     the collected statistics survive a postmaster
restart.
     Vacuum  reads  the  file too and sends bulk delete messages     for  objects  that  are  gone.  So  the  stats
don't grow     infinitely.
 
   * A  bunch  of  new  builtin  functions  gain  access  to the     snapshot file. At first call  of  one  of  these
functions    during  a  transaction,  the  backend will read the current     file and return the numbers from in memory
then.
     Based on these functions a couple of views can  tell  these     collected  stats.  Information  from  the
databasessystem     catalog is of course required to identify  the  objects  in     the  stats,  but  I think those
informationsshould only be     visible to someone who identified herself  as  a  valid  DB     user anyway.
 
     The  visibility  of  querystrings  (this  info is available     cross DB) is restricted to DB superusers.
   There has been discussion already about  using  an  INET  vs.   UNIX  UDP  socket for the communication. At least
forLinux I   found INET to be the most effective way of communication. And   for  security  concerns: If someone else
thanroot can really   send packets to that  socket  that  show  up  with  a  source   address  of  127.0.0.1:n,  where
n is a portnumber actually   occupied by your own socket, be sure you'll have more  severe   problems than modified
accessstatistics.
 
   The views should be considered examples. The final naming and   layout is subject for discussion.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com



Re: Access statistics

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
>     So outing myself not beeing a *real programmer*, this is what
>     I have so far:

Hmm ... what is the performance of all this like?  Seems like a lot
of overhead.  Can it be turned off?

>     * Backends call some collector functions  at  various  places
>       now  (these  will  finally  be macros), that count up table
>       scans, tuples returned by scans,  buffer  fetches/hits  and
>       the like.

Have you removed the existing stats-gathering support
(backend/access/heap/stats.c and so on)?   That would buy back
at least a few of the cycles involved ...
        regards, tom lane


Re: Access statistics

From
Jan Wieck
Date:
Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
> >     So outing myself not beeing a *real programmer*, this is what
> >     I have so far:
>
> Hmm ... what is the performance of all this like?  Seems like a lot
> of overhead.  Can it be turned off?
   Current  performance loss is about 2-4% wallclock measured. I   expect it to become better when turning some of the
functions  into macros.
 
   The  plan is to add another column to pg_database that can be   used to turn it on/off on a per database level.
Backendsjust   decide  at startup if they collect and send for their session   lifetime.
 

> >     * Backends call some collector functions  at  various  places
> >       now  (these  will  finally  be macros), that count up table
> >       scans, tuples returned by scans,  buffer  fetches/hits  and
> >       the like.
>
> Have you removed the existing stats-gathering support
> (backend/access/heap/stats.c and so on)?   That would buy back
> at least a few of the cycles involved ...
   Not sure if we really should.  Let's  later  decide  if  it's   really obsolete.

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com



Re: Access statistics

From
Tom Lane
Date:
Jan Wieck <JanWieck@yahoo.com> writes:
> Tom Lane wrote:
>> Have you removed the existing stats-gathering support
>> (backend/access/heap/stats.c and so on)?   That would buy back
>> at least a few of the cycles involved ...

>     Not sure if we really should.  Let's  later  decide  if  it's
>     really obsolete.

Considering that Bruce long ago ifdef'd out all the code that could
actually *do* anything with those stats (like print them), I'd say
it's obsolete.  In any case, it's too confusing to have two sets of
stats-gathering code in there.  I vote for getting rid of the old
stuff.
        regards, tom lane


RE: Access statistics

From
Mike Mascari
Date:
For what its worth,

Oracle has a nice feature for resource management called PROFILEs:

CREATE PROFILE profile LIMIT
[ SESSION_PER_USER [ session_limit | UNLIMITED | DEFAULT ] ]
[ CPU_PER_SESSION cpu_session_limit | UNLIMITED | DEFAULT ] ]
[ CPU_PER_CALL cpu_call_limit | UNLIMITED | DEFAULT ] ]
[ CONNECT_TIME connect_limit | UNLIMITED | DEFAULT ] ]
[ IDLE_TIME idle_limit | UNLIMITED | DEFAULT ] ]        
[ LOGICAL READS_PER_SESSION read_session_limit | UNLIMITED | DEFAULT 
] ]        
[ LOGICAL READS_PER_CALL read_call_limit | UNLIMITED | DEFAULT ] ]        
[ PRIVATE_SGA memory_limit | UNLIMITED | DEFAULT ] ]        
[ COMPOSITE_LIMIT resource_cost_limit | UNLIMITED | DEFAULT ] ]

which limits things like CPU_PER_CALL and LOGICAL_READS_PER_SESSION 
to a profile. The ALTER USER command then allows you to assign a 
PROFILE to a user. This is really nice, since you can prevent 
run-away queries from denying service by the database to other users. 
It also can prevent a user from soaking up all of the available 
connections. You must set a flag in your initSID.ora configuration 
profile for ORACLE to support profiles. Since Jan is collecting these 
statistics any way (if the appropriate configuration flag is set), it 
would be pretty trivial to implement PROFILEs in PostgreSQL.

Mike Mascari
mascarm@mascari.com

-----Original Message-----
From:    Tom Lane [SMTP:tgl@sss.pgh.pa.us]

Jan Wieck <JanWieck@Yahoo.com> writes:
>     So outing myself not beeing a *real programmer*, this is what
>     I have so far:

Hmm ... what is the performance of all this like?  Seems like a lot
of overhead.  Can it be turned off?

>     * Backends call some collector functions  at  various  places
>       now  (these  will  finally  be macros), that count up table
>       scans, tuples returned by scans,  buffer  fetches/hits  and
>       the like.

Have you removed the existing stats-gathering support
(backend/access/heap/stats.c and so on)?   That would buy back
at least a few of the cycles involved ...
        regards, tom lane



Re: Access statistics

From
Bruce Momjian
Date:
> Tom Lane wrote:
> > Jan Wieck <JanWieck@Yahoo.com> writes:
> > >     So outing myself not beeing a *real programmer*, this is what
> > >     I have so far:
> >
> > Hmm ... what is the performance of all this like?  Seems like a lot
> > of overhead.  Can it be turned off?
> 
>     Current  performance loss is about 2-4% wallclock measured. I
>     expect it to become better when turning some of the functions
>     into macros.

At 2-4%, I assume it is not enabled by default.  I can see the query
string part being enabled by default though.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Access statistics

From
Bruce Momjian
Date:
> Jan Wieck <JanWieck@yahoo.com> writes:
> > Tom Lane wrote:
> >> Have you removed the existing stats-gathering support
> >> (backend/access/heap/stats.c and so on)?   That would buy back
> >> at least a few of the cycles involved ...
> 
> >     Not sure if we really should.  Let's  later  decide  if  it's
> >     really obsolete.
> 
> Considering that Bruce long ago ifdef'd out all the code that could
> actually *do* anything with those stats (like print them), I'd say
> it's obsolete.  In any case, it's too confusing to have two sets of
> stats-gathering code in there.  I vote for getting rid of the old
> stuff.

I agree.  Rip away.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026