Re: Review: DTrace probes (merged version) - Mailing list pgsql-hackers

From Robert Lor
Subject Re: Review: DTrace probes (merged version)
Date
Msg-id 4884E7CC.60208@sun.com
Whole thread Raw
In response to Review: DTrace probes (merged version)  (Zdenek Kotala <Zdenek.Kotala@Sun.COM>)
Responses Re: Review: DTrace probes (merged version)
Re: Review: DTrace probes (merged version) ver_03
List pgsql-hackers
Apologies for the delayed response - vacation, travel, etc got in the way!


Zdenek Kotala wrote:
> I performed review of merged patch from Robert Treat. At first point
> the patch does not work (SunOS 5.11 snv_86 sun4u sparc
> SUNW,Sun-Fire-V240)

The attached patch fixed the regression test errors.
>
>
> However I went through a code and I have following comments:
>
> 1) Naming convention:
>
>  - Some probes uses "*end", some "*done". It would be good to select
> one name.
Noted. Will use <name>-done per the convention. This change will be
included in an updated patch later since I think there are a number of
other changes that need to be made.
>  - I prefer to use clog instead of slru in probes name. clog is widely
> known.
I think slru- is okay per your subsequent emails.
>  - It seems to me better to have checkpoint-clog...,
> checkpoint-subtrans instead of clog-checkpoint.
Yes, I was thinking about this too, but the current names are more
consistent with the others. For example:

buffer-checkpoint, buffer-*
xlog-checkpoint, xlog-*
>  - buffer-flush was originally dirty-buffer-write-start. I prefer
> Robert Lor's naming.
Actually, I made this change so the name is consistent with the other
buffer-* probes.
>
> 2) storage read write probes
>
> smgr-read*, smgr-writes probes are in md.c. I personally think it make
> more sense to put them into smgr.c. Only advantage to have it in md.c
> is that actual amount of bytes is possible to monitor.
The current probes return number of bytes, that's why they are in md.c
>
> 3) query rewrite probe
>
> There are several probes for query measurement but query rewrite phase
> is missing. See analyze_and_rewrite or pg_parse_and_rewrite
I believe the rewrite time is accounted for in the query-plan probe.
Need to double check on this.
>
> 4) autovacuum_start
>
> Autovacuum_start probe is alone. I propose following probes for
> completeness:
>
> proc-autovacuum-start
> proc-autovacuum-stop
> proc-bgwriter-start
> proc-bgwriter-stop
> proc-backend-start
> proc-backend-stop
> proc-master-start
> proc-master-stop
Saw a number of emails on this. Will get back on this.
>
> 5) Need explain of usage:
>
> I have some doubts about following probes. Could you please explain
> usage of them? example dtrace script is welcome
>
>  - all exec-* probes
>  - mark-dirty, local-mark-dirty
>
Theo/Robert, do you guys have any sample scripts on the probes you added.
>
> 6) several comments about placement:
>
> I published patch on http://reviewdemo.postgresql.org/r/25/. I added
> several comments there.
>
> 7) SLRU/CLOG
>
> SLRU probes could be return more info. For example if page was in
> buffer or if physical write is not necessary and so on.
Yes, more info could be returned if we can identify specific use cases
that the new data will enable.

--
Robert Lor           Sun Microsystems
Austin, USA          http://sun.com/postgresql

Index: src/backend/access/transam/clog.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/clog.c,v
retrieving revision 1.46
diff -u -3 -p -r1.46 clog.c
--- src/backend/access/transam/clog.c    1 Jan 2008 19:45:46 -0000    1.46
+++ src/backend/access/transam/clog.c    21 Jul 2008 18:14:48 -0000
@@ -36,6 +36,7 @@
 #include "access/slru.h"
 #include "access/transam.h"
 #include "postmaster/bgwriter.h"
+#include "pg_trace.h"

 /*
  * Defines for CLOG page sizes.  A page is the same BLCKSZ as is used
@@ -323,7 +324,9 @@ void
 CheckPointCLOG(void)
 {
     /* Flush dirty CLOG pages to disk */
+    TRACE_POSTGRESQL_CLOG_CHECKPOINT_START();
     SimpleLruFlush(ClogCtl, true);
+    TRACE_POSTGRESQL_CLOG_CHECKPOINT_DONE();
 }


Index: src/backend/access/transam/multixact.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/multixact.c,v
retrieving revision 1.27
diff -u -3 -p -r1.27 multixact.c
--- src/backend/access/transam/multixact.c    1 Jan 2008 19:45:46 -0000    1.27
+++ src/backend/access/transam/multixact.c    21 Jul 2008 18:21:58 -0000
@@ -57,6 +57,7 @@
 #include "storage/lmgr.h"
 #include "utils/memutils.h"
 #include "storage/procarray.h"
+#include "pg_trace.h"


 /*
@@ -1526,6 +1527,8 @@ MultiXactGetCheckptMulti(bool is_shutdow
 void
 CheckPointMultiXact(void)
 {
+    TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_START();
+
     /* Flush dirty MultiXact pages to disk */
     SimpleLruFlush(MultiXactOffsetCtl, true);
     SimpleLruFlush(MultiXactMemberCtl, true);
@@ -1540,6 +1543,8 @@ CheckPointMultiXact(void)
      */
     if (!InRecovery)
         TruncateMultiXact();
+
+    TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_DONE();
 }

 /*
Index: src/backend/access/transam/slru.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/slru.c,v
retrieving revision 1.44
diff -u -3 -p -r1.44 slru.c
--- src/backend/access/transam/slru.c    1 Jan 2008 19:45:48 -0000    1.44
+++ src/backend/access/transam/slru.c    21 Jul 2008 18:22:00 -0000
@@ -57,6 +57,7 @@
 #include "storage/fd.h"
 #include "storage/shmem.h"
 #include "miscadmin.h"
+#include "pg_trace.h"


 /*
@@ -372,6 +373,8 @@ SimpleLruReadPage(SlruCtl ctl, int pagen
 {
     SlruShared    shared = ctl->shared;

+    TRACE_POSTGRESQL_SLRU_READPAGE_START((uintptr_t)ctl, pageno, write_ok, xid);
+
     /* Outer loop handles restart if we must wait for someone else's I/O */
     for (;;)
     {
@@ -399,6 +402,7 @@ SimpleLruReadPage(SlruCtl ctl, int pagen
             }
             /* Otherwise, it's ready to use */
             SlruRecentlyUsed(shared, slotno);
+            TRACE_POSTGRESQL_SLRU_READPAGE_DONE(slotno);
             return slotno;
         }

@@ -446,6 +450,7 @@ SimpleLruReadPage(SlruCtl ctl, int pagen
             SlruReportIOError(ctl, pageno, xid);

         SlruRecentlyUsed(shared, slotno);
+        TRACE_POSTGRESQL_SLRU_READPAGE_DONE(slotno);
         return slotno;
     }
 }
@@ -470,6 +475,8 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl,
     SlruShared    shared = ctl->shared;
     int            slotno;

+    TRACE_POSTGRESQL_SLRU_READPAGE_READONLY((uintptr_t)ctl, pageno, xid);
+
     /* Try to find the page while holding only shared lock */
     LWLockAcquire(shared->ControlLock, LW_SHARED);

@@ -511,6 +518,8 @@ SimpleLruWritePage(SlruCtl ctl, int slot
     int            pageno = shared->page_number[slotno];
     bool        ok;

+    TRACE_POSTGRESQL_SLRU_WRITEPAGE_START((uintptr_t)ctl, pageno, slotno);
+
     /* If a write is in progress, wait for it to finish */
     while (shared->page_status[slotno] == SLRU_PAGE_WRITE_IN_PROGRESS &&
            shared->page_number[slotno] == pageno)
@@ -525,7 +534,10 @@ SimpleLruWritePage(SlruCtl ctl, int slot
     if (!shared->page_dirty[slotno] ||
         shared->page_status[slotno] != SLRU_PAGE_VALID ||
         shared->page_number[slotno] != pageno)
+    {
+        TRACE_POSTGRESQL_SLRU_WRITEPAGE_DONE();
         return;
+    }

     /*
      * Mark the slot write-busy, and clear the dirtybit.  After this point, a
@@ -569,6 +581,8 @@ SimpleLruWritePage(SlruCtl ctl, int slot
     /* Now it's okay to ereport if we failed */
     if (!ok)
         SlruReportIOError(ctl, pageno, InvalidTransactionId);
+
+    TRACE_POSTGRESQL_SLRU_WRITEPAGE_DONE();
 }

 /*
@@ -593,6 +607,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int pa

     SlruFileName(ctl, path, segno);

+    TRACE_POSTGRESQL_SLRU_READPAGE_PHYSICAL_START((uintptr_t)ctl, path, pageno, slotno);
     /*
      * In a crash-and-restart situation, it's possible for us to receive
      * commands to set the commit status of transactions whose bits are in
@@ -607,6 +622,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int pa
         {
             slru_errcause = SLRU_OPEN_FAILED;
             slru_errno = errno;
+            TRACE_POSTGRESQL_SLRU_READPAGE_PHYSICAL_DONE(false, slru_errcause, slru_errno);
             return false;
         }

@@ -614,6 +630,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int pa
                 (errmsg("file \"%s\" doesn't exist, reading as zeroes",
                         path)));
         MemSet(shared->page_buffer[slotno], 0, BLCKSZ);
+        TRACE_POSTGRESQL_SLRU_READPAGE_PHYSICAL_DONE(true, slru_errcause, slru_errno);
         return true;
     }

@@ -622,6 +639,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int pa
         slru_errcause = SLRU_SEEK_FAILED;
         slru_errno = errno;
         close(fd);
+        TRACE_POSTGRESQL_SLRU_READPAGE_PHYSICAL_DONE(false, slru_errcause, slru_errno);
         return false;
     }

@@ -631,6 +649,7 @@ SlruPhysicalReadPage(SlruCtl ctl, int pa
         slru_errcause = SLRU_READ_FAILED;
         slru_errno = errno;
         close(fd);
+        TRACE_POSTGRESQL_SLRU_READPAGE_PHYSICAL_DONE(false, slru_errcause, slru_errno);
         return false;
     }

@@ -638,9 +657,11 @@ SlruPhysicalReadPage(SlruCtl ctl, int pa
     {
         slru_errcause = SLRU_CLOSE_FAILED;
         slru_errno = errno;
+        TRACE_POSTGRESQL_SLRU_READPAGE_PHYSICAL_DONE(false, slru_errcause, slru_errno);
         return false;
     }

+    TRACE_POSTGRESQL_SLRU_READPAGE_PHYSICAL_DONE(true, slru_errcause, slru_errno);
     return true;
 }

@@ -668,6 +689,8 @@ SlruPhysicalWritePage(SlruCtl ctl, int p
     char        path[MAXPGPATH];
     int            fd = -1;

+    TRACE_POSTGRESQL_SLRU_WRITEPAGE_PHYSICAL_START((uintptr_t)ctl, pageno, slotno);
+
     /*
      * Honor the write-WAL-before-data rule, if appropriate, so that we do not
      * write out data before associated WAL records.  This is the same action
@@ -753,6 +776,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int p
         {
             slru_errcause = SLRU_OPEN_FAILED;
             slru_errno = errno;
+            TRACE_POSTGRESQL_SLRU_WRITEPAGE_PHYSICAL_DONE(false, slru_errcause, slru_errno);
             return false;
         }

@@ -781,6 +805,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int p
         slru_errno = errno;
         if (!fdata)
             close(fd);
+        TRACE_POSTGRESQL_SLRU_WRITEPAGE_PHYSICAL_DONE(false, slru_errcause, slru_errno);
         return false;
     }

@@ -794,6 +819,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int p
         slru_errno = errno;
         if (!fdata)
             close(fd);
+        TRACE_POSTGRESQL_SLRU_WRITEPAGE_PHYSICAL_DONE(false, slru_errcause, slru_errno);
         return false;
     }

@@ -808,6 +834,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int p
             slru_errcause = SLRU_FSYNC_FAILED;
             slru_errno = errno;
             close(fd);
+            TRACE_POSTGRESQL_SLRU_WRITEPAGE_PHYSICAL_DONE(false, slru_errcause, slru_errno);
             return false;
         }

@@ -815,10 +842,12 @@ SlruPhysicalWritePage(SlruCtl ctl, int p
         {
             slru_errcause = SLRU_CLOSE_FAILED;
             slru_errno = errno;
+            TRACE_POSTGRESQL_SLRU_WRITEPAGE_PHYSICAL_DONE(false, slru_errcause, slru_errno);
             return false;
         }
     }

+    TRACE_POSTGRESQL_SLRU_WRITEPAGE_PHYSICAL_DONE(true, slru_errcause, slru_errno);
     return true;
 }

Index: src/backend/access/transam/subtrans.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/subtrans.c,v
retrieving revision 1.22
diff -u -3 -p -r1.22 subtrans.c
--- src/backend/access/transam/subtrans.c    26 Mar 2008 18:48:59 -0000    1.22
+++ src/backend/access/transam/subtrans.c    21 Jul 2008 18:22:01 -0000
@@ -32,6 +32,7 @@
 #include "access/subtrans.h"
 #include "access/transam.h"
 #include "utils/snapmgr.h"
+#include "pg_trace.h"


 /*
@@ -281,7 +282,9 @@ CheckPointSUBTRANS(void)
      * it merely to improve the odds that writing of dirty pages is done by
      * the checkpoint process and not by backends.
      */
+    TRACE_POSTGRESQL_SUBTRANS_CHECKPOINT_START();
     SimpleLruFlush(SubTransCtl, true);
+    TRACE_POSTGRESQL_SUBTRANS_CHECKPOINT_DONE();
 }


Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.43
diff -u -3 -p -r1.43 twophase.c
--- src/backend/access/transam/twophase.c    19 May 2008 18:16:26 -0000    1.43
+++ src/backend/access/transam/twophase.c    21 Jul 2008 18:22:09 -0000
@@ -57,6 +57,7 @@
 #include "storage/smgr.h"
 #include "utils/builtins.h"
 #include "utils/memutils.h"
+#include "pg_trace.h"


 /*
@@ -1387,6 +1388,9 @@ CheckPointTwoPhase(XLogRecPtr redo_horiz
      */
     if (max_prepared_xacts <= 0)
         return;                    /* nothing to do */
+
+    TRACE_POSTGRESQL_TWOPHASE_CHECKPOINT_START();
+
     xids = (TransactionId *) palloc(max_prepared_xacts * sizeof(TransactionId));
     nxids = 0;

@@ -1444,6 +1448,8 @@ CheckPointTwoPhase(XLogRecPtr redo_horiz
     }

     pfree(xids);
+
+    TRACE_POSTGRESQL_TWOPHASE_CHECKPOINT_DONE();
 }

 /*
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.316
diff -u -3 -p -r1.316 xlog.c
--- src/backend/access/transam/xlog.c    13 Jul 2008 20:45:47 -0000    1.316
+++ src/backend/access/transam/xlog.c    21 Jul 2008 18:22:23 -0000
@@ -48,6 +48,7 @@
 #include "utils/builtins.h"
 #include "utils/pg_locale.h"
 #include "utils/ps_status.h"
+#include "pg_trace.h"


 /* File path names (all relative to $PGDATA) */
@@ -484,6 +485,8 @@ XLogInsert(RmgrId rmid, uint8 info, XLog
     if (info & XLR_INFO_MASK)
         elog(PANIC, "invalid xlog info mask %02X", info);

+    TRACE_POSTGRESQL_XLOG_INSERT(rmid, info);
+
     /*
      * In bootstrap mode, we don't actually log anything but XLOG resources;
      * return a phony record pointer.
@@ -912,6 +915,8 @@ begin:;
         XLogwrtRqst FlushRqst;
         XLogRecPtr    OldSegEnd;

+        TRACE_POSTGRESQL_XLOG_SWITCH();
+
         LWLockAcquire(WALWriteLock, LW_EXCLUSIVE);

         /*
@@ -1270,12 +1275,14 @@ AdvanceXLInsertBuffer(bool new_segment)
                  * Have to write buffers while holding insert lock. This is
                  * not good, so only write as much as we absolutely must.
                  */
+                TRACE_POSTGRESQL_WAL_BUFFER_WRITE_START();
                 WriteRqst.Write = OldPageRqstPtr;
                 WriteRqst.Flush.xlogid = 0;
                 WriteRqst.Flush.xrecoff = 0;
                 XLogWrite(WriteRqst, false, false);
                 LWLockRelease(WALWriteLock);
                 Insert->LogwrtResult = LogwrtResult;
+                TRACE_POSTGRESQL_WAL_BUFFER_WRITE_DONE();
             }
         }
     }
@@ -5785,6 +5792,7 @@ CreateCheckPoint(int flags)
             LWLockRelease(WALInsertLock);
             LWLockRelease(CheckpointLock);
             END_CRIT_SECTION();
+            TRACE_POSTGRESQL_XLOG_CHECKPOINT_DONE(UINT_MAX,0,0,0,UINT_MAX);
             return;
         }
     }
@@ -5880,6 +5888,8 @@ CreateCheckPoint(int flags)
     checkPoint.nextXid = ShmemVariableCache->nextXid;
     LWLockRelease(XidGenLock);

+    TRACE_POSTGRESQL_XLOG_CHECKPOINT_START(flags, checkPoint.nextXid);
+
     /* Increase XID epoch if we've wrapped around since last checkpoint */
     checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
     if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
@@ -6004,6 +6014,11 @@ CreateCheckPoint(int flags)
     if (log_checkpoints)
         LogCheckpointEnd();

+    TRACE_POSTGRESQL_XLOG_CHECKPOINT_DONE(CheckpointStats.ckpt_bufs_written,
+                NBuffers, CheckpointStats.ckpt_segs_added,
+                CheckpointStats.ckpt_segs_removed,
+                CheckpointStats.ckpt_segs_recycled);
+
     LWLockRelease(CheckpointLock);
 }

Index: src/backend/executor/execScan.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/execScan.c,v
retrieving revision 1.43
diff -u -3 -p -r1.43 execScan.c
--- src/backend/executor/execScan.c    1 Jan 2008 19:45:49 -0000    1.43
+++ src/backend/executor/execScan.c    21 Jul 2008 18:22:23 -0000
@@ -21,6 +21,7 @@
 #include "executor/executor.h"
 #include "miscadmin.h"
 #include "utils/memutils.h"
+#include "pg_trace.h"


 static bool tlist_matches_tupdesc(PlanState *ps, List *tlist, Index varno, TupleDesc tupdesc);
@@ -60,6 +61,7 @@ ExecScan(ScanState *node,
     qual = node->ps.qual;
     projInfo = node->ps.ps_ProjInfo;

+    TRACE_POSTGRESQL_EXECUTOR_SCAN((uintptr_t)node, ((Scan *)node->ps.plan)->scanrelid, (uintptr_t)accessMtd);
     /*
      * If we have neither a qual to check nor a projection to do, just skip
      * all the overhead and return the raw scan tuple.
Index: src/backend/executor/nodeAgg.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeAgg.c,v
retrieving revision 1.158
diff -u -3 -p -r1.158 nodeAgg.c
--- src/backend/executor/nodeAgg.c    12 May 2008 00:00:49 -0000    1.158
+++ src/backend/executor/nodeAgg.c    21 Jul 2008 18:22:25 -0000
@@ -86,6 +86,7 @@
 #include "utils/syscache.h"
 #include "utils/tuplesort.h"
 #include "utils/datum.h"
+#include "pg_trace.h"


 /*
@@ -805,6 +806,8 @@ ExecAgg(AggState *node)
     if (node->agg_done)
         return NULL;

+    TRACE_POSTGRESQL_EXECUTOR_AGG((uintptr_t)node, ((Agg *) node->ss.ps.plan)->aggstrategy);
+
     if (((Agg *) node->ss.ps.plan)->aggstrategy == AGG_HASHED)
     {
         if (!node->table_filled)
Index: src/backend/executor/nodeGroup.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeGroup.c,v
retrieving revision 1.70
diff -u -3 -p -r1.70 nodeGroup.c
--- src/backend/executor/nodeGroup.c    1 Jan 2008 19:45:49 -0000    1.70
+++ src/backend/executor/nodeGroup.c    21 Jul 2008 18:22:25 -0000
@@ -24,6 +24,7 @@

 #include "executor/executor.h"
 #include "executor/nodeGroup.h"
+#include "pg_trace.h"


 /*
@@ -49,6 +50,8 @@ ExecGroup(GroupState *node)
     numCols = ((Group *) node->ss.ps.plan)->numCols;
     grpColIdx = ((Group *) node->ss.ps.plan)->grpColIdx;

+    TRACE_POSTGRESQL_EXECUTOR_GROUP((uintptr_t)node, numCols);
+
     /*
      * The ScanTupleSlot holds the (copied) first tuple of each group.
      */
Index: src/backend/executor/nodeHash.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeHash.c,v
retrieving revision 1.116
diff -u -3 -p -r1.116 nodeHash.c
--- src/backend/executor/nodeHash.c    1 Jan 2008 19:45:49 -0000    1.116
+++ src/backend/executor/nodeHash.c    21 Jul 2008 18:22:26 -0000
@@ -35,6 +35,7 @@
 #include "utils/dynahash.h"
 #include "utils/memutils.h"
 #include "utils/lsyscache.h"
+#include "pg_trace.h"


 static void ExecHashIncreaseNumBatches(HashJoinTable hashtable);
@@ -70,6 +71,8 @@ MultiExecHash(HashState *node)
     ExprContext *econtext;
     uint32        hashvalue;

+    TRACE_POSTGRESQL_EXECUTOR_HASH_MULTI((uintptr_t)node);
+
     /* must provide our own instrumentation support */
     if (node->ps.instrument)
         InstrStartNode(node->ps.instrument);
Index: src/backend/executor/nodeHashjoin.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeHashjoin.c,v
retrieving revision 1.93
diff -u -3 -p -r1.93 nodeHashjoin.c
--- src/backend/executor/nodeHashjoin.c    1 Jan 2008 19:45:49 -0000    1.93
+++ src/backend/executor/nodeHashjoin.c    21 Jul 2008 18:22:27 -0000
@@ -20,6 +20,7 @@
 #include "executor/nodeHash.h"
 #include "executor/nodeHashjoin.h"
 #include "utils/memutils.h"
+#include "pg_trace.h"


 static TupleTableSlot *ExecHashJoinOuterGetTuple(PlanState *outerNode,
@@ -58,6 +59,8 @@ ExecHashJoin(HashJoinState *node)
     uint32        hashvalue;
     int            batchno;

+    TRACE_POSTGRESQL_EXECUTOR_HASHJOIN((uintptr_t)node);
+
     /*
      * get information from HashJoin node
      */
Index: src/backend/executor/nodeLimit.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeLimit.c,v
retrieving revision 1.34
diff -u -3 -p -r1.34 nodeLimit.c
--- src/backend/executor/nodeLimit.c    10 Mar 2008 03:37:59 -0000    1.34
+++ src/backend/executor/nodeLimit.c    21 Jul 2008 18:22:28 -0000
@@ -23,6 +23,7 @@

 #include "executor/executor.h"
 #include "executor/nodeLimit.h"
+#include "pg_trace.h"

 static void recompute_limits(LimitState *node);

@@ -41,6 +42,8 @@ ExecLimit(LimitState *node)
     TupleTableSlot *slot;
     PlanState  *outerPlan;

+    TRACE_POSTGRESQL_EXECUTOR_LIMIT((uintptr_t)node);
+
     /*
      * get information from the node
      */
Index: src/backend/executor/nodeMaterial.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeMaterial.c,v
retrieving revision 1.62
diff -u -3 -p -r1.62 nodeMaterial.c
--- src/backend/executor/nodeMaterial.c    23 Mar 2008 00:54:04 -0000    1.62
+++ src/backend/executor/nodeMaterial.c    21 Jul 2008 18:22:28 -0000
@@ -24,6 +24,7 @@
 #include "executor/executor.h"
 #include "executor/nodeMaterial.h"
 #include "miscadmin.h"
+#include "pg_trace.h"

 /* ----------------------------------------------------------------
  *        ExecMaterial
@@ -45,6 +46,8 @@ ExecMaterial(MaterialState *node)
     bool        eof_tuplestore;
     TupleTableSlot *slot;

+    TRACE_POSTGRESQL_EXECUTOR_MATERIAL((uintptr_t)node);
+
     /*
      * get state info from node
      */
Index: src/backend/executor/nodeMergejoin.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeMergejoin.c,v
retrieving revision 1.91
diff -u -3 -p -r1.91 nodeMergejoin.c
--- src/backend/executor/nodeMergejoin.c    13 Apr 2008 20:51:20 -0000    1.91
+++ src/backend/executor/nodeMergejoin.c    21 Jul 2008 18:22:30 -0000
@@ -102,6 +102,7 @@
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/syscache.h"
+#include "pg_trace.h"


 /*
@@ -565,6 +566,8 @@ ExecMergeJoin(MergeJoinState *node)
     bool        doFillOuter;
     bool        doFillInner;

+    TRACE_POSTGRESQL_EXECUTOR_MERGEJOIN((uintptr_t)node);
+
     /*
      * get information from node
      */
Index: src/backend/executor/nodeNestloop.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeNestloop.c,v
retrieving revision 1.46
diff -u -3 -p -r1.46 nodeNestloop.c
--- src/backend/executor/nodeNestloop.c    1 Jan 2008 19:45:49 -0000    1.46
+++ src/backend/executor/nodeNestloop.c    21 Jul 2008 18:22:30 -0000
@@ -24,6 +24,7 @@
 #include "executor/execdebug.h"
 #include "executor/nodeNestloop.h"
 #include "utils/memutils.h"
+#include "pg_trace.h"


 /* ----------------------------------------------------------------
@@ -67,6 +68,8 @@ ExecNestLoop(NestLoopState *node)
     List       *otherqual;
     ExprContext *econtext;

+    TRACE_POSTGRESQL_EXECUTOR_NESTLOOP((uintptr_t)node);
+
     /*
      * get information from the node
      */
Index: src/backend/executor/nodeSetOp.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeSetOp.c,v
retrieving revision 1.25
diff -u -3 -p -r1.25 nodeSetOp.c
--- src/backend/executor/nodeSetOp.c    1 Jan 2008 19:45:49 -0000    1.25
+++ src/backend/executor/nodeSetOp.c    21 Jul 2008 18:22:31 -0000
@@ -37,6 +37,7 @@
 #include "executor/executor.h"
 #include "executor/nodeSetOp.h"
 #include "utils/memutils.h"
+#include "pg_trace.h"


 /* ----------------------------------------------------------------
@@ -50,6 +51,8 @@ ExecSetOp(SetOpState *node)
     TupleTableSlot *resultTupleSlot;
     PlanState  *outerPlan;

+    TRACE_POSTGRESQL_EXECUTOR_SETOP((uintptr_t)node);
+
     /*
      * get information from the node
      */
Index: src/backend/executor/nodeSort.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeSort.c,v
retrieving revision 1.62
diff -u -3 -p -r1.62 nodeSort.c
--- src/backend/executor/nodeSort.c    1 Jan 2008 19:45:49 -0000    1.62
+++ src/backend/executor/nodeSort.c    21 Jul 2008 18:22:32 -0000
@@ -19,6 +19,7 @@
 #include "executor/nodeSort.h"
 #include "miscadmin.h"
 #include "utils/tuplesort.h"
+#include "pg_trace.h"


 /* ----------------------------------------------------------------
@@ -53,6 +54,8 @@ ExecSort(SortState *node)
     dir = estate->es_direction;
     tuplesortstate = (Tuplesortstate *) node->tuplesortstate;

+    TRACE_POSTGRESQL_EXECUTOR_SORT((uintptr_t)node, dir);
+
     /*
      * If first time through, read all tuples from outer plan and pass them to
      * tuplesort.c. Subsequent calls just fetch tuples from tuplesort.
Index: src/backend/executor/nodeSubplan.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeSubplan.c,v
retrieving revision 1.93
diff -u -3 -p -r1.93 nodeSubplan.c
--- src/backend/executor/nodeSubplan.c    12 May 2008 00:00:49 -0000    1.93
+++ src/backend/executor/nodeSubplan.c    21 Jul 2008 18:22:35 -0000
@@ -27,6 +27,7 @@
 #include "utils/array.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "pg_trace.h"


 static Datum ExecHashSubPlan(SubPlanState *node,
@@ -80,6 +81,8 @@ ExecHashSubPlan(SubPlanState *node,
     ExprContext *innerecontext = node->innerecontext;
     TupleTableSlot *slot;

+    TRACE_POSTGRESQL_EXECUTOR_SUBPLAN_HASH((uintptr_t)node);
+
     /* Shouldn't have any direct correlation Vars */
     if (subplan->parParam != NIL || node->args != NIL)
         elog(ERROR, "hashed subplan with direct correlation not supported");
@@ -215,6 +218,8 @@ ExecScanSubPlan(SubPlanState *node,
     ListCell   *l;
     ArrayBuildState *astate = NULL;

+    TRACE_POSTGRESQL_EXECUTOR_SUBPLAN_SCAN((uintptr_t)node);
+
     /*
      * We are probably in a short-lived expression-evaluation context. Switch
      * to the per-query context for manipulating the child plan's chgParam,
Index: src/backend/executor/nodeUnique.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/executor/nodeUnique.c,v
retrieving revision 1.56
diff -u -3 -p -r1.56 nodeUnique.c
--- src/backend/executor/nodeUnique.c    1 Jan 2008 19:45:49 -0000    1.56
+++ src/backend/executor/nodeUnique.c    21 Jul 2008 18:22:36 -0000
@@ -28,6 +28,7 @@
 #include "executor/executor.h"
 #include "executor/nodeUnique.h"
 #include "utils/memutils.h"
+#include "pg_trace.h"


 /* ----------------------------------------------------------------
@@ -45,6 +46,8 @@ ExecUnique(UniqueState *node)
     TupleTableSlot *slot;
     PlanState  *outerPlan;

+    TRACE_POSTGRESQL_EXECUTOR_UNIQUE((uintptr_t)node);
+
     /*
      * get information from the node
      */
Index: src/backend/postmaster/autovacuum.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.82
diff -u -3 -p -r1.82 autovacuum.c
--- src/backend/postmaster/autovacuum.c    21 Jul 2008 15:27:02 -0000    1.82
+++ src/backend/postmaster/autovacuum.c    21 Jul 2008 18:22:39 -0000
@@ -100,6 +100,7 @@
 #include "utils/ps_status.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
+#include "pg_trace.h"


 /*
@@ -1601,6 +1602,7 @@ AutoVacWorkerMain(int argc, char *argv[]
         InitPostgres(NULL, dbid, NULL, &dbname);
         SetProcessingMode(NormalProcessing);
         set_ps_display(dbname, false);
+        TRACE_POSTGRESQL_AUTOVACUUM_START(dbid, dbname);
         ereport(DEBUG1,
                 (errmsg("autovacuum: processing database \"%s\"", dbname)));

@@ -2177,6 +2179,7 @@ do_autovacuum(void)
         {
             /* have at it */
             MemoryContextSwitchTo(TopTransactionContext);
+            TRACE_POSTGRESQL_AUTOVACUUM_RELATION(tab->at_relid, tab->at_datname, tab->at_nspname, tab->at_relname);
             autovacuum_do_vac_analyze(tab, bstrategy);

             /*
Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.176
diff -u -3 -p -r1.176 pgstat.c
--- src/backend/postmaster/pgstat.c    30 Jun 2008 10:58:47 -0000    1.176
+++ src/backend/postmaster/pgstat.c    21 Jul 2008 18:22:44 -0000
@@ -61,6 +61,7 @@
 #include "utils/ps_status.h"
 #include "utils/rel.h"
 #include "utils/tqual.h"
+#include "pg_trace.h"


 /* ----------
@@ -2202,6 +2203,8 @@ pgstat_report_activity(const char *cmd_s
     TimestampTz start_timestamp;
     int            len;

+    TRACE_POSTGRESQL_STATEMENT_STATUS(cmd_str);
+
     if (!pgstat_track_activities || !beentry)
         return;

Index: src/backend/storage/buffer/bufmgr.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v
retrieving revision 1.234
diff -u -3 -p -r1.234 bufmgr.c
--- src/backend/storage/buffer/bufmgr.c    13 Jul 2008 20:45:47 -0000    1.234
+++ src/backend/storage/buffer/bufmgr.c    21 Jul 2008 18:22:48 -0000
@@ -43,6 +43,7 @@
 #include "utils/rel.h"
 #include "utils/resowner.h"
 #include "pgstat.h"
+#include "pg_trace.h"


 /* Note: these two macros only work on shared buffers, not local ones! */
@@ -213,6 +214,9 @@ ReadBuffer_common(SMgrRelation smgr, boo
     if (isExtend)
         blockNum = smgrnblocks(smgr);

+    TRACE_POSTGRESQL_BUFFER_READ_START(blockNum, smgr->smgr_rnode.spcNode,
+        smgr->smgr_rnode.dbNode, smgr->smgr_rnode.relNode, isLocalBuf);
+
     if (isLocalBuf)
     {
         ReadLocalBufferCount++;
@@ -240,12 +244,19 @@ ReadBuffer_common(SMgrRelation smgr, boo
     {
         if (!isExtend)
         {
+            TRACE_POSTGRESQL_BUFFER_HIT();
+
             /* Just need to update stats before we exit */
             *hit = true;

             if (VacuumCostActive)
                 VacuumCostBalance += VacuumCostPageHit;

+            TRACE_POSTGRESQL_BUFFER_READ_DONE(blockNum,
+                smgr->smgr_rnode.spcNode,
+                smgr->smgr_rnode.dbNode,
+                smgr->smgr_rnode.relNode, isLocalBuf, found);
+
             return BufferDescriptorGetBuffer(bufHdr);
         }

@@ -297,6 +308,7 @@ ReadBuffer_common(SMgrRelation smgr, boo
             } while (!StartBufferIO(bufHdr, true));
         }
     }
+    TRACE_POSTGRESQL_BUFFER_MISS();

     /*
      * if we have gotten to this point, we have allocated a buffer for the
@@ -368,6 +380,10 @@ ReadBuffer_common(SMgrRelation smgr, boo
     if (VacuumCostActive)
         VacuumCostBalance += VacuumCostPageMiss;

+    TRACE_POSTGRESQL_BUFFER_READ_DONE(blockNum, smgr->smgr_rnode.spcNode,
+            smgr->smgr_rnode.dbNode, smgr->smgr_rnode.relNode,
+            isLocalBuf, found);
+
     return BufferDescriptorGetBuffer(bufHdr);
 }

@@ -510,6 +526,11 @@ BufferAlloc(SMgrRelation smgr,
              * happens to be trying to split the page the first one got from
              * StrategyGetBuffer.)
              */
+
+            TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(blockNum,
+              smgr->smgr_rnode.spcNode, smgr->smgr_rnode.dbNode,
+              smgr->smgr_rnode.relNode);
+
             if (LWLockConditionalAcquire(buf->content_lock, LW_SHARED))
             {
                 /*
@@ -532,6 +553,11 @@ BufferAlloc(SMgrRelation smgr,
                 /* OK, do the I/O */
                 FlushBuffer(buf, NULL);
                 LWLockRelease(buf->content_lock);
+
+                TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(
+                  blockNum, smgr->smgr_rnode.spcNode,
+                  smgr->smgr_rnode.dbNode, smgr->smgr_rnode.relNode,
+                  (oldFlags & BM_HINT_BITS_TEST));
             }
             else
             {
@@ -850,7 +876,9 @@ MarkBufferDirty(Buffer buffer)
     if (!(bufHdr->flags & BM_DIRTY) && VacuumCostActive)
         VacuumCostBalance += VacuumCostPageDirty;

-    bufHdr->flags |= (BM_DIRTY | BM_JUST_DIRTIED);
+    bufHdr->flags |= (BM_DIRTY | BM_JUST_DIRTIED | BM_HINT_BITS_TEST);
+
+    TRACE_POSTGRESQL_BUFFER_MARK_DIRTY(buffer);

     UnlockBufHdr(bufHdr);
 }
@@ -1086,6 +1114,8 @@ BufferSync(int flags)
     if (num_to_write == 0)
         return;                    /* nothing to do */

+    TRACE_POSTGRESQL_BUFFER_SYNC_START(NBuffers, num_to_write);
+
     /*
      * Loop over all buffers again, and write the ones (still) marked with
      * BM_CHECKPOINT_NEEDED.  In this loop, we start at the clock sweep point
@@ -1117,6 +1147,7 @@ BufferSync(int flags)
         {
             if (SyncOneBuffer(buf_id, false) & BUF_WRITTEN)
             {
+                TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id);
                 BgWriterStats.m_buf_written_checkpoints++;
                 num_written++;

@@ -1147,6 +1178,8 @@ BufferSync(int flags)
             buf_id = 0;
     }

+    TRACE_POSTGRESQL_BUFFER_SYNC_DONE(NBuffers, num_written, num_to_write);
+
     /*
      * Update checkpoint statistics. As noted above, this doesn't include
      * buffers written by other backends or bgwriter scan.
@@ -1653,11 +1686,13 @@ PrintBufferLeakWarning(Buffer buffer)
 void
 CheckPointBuffers(int flags)
 {
+    TRACE_POSTGRESQL_BUFFER_CHECKPOINT_START(flags);
     CheckpointStats.ckpt_write_t = GetCurrentTimestamp();
     BufferSync(flags);
     CheckpointStats.ckpt_sync_t = GetCurrentTimestamp();
     smgrsync();
     CheckpointStats.ckpt_sync_end_t = GetCurrentTimestamp();
+    TRACE_POSTGRESQL_BUFFER_CHECKPOINT_DONE();
 }


@@ -1759,6 +1794,10 @@ FlushBuffer(volatile BufferDesc *buf, SM
     if (reln == NULL)
         reln = smgropen(buf->tag.rnode);

+    TRACE_POSTGRESQL_BUFFER_FLUSH_START(reln->smgr_rnode.spcNode,
+         reln->smgr_rnode.dbNode,
+         reln->smgr_rnode.relNode, buf->tag.blockNum);
+
     /*
      * Force XLOG flush up to buffer's LSN.  This implements the basic WAL
      * rule that log updates must hit disk before any of the data-file changes
@@ -1785,6 +1824,10 @@ FlushBuffer(volatile BufferDesc *buf, SM

     BufferFlushCount++;

+    TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(reln->smgr_rnode.spcNode,
+         reln->smgr_rnode.dbNode, reln->smgr_rnode.relNode,
+         buf->tag.blockNum, (buf->flags & BM_HINT_BITS_TEST));
+
     /*
      * Mark the buffer as clean (unless BM_JUST_DIRTIED has become set) and
      * end the io_in_progress state.
@@ -2536,7 +2579,7 @@ TerminateBufferIO(volatile BufferDesc *b
     Assert(buf->flags & BM_IO_IN_PROGRESS);
     buf->flags &= ~(BM_IO_IN_PROGRESS | BM_IO_ERROR);
     if (clear_dirty && !(buf->flags & BM_JUST_DIRTIED))
-        buf->flags &= ~(BM_DIRTY | BM_CHECKPOINT_NEEDED);
+        buf->flags &= ~(BM_DIRTY | BM_CHECKPOINT_NEEDED | BM_HINT_BITS_TEST);
     buf->flags |= set_flag_bits;

     UnlockBufHdr(buf);
Index: src/backend/storage/buffer/localbuf.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/storage/buffer/localbuf.c,v
retrieving revision 1.80
diff -u -3 -p -r1.80 localbuf.c
--- src/backend/storage/buffer/localbuf.c    12 Jun 2008 09:12:31 -0000    1.80
+++ src/backend/storage/buffer/localbuf.c    21 Jul 2008 18:22:48 -0000
@@ -21,6 +21,7 @@
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/resowner.h"
+#include "pg_trace.h"


 /*#define LBDEBUG*/
@@ -236,6 +237,7 @@ MarkLocalBufferDirty(Buffer buffer)

     bufHdr = &LocalBufferDescriptors[bufid];
     bufHdr->flags |= BM_DIRTY;
+    TRACE_POSTGRESQL_BUFFER_MARK_DIRTY_LOCAL(buffer);
 }

 /*
Index: src/backend/storage/lmgr/deadlock.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/storage/lmgr/deadlock.c,v
retrieving revision 1.53
diff -u -3 -p -r1.53 deadlock.c
--- src/backend/storage/lmgr/deadlock.c    24 Mar 2008 18:22:36 -0000    1.53
+++ src/backend/storage/lmgr/deadlock.c    21 Jul 2008 18:22:49 -0000
@@ -30,6 +30,7 @@
 #include "storage/lmgr.h"
 #include "storage/proc.h"
 #include "utils/memutils.h"
+#include "pg_trace.h"


 /* One edge in the waits-for graph */
@@ -222,6 +223,8 @@ DeadLockCheck(PGPROC *proc)
          */
         int            nSoftEdges;

+        TRACE_POSTGRESQL_DEADLOCK_FOUND();
+
         nWaitOrders = 0;
         if (!FindLockCycle(proc, possibleConstraints, &nSoftEdges))
             elog(FATAL, "deadlock seems to have disappeared");
@@ -259,6 +262,8 @@ DeadLockCheck(PGPROC *proc)
         ProcLockWakeup(GetLocksMethodTable(lock), lock);
     }

+    TRACE_POSTGRESQL_DEADLOCK_NOTFOUND(nWaitOrders);
+
     /* Return code tells caller if we had to escape a deadlock or not */
     if (nWaitOrders > 0)
         return DS_SOFT_DEADLOCK;
Index: src/backend/storage/smgr/md.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/storage/smgr/md.c,v
retrieving revision 1.138
diff -u -3 -p -r1.138 md.c
--- src/backend/storage/smgr/md.c    2 May 2008 01:08:27 -0000    1.138
+++ src/backend/storage/smgr/md.c    21 Jul 2008 18:22:50 -0000
@@ -26,6 +26,7 @@
 #include "storage/smgr.h"
 #include "utils/hsearch.h"
 #include "utils/memutils.h"
+#include "pg_trace.h"


 /* interval for calling AbsorbFsyncRequests in mdsync */
@@ -555,6 +556,8 @@ mdread(SMgrRelation reln, BlockNumber bl
     int            nbytes;
     MdfdVec    *v;

+    TRACE_POSTGRESQL_SMGR_READ_START(blocknum, reln->smgr_rnode.spcNode, reln->smgr_rnode.dbNode,
reln->smgr_rnode.relNode);
+
     v = _mdfd_getseg(reln, blocknum, false, EXTENSION_FAIL);

     seekpos = (off_t) BLCKSZ * (blocknum % ((BlockNumber) RELSEG_SIZE));
@@ -569,7 +572,11 @@ mdread(SMgrRelation reln, BlockNumber bl
                         reln->smgr_rnode.dbNode,
                         reln->smgr_rnode.relNode)));

-    if ((nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ)) != BLCKSZ)
+    nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ);
+
+    TRACE_POSTGRESQL_SMGR_READ_DONE(blocknum, reln->smgr_rnode.spcNode, reln->smgr_rnode.dbNode,
reln->smgr_rnode.relNode,nbytes, BLCKSZ); 
+
+    if (nbytes != BLCKSZ)
     {
         if (nbytes < 0)
             ereport(ERROR,
@@ -621,6 +628,8 @@ mdwrite(SMgrRelation reln, BlockNumber b
     Assert(blocknum < mdnblocks(reln));
 #endif

+    TRACE_POSTGRESQL_SMGR_WRITE_START(blocknum, reln->smgr_rnode.spcNode, reln->smgr_rnode.dbNode,
reln->smgr_rnode.relNode);
+
     v = _mdfd_getseg(reln, blocknum, isTemp, EXTENSION_FAIL);

     seekpos = (off_t) BLCKSZ * (blocknum % ((BlockNumber) RELSEG_SIZE));
@@ -657,6 +666,8 @@ mdwrite(SMgrRelation reln, BlockNumber b
                  errhint("Check free disk space.")));
     }

+    TRACE_POSTGRESQL_SMGR_WRITE_DONE(blocknum, reln->smgr_rnode.spcNode, reln->smgr_rnode.dbNode,
reln->smgr_rnode.relNode,nbytes, BLCKSZ); 
+
     if (!isTemp)
         register_dirty_segment(reln, v);
 }
Index: src/backend/tcop/postgres.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/tcop/postgres.c,v
retrieving revision 1.554
diff -u -3 -p -r1.554 postgres.c
--- src/backend/tcop/postgres.c    18 Jul 2008 20:26:06 -0000    1.554
+++ src/backend/tcop/postgres.c    21 Jul 2008 18:22:54 -0000
@@ -71,6 +71,7 @@
 #include "mb/pg_wchar.h"

 #include "pgstat.h"
+#include "pg_trace.h"

 extern int    optind;
 extern char *optarg;
@@ -551,6 +552,8 @@ pg_parse_query(const char *query_string)
 {
     List       *raw_parsetree_list;

+    TRACE_POSTGRESQL_QUERY_PARSE_START(query_string);
+
     if (log_parser_stats)
         ResetUsage();

@@ -660,6 +663,8 @@ pg_rewrite_query(Query *query)
         elog_node_display(DEBUG1, "rewritten parse tree", querytree_list,
                           Debug_pretty_print);

+    TRACE_POSTGRESQL_QUERY_PARSE_DONE();
+
     return querytree_list;
 }

@@ -673,6 +678,8 @@ pg_plan_query(Query *querytree, int curs
 {
     PlannedStmt *plan;

+    TRACE_POSTGRESQL_QUERY_PLAN_START();
+
     /* Utility commands have no plans. */
     if (querytree->commandType == CMD_UTILITY)
         return NULL;
@@ -710,6 +717,7 @@ pg_plan_query(Query *querytree, int curs
      */
     if (Debug_print_plan)
         elog_node_display(DEBUG1, "plan", plan, Debug_pretty_print);
+    TRACE_POSTGRESQL_QUERY_PLAN_DONE();

     return plan;
 }
@@ -785,6 +793,7 @@ exec_simple_query(const char *query_stri
     bool        isTopLevel;
     char        msec_str[32];

+
     /*
      * Report query to various monitoring facilities.
      */
@@ -792,6 +801,8 @@ exec_simple_query(const char *query_stri

     pgstat_report_activity(query_string);

+    TRACE_POSTGRESQL_QUERY_START(query_string);
+
     /*
      * We use save_log_statement_stats so ShowUsage doesn't report incorrect
      * results because ResetUsage wasn't called.
@@ -1058,6 +1069,8 @@ exec_simple_query(const char *query_stri
     if (save_log_statement_stats)
         ShowUsage("QUERY STATISTICS");

+    TRACE_POSTGRESQL_QUERY_DONE(query_string);
+
     debug_query_string = NULL;
 }

@@ -3493,6 +3506,7 @@ PostgresMain(int argc, char *argv[], con
         {
             if (IsTransactionOrTransactionBlock())
             {
+                TRACE_POSTGRESQL_TRANSACTION_IDLE_START(GetCurrentTransactionId(), GetCurrentCommandId(false));
                 set_ps_display("idle in transaction", false);
                 pgstat_report_activity("<IDLE> in transaction");
             }
@@ -3522,6 +3536,8 @@ PostgresMain(int argc, char *argv[], con
          */
         firstchar = ReadCommand(&input_message);

+        TRACE_POSTGRESQL_TRANSACTION_IDLE_DONE();
+
         /*
          * (4) disable async signal conditions again.
          */
Index: src/backend/tcop/pquery.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/tcop/pquery.c,v
retrieving revision 1.123
diff -u -3 -p -r1.123 pquery.c
--- src/backend/tcop/pquery.c    12 May 2008 20:02:02 -0000    1.123
+++ src/backend/tcop/pquery.c    21 Jul 2008 18:22:56 -0000
@@ -24,6 +24,7 @@
 #include "tcop/utility.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
+#include "pg_trace.h"


 /*
@@ -711,6 +712,8 @@ PortalRun(Portal portal, long count, boo

     AssertArg(PortalIsValid(portal));

+    TRACE_POSTGRESQL_QUERY_EXECUTE_START();
+
     /* Initialize completion tag to empty string */
     if (completionTag)
         completionTag[0] = '\0';
@@ -857,6 +860,8 @@ PortalRun(Portal portal, long count, boo

     if (log_executor_stats && portal->strategy != PORTAL_MULTI_QUERY)
         ShowUsage("EXECUTOR STATISTICS");
+
+    TRACE_POSTGRESQL_QUERY_EXECUTE_DONE();

     return result;
 }
@@ -1237,6 +1242,8 @@ PortalRunMulti(Portal portal, bool isTop
              */
             PlannedStmt *pstmt = (PlannedStmt *) stmt;

+            TRACE_POSTGRESQL_QUERY_EXECUTE_START();
+
             if (log_executor_stats)
                 ResetUsage();

@@ -1257,6 +1264,8 @@ PortalRunMulti(Portal portal, bool isTop

             if (log_executor_stats)
                 ShowUsage("EXECUTOR STATISTICS");
+
+            TRACE_POSTGRESQL_QUERY_EXECUTE_DONE();
         }
         else
         {
Index: src/backend/utils/probes.d
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/probes.d,v
retrieving revision 1.2
diff -u -3 -p -r1.2 probes.d
--- src/backend/utils/probes.d    2 Jan 2008 02:42:06 -0000    1.2
+++ src/backend/utils/probes.d    21 Jul 2008 18:22:56 -0000
@@ -7,18 +7,123 @@
  * ----------
  */

+
 provider postgresql {

-probe transaction__start(int);
-probe transaction__commit(int);
-probe transaction__abort(int);
-probe lwlock__acquire(int, int);
-probe lwlock__release(int);
-probe lwlock__startwait(int, int);
-probe lwlock__endwait(int, int);
-probe lwlock__condacquire(int, int);
-probe lwlock__condacquire__fail(int, int);
-probe lock__startwait(int, int);
-probe lock__endwait(int, int);
+    /*
+     * Due to a bug in Mac OS X 10.5, using defined types (e.g. uintptr_t,
+     * uint32_t, etc.) cause compilation error.
+     */
+
+    probe transaction__start(unsigned int transactionId);
+    probe transaction__commit(unsigned int transactionId);
+    probe transaction__abort(unsigned int transactionId);
+    probe transaction__idle__start(unsigned int currentTxnId, unsigned int currentCmdId);
+    probe transaction__idle__done();
+
+    probe lwlock__acquire(unsigned int lockId, unsigned int mode);
+    probe lwlock__release(unsigned int lockId);
+    probe lwlock__startwait(unsigned int lockId, unsigned int mode);
+    probe lwlock__endwait(unsigned int lockId, unsigned int mode);
+    probe lwlock__condacquire(unsigned int lockId, unsigned int mode);
+    probe lwlock__condacquire__fail(unsigned int lockId, unsigned int mode);
+    probe lock__startwait(unsigned int lockTagField2, int lockMode);
+    probe lock__endwait(unsigned int lockTagField2, int lockMode);
+
+    probe query__parse__start(char *queryString);
+    probe query__parse__done();
+    probe query__plan__start();
+    probe query__plan__done();
+    probe query__execute__start();
+    probe query__execute__done();
+    probe query__start(char *queryString);
+    probe query__done(char *queryString);
+    probe statement__status(char *cmdString);
+
+    probe sort__start(int, int, int, int, int);
+    probe sort__end(int, long);
+
+    probe buffer__read__start(int blockNum, int tablespaceOid,
+            int databaseOid, int relationOid, int isLocalBuf);
+    probe buffer__read__done(int blockNum, int tablespaceOid,
+             int databaseOid, int relationOid, int isLocalBuf,
+            int isInBuffPool);
+    probe buffer__write__dirty__start(int blockNum, int tablespaceOid,
+            int databaseOid, int relationOid);
+    probe buffer__write__dirty__done(int blockNum, int tablespaceOid,
+            int databaseOid, int relationOid, int isHintBitsWrite);
+
+    probe buffer__flush__start(int tablespaceOid,
+            int databaseOid, int relationOid, int blockNum);
+    probe buffer__flush__done(int tablespaceOid, int databaseOid,
+            int relationOid, int blockNum, int isHintBitsWrite);
+    probe buffer__hit();
+    probe buffer__miss();
+    probe buffer__mark__dirty(int bufferId);
+    probe buffer__mark__dirty__local(int bufferId);
+    probe buffer__checkpoint__start(int);
+    probe buffer__checkpoint__done();
+    probe buffer__sync__start(int numBuffers, int numToWrite);
+    probe buffer__sync__written(int bufId);
+    probe buffer__sync__done(int numBuffers, int numWritten, int numToWrite);
+
+    probe wal__buffer__write__start();
+    probe wal__buffer__write__done();
+
+    probe deadlock__found();
+    probe deadlock__notfound(int);
+
+    probe smgr__read__start(int blockNum, int tablespaceOid,
+            int databaseOid, int relationOid);
+    probe smgr__read__done(int blockNum, int tablespaceOid,
+             int databaseOid, int relationOid, int bytesRead,
+            int blockSize);
+    probe smgr__write__start(int blockNum, int tablespaceOid,
+            int databaseOid, int relationOid);
+    probe smgr__write__done(int blockNum, int tablespaceOid,
+            int databaseOid, int relationOid, int bytesRead,
+            int blockSize);
+
+    probe xlog__insert(unsigned char, unsigned char);
+    probe xlog__switch();
+
+    probe xlog__checkpoint__start(int checkpointFlags, unsigned int transactionId);
+    probe xlog__checkpoint__done(int bufsWritten, int NBuffers, int segsAdded,
+            int segsRemoved, int segsRecycled);
+    probe clog__checkpoint__start();
+    probe clog__checkpoint__done();
+    probe subtrans__checkpoint__start();
+    probe subtrans__checkpoint__done();
+    probe multixact__checkpoint__start();
+    probe multixact__checkpoint__done();
+    probe twophase__checkpoint__start();
+    probe twophase__checkpoint__done();
+
+    probe autovacuum__start(unsigned int databaseOid, char *databaseName);
+    probe autovacuum__relation(unsigned int, char *, char *, char *);
+
+    probe slru__readpage__start(unsigned int, int pageNum, int writeOk, int xid);
+    probe slru__readpage__done(int slotNum);
+    probe slru__readpage__readonly(unsigned int, int, int);
+    probe slru__writepage__start(unsigned int, int pageNum, int slotNum);
+    probe slru__writepage__done();
+    probe slru__readpage__physical__start(unsigned int, char *path, int pageNum, int slotNum);
+    probe slru__readpage__physical__done(int retVal, int errCause, int errNo);
+    probe slru__writepage__physical__start(unsigned int, int pageNum, int slotNum);
+    probe slru__writepage__physical__done(int retVal, int errCause, int errNo);

+    probe executor__scan(unsigned int, unsigned int, unsigned int);
+    probe executor__agg(unsigned int, int);
+    probe executor__group(unsigned int, int);
+    probe executor__hash__multi(unsigned int);
+    probe executor__hashjoin(unsigned int);
+    probe executor__limit(unsigned int);
+    probe executor__material(unsigned int);
+    probe executor__mergejoin(unsigned int);
+    probe executor__nestloop(unsigned int);
+    probe executor__setop(unsigned int);
+    probe executor__sort(unsigned int, int);
+    probe executor__subplan__hash(unsigned int);
+    probe executor__subplan__scan(unsigned int);
+    probe executor__unique(unsigned int);
 };
Index: src/backend/utils/sort/tuplesort.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/sort/tuplesort.c,v
retrieving revision 1.85
diff -u -3 -p -r1.85 tuplesort.c
--- src/backend/utils/sort/tuplesort.c    19 Jun 2008 00:46:05 -0000    1.85
+++ src/backend/utils/sort/tuplesort.c    21 Jul 2008 18:23:00 -0000
@@ -115,12 +115,18 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/tuplesort.h"
+#include "pg_trace.h"


 /* GUC variables */
 #ifdef TRACE_SORT
 bool        trace_sort = false;
 #endif
+
+#define HEAP_SORT    0
+#define INDEX_SORT    1
+#define DATUM_SORT    2
+
 #ifdef DEBUG_BOUNDED_SORT
 bool        optimize_bounded_sort = true;
 #endif
@@ -569,6 +575,7 @@ tuplesort_begin_heap(TupleDesc tupDesc,
              "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
              nkeys, workMem, randomAccess ? 't' : 'f');
 #endif
+    TRACE_POSTGRESQL_SORT_START(HEAP_SORT, false, nkeys, workMem, randomAccess);

     state->nKeys = nkeys;

@@ -636,6 +643,8 @@ tuplesort_begin_index_btree(Relation ind

     state->nKeys = RelationGetNumberOfAttributes(indexRel);

+    TRACE_POSTGRESQL_SORT_START(INDEX_SORT, enforceUnique, state->nKeys, workMem, randomAccess);
+
     state->comparetup = comparetup_index_btree;
     state->copytup = copytup_index;
     state->writetup = writetup_index;
@@ -713,6 +722,7 @@ tuplesort_begin_datum(Oid datumType,
              "begin datum sort: workMem = %d, randomAccess = %c",
              workMem, randomAccess ? 't' : 'f');
 #endif
+    TRACE_POSTGRESQL_SORT_START(DATUM_SORT, false, 1, workMem, randomAccess);

     state->nKeys = 1;            /* always a one-column sort */

@@ -825,6 +835,11 @@ tuplesort_end(Tuplesortstate *state)
     }
 #endif

+    TRACE_POSTGRESQL_SORT_END(state->tapeset,
+            (state->tapeset ? LogicalTapeSetBlocks(state->tapeset) :
+            (state->allowedMem - state->availMem + 1023) / 1024));
+
+
     MemoryContextSwitchTo(oldcontext);

     /*
Index: src/include/storage/buf_internals.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/storage/buf_internals.h,v
retrieving revision 1.97
diff -u -3 -p -r1.97 buf_internals.h
--- src/include/storage/buf_internals.h    19 Jun 2008 00:46:06 -0000    1.97
+++ src/include/storage/buf_internals.h    21 Jul 2008 18:23:01 -0000
@@ -37,6 +37,7 @@
 #define BM_JUST_DIRTIED            (1 << 5)        /* dirtied since write started */
 #define BM_PIN_COUNT_WAITER        (1 << 6)        /* have waiter for sole pin */
 #define BM_CHECKPOINT_NEEDED    (1 << 7)        /* must write for checkpoint */
+#define BM_HINT_BITS_TEST    (1 << 8)        /* test effect of writes due to hint bits */

 typedef bits16 BufFlags;


pgsql-hackers by date:

Previous
From: Andrew Sullivan
Date:
Subject: Re: Load spikes on 8.1.11
Next
From: Tom Lane
Date:
Subject: Re: [patch] plproxy v2