Re: Review: DTrace probes (merged version) ver_03 - Mailing list pgsql-hackers

From Zdenek Kotala
Subject Re: Review: DTrace probes (merged version) ver_03
Date
Msg-id 48889BB2.8080407@sun.com
Whole thread Raw
In response to Re: Review: DTrace probes (merged version)  (Robert Lor <Robert.Lor@Sun.COM>)
Responses Re: Review: DTrace probes (merged version) ver_03  (Theo Schlossnagle <jesus@omniti.com>)
Re: Review: DTrace probes (merged version) ver_03  (Alvaro Herrera <alvherre@commandprompt.com>)
Re: Review: DTrace probes (merged version) ver_03  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I performed review and I prepared own patch which contains only probes without
any issue. I suggest commit this patch because the rest of patch is independent
and it can be committed next commit fest after rework.

I found following issues:

1) SLRU probes.

I think it is good to have probes there but they needs polish. See my comments
http://reviewdemo.postgresql.org/r/25/


2) XLOG probes

I think there is confuse placement of probes after merge. It needs cleanup.


3) Executor probes

I would like to see any use case for them/

4) smgr probes

I prefer to have this probes in smgr instead of md. The reason why Robert put
them into md is that it returns number of written/read bytes, but it is "always"
BLCKSZ which could be returned from smgr directly. Only difference is
when error occurs during write/read and not all data are written/read.

It needs discuss.

5) autovacuum start probes

I would like to see also stat/stop for any other process types. It was discussed
but no comment from author(s).

6) idle transaction

See my comments
http://reviewdemo.postgresql.org/r/25/

7) query-reewrite is missing


8) mark dirty and BM_HINT... flag

I remove these because I don't see any use case for it. It would be nice provide
some dtrace script or describe basic ideas.



    Thats all Zdenek


Index: src/backend/access/transam/clog.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/clog.c,v
retrieving revision 1.46
diff -c -r1.46 clog.c
*** src/backend/access/transam/clog.c    1 Jan 2008 19:45:46 -0000    1.46
--- src/backend/access/transam/clog.c    24 Jul 2008 14:19:23 -0000
***************
*** 36,41 ****
--- 36,42 ----
  #include "access/slru.h"
  #include "access/transam.h"
  #include "postmaster/bgwriter.h"
+ #include "pg_trace.h"

  /*
   * Defines for CLOG page sizes.  A page is the same BLCKSZ as is used
***************
*** 323,329 ****
--- 324,332 ----
  CheckPointCLOG(void)
  {
      /* Flush dirty CLOG pages to disk */
+     TRACE_POSTGRESQL_CLOG_CHECKPOINT_START();
      SimpleLruFlush(ClogCtl, true);
+     TRACE_POSTGRESQL_CLOG_CHECKPOINT_DONE();
  }


Index: src/backend/access/transam/multixact.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/multixact.c,v
retrieving revision 1.27
diff -c -r1.27 multixact.c
*** src/backend/access/transam/multixact.c    1 Jan 2008 19:45:46 -0000    1.27
--- src/backend/access/transam/multixact.c    24 Jul 2008 14:19:24 -0000
***************
*** 57,62 ****
--- 57,63 ----
  #include "storage/lmgr.h"
  #include "utils/memutils.h"
  #include "storage/procarray.h"
+ #include "pg_trace.h"


  /*
***************
*** 1526,1531 ****
--- 1527,1534 ----
  void
  CheckPointMultiXact(void)
  {
+     TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_START();
+
      /* Flush dirty MultiXact pages to disk */
      SimpleLruFlush(MultiXactOffsetCtl, true);
      SimpleLruFlush(MultiXactMemberCtl, true);
***************
*** 1540,1545 ****
--- 1543,1550 ----
       */
      if (!InRecovery)
          TruncateMultiXact();
+
+     TRACE_POSTGRESQL_MULTIXACT_CHECKPOINT_DONE();
  }

  /*
Index: src/backend/access/transam/subtrans.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/subtrans.c,v
retrieving revision 1.22
diff -c -r1.22 subtrans.c
*** src/backend/access/transam/subtrans.c    26 Mar 2008 18:48:59 -0000    1.22
--- src/backend/access/transam/subtrans.c    24 Jul 2008 14:19:24 -0000
***************
*** 32,37 ****
--- 32,38 ----
  #include "access/subtrans.h"
  #include "access/transam.h"
  #include "utils/snapmgr.h"
+ #include "pg_trace.h"


  /*
***************
*** 281,287 ****
--- 282,290 ----
       * it merely to improve the odds that writing of dirty pages is done by
       * the checkpoint process and not by backends.
       */
+     TRACE_POSTGRESQL_SUBTRANS_CHECKPOINT_START();
      SimpleLruFlush(SubTransCtl, true);
+     TRACE_POSTGRESQL_SUBTRANS_CHECKPOINT_DONE();
  }


Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.43
diff -c -r1.43 twophase.c
*** src/backend/access/transam/twophase.c    19 May 2008 18:16:26 -0000    1.43
--- src/backend/access/transam/twophase.c    24 Jul 2008 14:19:24 -0000
***************
*** 57,62 ****
--- 57,63 ----
  #include "storage/smgr.h"
  #include "utils/builtins.h"
  #include "utils/memutils.h"
+ #include "pg_trace.h"


  /*
***************
*** 1387,1392 ****
--- 1388,1396 ----
       */
      if (max_prepared_xacts <= 0)
          return;                    /* nothing to do */
+
+     TRACE_POSTGRESQL_TWOPHASE_CHECKPOINT_START();
+
      xids = (TransactionId *) palloc(max_prepared_xacts * sizeof(TransactionId));
      nxids = 0;

***************
*** 1444,1449 ****
--- 1448,1455 ----
      }

      pfree(xids);
+
+     TRACE_POSTGRESQL_TWOPHASE_CHECKPOINT_DONE();
  }

  /*
Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.176
diff -c -r1.176 pgstat.c
*** src/backend/postmaster/pgstat.c    30 Jun 2008 10:58:47 -0000    1.176
--- src/backend/postmaster/pgstat.c    24 Jul 2008 14:19:30 -0000
***************
*** 61,66 ****
--- 61,67 ----
  #include "utils/ps_status.h"
  #include "utils/rel.h"
  #include "utils/tqual.h"
+ #include "pg_trace.h"


  /* ----------
***************
*** 2202,2207 ****
--- 2203,2210 ----
      TimestampTz start_timestamp;
      int            len;

+     TRACE_POSTGRESQL_STATEMENT_STATUS(cmd_str);
+
      if (!pgstat_track_activities || !beentry)
          return;

Index: src/backend/storage/buffer/bufmgr.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v
retrieving revision 1.234
diff -c -r1.234 bufmgr.c
*** src/backend/storage/buffer/bufmgr.c    13 Jul 2008 20:45:47 -0000    1.234
--- src/backend/storage/buffer/bufmgr.c    24 Jul 2008 14:19:31 -0000
***************
*** 43,48 ****
--- 43,49 ----
  #include "utils/rel.h"
  #include "utils/resowner.h"
  #include "pgstat.h"
+ #include "pg_trace.h"


  /* Note: these two macros only work on shared buffers, not local ones! */
***************
*** 213,218 ****
--- 214,222 ----
      if (isExtend)
          blockNum = smgrnblocks(smgr);

+     TRACE_POSTGRESQL_BUFFER_READ_START(blockNum, smgr->smgr_rnode.spcNode,
+         smgr->smgr_rnode.dbNode, smgr->smgr_rnode.relNode, isLocalBuf);
+
      if (isLocalBuf)
      {
          ReadLocalBufferCount++;
***************
*** 240,251 ****
--- 244,262 ----
      {
          if (!isExtend)
          {
+             TRACE_POSTGRESQL_BUFFER_HIT();
+
              /* Just need to update stats before we exit */
              *hit = true;

              if (VacuumCostActive)
                  VacuumCostBalance += VacuumCostPageHit;

+             TRACE_POSTGRESQL_BUFFER_READ_DONE(blockNum,
+                 smgr->smgr_rnode.spcNode,
+                 smgr->smgr_rnode.dbNode,
+                 smgr->smgr_rnode.relNode, isLocalBuf, found);
+
              return BufferDescriptorGetBuffer(bufHdr);
          }

***************
*** 297,302 ****
--- 308,314 ----
              } while (!StartBufferIO(bufHdr, true));
          }
      }
+     TRACE_POSTGRESQL_BUFFER_MISS();

      /*
       * if we have gotten to this point, we have allocated a buffer for the
***************
*** 368,373 ****
--- 380,389 ----
      if (VacuumCostActive)
          VacuumCostBalance += VacuumCostPageMiss;

+     TRACE_POSTGRESQL_BUFFER_READ_DONE(blockNum, smgr->smgr_rnode.spcNode,
+             smgr->smgr_rnode.dbNode, smgr->smgr_rnode.relNode,
+             isLocalBuf, found);
+
      return BufferDescriptorGetBuffer(bufHdr);
  }

***************
*** 1086,1091 ****
--- 1102,1109 ----
      if (num_to_write == 0)
          return;                    /* nothing to do */

+     TRACE_POSTGRESQL_BUFFER_SYNC_START(NBuffers, num_to_write);
+
      /*
       * Loop over all buffers again, and write the ones (still) marked with
       * BM_CHECKPOINT_NEEDED.  In this loop, we start at the clock sweep point
***************
*** 1117,1122 ****
--- 1135,1141 ----
          {
              if (SyncOneBuffer(buf_id, false) & BUF_WRITTEN)
              {
+                 TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id);
                  BgWriterStats.m_buf_written_checkpoints++;
                  num_written++;

***************
*** 1147,1152 ****
--- 1166,1173 ----
              buf_id = 0;
      }

+     TRACE_POSTGRESQL_BUFFER_SYNC_DONE(NBuffers, num_written, num_to_write);
+
      /*
       * Update checkpoint statistics. As noted above, this doesn't include
       * buffers written by other backends or bgwriter scan.
***************
*** 1653,1663 ****
--- 1674,1686 ----
  void
  CheckPointBuffers(int flags)
  {
+     TRACE_POSTGRESQL_BUFFER_CHECKPOINT_START(flags);
      CheckpointStats.ckpt_write_t = GetCurrentTimestamp();
      BufferSync(flags);
      CheckpointStats.ckpt_sync_t = GetCurrentTimestamp();
      smgrsync();
      CheckpointStats.ckpt_sync_end_t = GetCurrentTimestamp();
+     TRACE_POSTGRESQL_BUFFER_CHECKPOINT_DONE();
  }


***************
*** 1759,1764 ****
--- 1782,1791 ----
      if (reln == NULL)
          reln = smgropen(buf->tag.rnode);

+     TRACE_POSTGRESQL_BUFFER_FLUSH_START(reln->smgr_rnode.spcNode,
+          reln->smgr_rnode.dbNode,
+          reln->smgr_rnode.relNode, buf->tag.blockNum);
+
      /*
       * Force XLOG flush up to buffer's LSN.  This implements the basic WAL
       * rule that log updates must hit disk before any of the data-file changes
***************
*** 1785,1790 ****
--- 1812,1821 ----

      BufferFlushCount++;

+     TRACE_POSTGRESQL_BUFFER_FLUSH_DONE(reln->smgr_rnode.spcNode,
+          reln->smgr_rnode.dbNode, reln->smgr_rnode.relNode,
+          buf->tag.blockNum);
+
      /*
       * Mark the buffer as clean (unless BM_JUST_DIRTIED has become set) and
       * end the io_in_progress state.
Index: src/backend/storage/lmgr/deadlock.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/storage/lmgr/deadlock.c,v
retrieving revision 1.53
diff -c -r1.53 deadlock.c
*** src/backend/storage/lmgr/deadlock.c    24 Mar 2008 18:22:36 -0000    1.53
--- src/backend/storage/lmgr/deadlock.c    24 Jul 2008 14:19:31 -0000
***************
*** 30,35 ****
--- 30,36 ----
  #include "storage/lmgr.h"
  #include "storage/proc.h"
  #include "utils/memutils.h"
+ #include "pg_trace.h"


  /* One edge in the waits-for graph */
***************
*** 222,227 ****
--- 223,230 ----
           */
          int            nSoftEdges;

+         TRACE_POSTGRESQL_DEADLOCK_FOUND();
+
          nWaitOrders = 0;
          if (!FindLockCycle(proc, possibleConstraints, &nSoftEdges))
              elog(FATAL, "deadlock seems to have disappeared");
***************
*** 259,264 ****
--- 262,269 ----
          ProcLockWakeup(GetLocksMethodTable(lock), lock);
      }

+     TRACE_POSTGRESQL_DEADLOCK_NOTFOUND(nWaitOrders);
+
      /* Return code tells caller if we had to escape a deadlock or not */
      if (nWaitOrders > 0)
          return DS_SOFT_DEADLOCK;
Index: src/backend/tcop/postgres.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/tcop/postgres.c,v
retrieving revision 1.554
diff -c -r1.554 postgres.c
*** src/backend/tcop/postgres.c    18 Jul 2008 20:26:06 -0000    1.554
--- src/backend/tcop/postgres.c    24 Jul 2008 14:19:31 -0000
***************
*** 71,76 ****
--- 71,77 ----
  #include "mb/pg_wchar.h"

  #include "pgstat.h"
+ #include "pg_trace.h"

  extern int    optind;
  extern char *optarg;
***************
*** 551,556 ****
--- 552,559 ----
  {
      List       *raw_parsetree_list;

+     TRACE_POSTGRESQL_QUERY_PARSE_START(query_string);
+
      if (log_parser_stats)
          ResetUsage();

***************
*** 660,665 ****
--- 663,670 ----
          elog_node_display(DEBUG1, "rewritten parse tree", querytree_list,
                            Debug_pretty_print);

+     TRACE_POSTGRESQL_QUERY_PARSE_DONE();
+
      return querytree_list;
  }

***************
*** 673,678 ****
--- 678,685 ----
  {
      PlannedStmt *plan;

+     TRACE_POSTGRESQL_QUERY_PLAN_START();
+
      /* Utility commands have no plans. */
      if (querytree->commandType == CMD_UTILITY)
          return NULL;
***************
*** 710,715 ****
--- 717,723 ----
       */
      if (Debug_print_plan)
          elog_node_display(DEBUG1, "plan", plan, Debug_pretty_print);
+     TRACE_POSTGRESQL_QUERY_PLAN_DONE();

      return plan;
  }
***************
*** 785,790 ****
--- 793,799 ----
      bool        isTopLevel;
      char        msec_str[32];

+
      /*
       * Report query to various monitoring facilities.
       */
***************
*** 792,797 ****
--- 801,808 ----

      pgstat_report_activity(query_string);

+     TRACE_POSTGRESQL_QUERY_START(query_string);
+
      /*
       * We use save_log_statement_stats so ShowUsage doesn't report incorrect
       * results because ResetUsage wasn't called.
***************
*** 1058,1063 ****
--- 1069,1076 ----
      if (save_log_statement_stats)
          ShowUsage("QUERY STATISTICS");

+     TRACE_POSTGRESQL_QUERY_DONE(query_string);
+
      debug_query_string = NULL;
  }

Index: src/backend/tcop/pquery.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/tcop/pquery.c,v
retrieving revision 1.123
diff -c -r1.123 pquery.c
*** src/backend/tcop/pquery.c    12 May 2008 20:02:02 -0000    1.123
--- src/backend/tcop/pquery.c    24 Jul 2008 14:19:31 -0000
***************
*** 24,29 ****
--- 24,30 ----
  #include "tcop/utility.h"
  #include "utils/memutils.h"
  #include "utils/snapmgr.h"
+ #include "pg_trace.h"


  /*
***************
*** 711,716 ****
--- 712,719 ----

      AssertArg(PortalIsValid(portal));

+     TRACE_POSTGRESQL_QUERY_EXECUTE_START();
+
      /* Initialize completion tag to empty string */
      if (completionTag)
          completionTag[0] = '\0';
***************
*** 857,862 ****
--- 860,867 ----

      if (log_executor_stats && portal->strategy != PORTAL_MULTI_QUERY)
          ShowUsage("EXECUTOR STATISTICS");
+
+     TRACE_POSTGRESQL_QUERY_EXECUTE_DONE();

      return result;
  }
***************
*** 1237,1242 ****
--- 1242,1249 ----
               */
              PlannedStmt *pstmt = (PlannedStmt *) stmt;

+             TRACE_POSTGRESQL_QUERY_EXECUTE_START();
+
              if (log_executor_stats)
                  ResetUsage();

***************
*** 1257,1262 ****
--- 1264,1271 ----

              if (log_executor_stats)
                  ShowUsage("EXECUTOR STATISTICS");
+
+             TRACE_POSTGRESQL_QUERY_EXECUTE_DONE();
          }
          else
          {
Index: src/backend/utils/probes.d
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/probes.d,v
retrieving revision 1.2
diff -c -r1.2 probes.d
*** src/backend/utils/probes.d    2 Jan 2008 02:42:06 -0000    1.2
--- src/backend/utils/probes.d    24 Jul 2008 14:19:31 -0000
***************
*** 7,24 ****
   * ----------
   */

  provider postgresql {

! probe transaction__start(int);
! probe transaction__commit(int);
! probe transaction__abort(int);
! probe lwlock__acquire(int, int);
! probe lwlock__release(int);
! probe lwlock__startwait(int, int);
! probe lwlock__endwait(int, int);
! probe lwlock__condacquire(int, int);
! probe lwlock__condacquire__fail(int, int);
! probe lock__startwait(int, int);
! probe lock__endwait(int, int);

  };
--- 7,76 ----
   * ----------
   */

  provider postgresql {

!     /*
!      * Due to a bug in Mac OS X 10.5, using defined types (e.g. uintptr_t,
!      * uint32_t, etc.) cause compilation error.
!      */
!
!     probe transaction__start(unsigned int transactionId);
!     probe transaction__commit(unsigned int transactionId);
!     probe transaction__abort(unsigned int transactionId);
!
!     probe lwlock__acquire(unsigned int lockId, unsigned int mode);
!     probe lwlock__release(unsigned int lockId);
!     probe lwlock__startwait(unsigned int lockId, unsigned int mode);
!     probe lwlock__endwait(unsigned int lockId, unsigned int mode);
!     probe lwlock__condacquire(unsigned int lockId, unsigned int mode);
!     probe lwlock__condacquire__fail(unsigned int lockId, unsigned int mode);
!     probe lock__startwait(unsigned int lockTagField2, int lockMode);
!     probe lock__endwait(unsigned int lockTagField2, int lockMode);
!
!     probe query__parse__start(char *queryString);
!     probe query__parse__done();
!     probe query__plan__start();
!     probe query__plan__done();
!     probe query__execute__start();
!     probe query__execute__done();
!     probe query__start(char *queryString);
!     probe query__done(char *queryString);
!     probe statement__status(char *cmdString);
!
!     probe sort__start(int, int, int, int, int);
!     probe sort__end(int, long);
!
!     probe buffer__read__start(int blockNum, int tablespaceOid,
!             int databaseOid, int relationOid, int isLocalBuf);
!     probe buffer__read__done(int blockNum, int tablespaceOid,
!              int databaseOid, int relationOid, int isLocalBuf,
!             int isInBuffPool);
!     probe buffer__write__dirty__start(int blockNum, int tablespaceOid,
!             int databaseOid, int relationOid);
!     probe buffer__write__dirty__done(int blockNum, int tablespaceOid,
!             int databaseOid, int relationOid, int isHintBitsWrite);
!
!     probe buffer__flush__start(int tablespaceOid,
!             int databaseOid, int relationOid, int blockNum);
!     probe buffer__flush__done(int tablespaceOid, int databaseOid,
!             int relationOid, int blockNum);
!     probe buffer__hit();
!     probe buffer__miss();
!     probe buffer__checkpoint__start(int);
!     probe buffer__checkpoint__done();
!     probe buffer__sync__start(int numBuffers, int numToWrite);
!     probe buffer__sync__written(int bufId);
!     probe buffer__sync__done(int numBuffers, int numWritten, int numToWrite);
!
!     probe deadlock__found();
!     probe deadlock__notfound(int);

+     probe clog__checkpoint__start();
+     probe clog__checkpoint__done();
+     probe subtrans__checkpoint__start();
+     probe subtrans__checkpoint__done();
+     probe multixact__checkpoint__start();
+     probe multixact__checkpoint__done();
+     probe twophase__checkpoint__start();
+     probe twophase__checkpoint__done();
  };
Index: src/backend/utils/sort/tuplesort.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/sort/tuplesort.c,v
retrieving revision 1.85
diff -c -r1.85 tuplesort.c
*** src/backend/utils/sort/tuplesort.c    19 Jun 2008 00:46:05 -0000    1.85
--- src/backend/utils/sort/tuplesort.c    24 Jul 2008 14:19:32 -0000
***************
*** 115,126 ****
--- 115,132 ----
  #include "utils/rel.h"
  #include "utils/syscache.h"
  #include "utils/tuplesort.h"
+ #include "pg_trace.h"


  /* GUC variables */
  #ifdef TRACE_SORT
  bool        trace_sort = false;
  #endif
+
+ #define HEAP_SORT    0
+ #define INDEX_SORT    1
+ #define DATUM_SORT    2
+
  #ifdef DEBUG_BOUNDED_SORT
  bool        optimize_bounded_sort = true;
  #endif
***************
*** 569,574 ****
--- 575,581 ----
               "begin tuple sort: nkeys = %d, workMem = %d, randomAccess = %c",
               nkeys, workMem, randomAccess ? 't' : 'f');
  #endif
+     TRACE_POSTGRESQL_SORT_START(HEAP_SORT, false, nkeys, workMem, randomAccess);

      state->nKeys = nkeys;

***************
*** 636,641 ****
--- 643,650 ----

      state->nKeys = RelationGetNumberOfAttributes(indexRel);

+     TRACE_POSTGRESQL_SORT_START(INDEX_SORT, enforceUnique, state->nKeys, workMem, randomAccess);
+
      state->comparetup = comparetup_index_btree;
      state->copytup = copytup_index;
      state->writetup = writetup_index;
***************
*** 713,718 ****
--- 722,728 ----
               "begin datum sort: workMem = %d, randomAccess = %c",
               workMem, randomAccess ? 't' : 'f');
  #endif
+     TRACE_POSTGRESQL_SORT_START(DATUM_SORT, false, 1, workMem, randomAccess);

      state->nKeys = 1;            /* always a one-column sort */

***************
*** 825,830 ****
--- 835,845 ----
      }
  #endif

+     TRACE_POSTGRESQL_SORT_END(state->tapeset,
+             (state->tapeset ? LogicalTapeSetBlocks(state->tapeset) :
+             (state->allowedMem - state->availMem + 1023) / 1024));
+
+
      MemoryContextSwitchTo(oldcontext);

      /*

pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Uncopied parameters on CREATE TABLE LIKE
Next
From: Hannu Krosing
Date:
Subject: Re: Do we really want to migrate plproxy and citext into PG core distribution?