Thread: lock timeout patch

lock timeout patch

From
Satoshi Nagayasu
Date:
Hello all,

I've created a lock timeout patch and it's attached.

When a transaction is blocked by another transaction because of
waiting a lock, we need a lock timeout in some cases.

Using this patch, the lock timeout is enabled with
'lock_timeout = xxxx' directive in postgresql.conf,
and if a timeout is occured, an error code (40P02)
will be returned and a client application can detect it using

  JDBC: SQLException.getSQLState()
  C:    PQresultErrorField()

I know my code need to be cleaned up,
but any comments about this patch?

--
NAGAYASU Satoshi <nagayasus@nttdata.co.jp>

Index: backend/postmaster/postmaster.c
===================================================================
RCS file: /home/snaga/cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 postmaster.c
*** backend/postmaster/postmaster.c    10 Jun 2004 00:22:29 -0000    1.1.1.1
--- backend/postmaster/postmaster.c    18 Jun 2004 03:17:22 -0000
***************
*** 2418,2424 ****
       * after a time delay, so that a broken client can't hog a connection
       * indefinitely.  PreAuthDelay doesn't count against the time limit.
       */
!     if (!enable_sig_alarm(AuthenticationTimeout * 1000, false))
          elog(FATAL, "could not set timer for authorization timeout");

      /*
--- 2418,2424 ----
       * after a time delay, so that a broken client can't hog a connection
       * indefinitely.  PreAuthDelay doesn't count against the time limit.
       */
!     if (!enable_sig_alarm(AuthenticationTimeout * 1000, false, false))
          elog(FATAL, "could not set timer for authorization timeout");

      /*
***************
*** 2447,2453 ****
       * Done with authentication.  Disable timeout, and prevent
       * SIGTERM/SIGQUIT again until backend startup is complete.
       */
!     if (!disable_sig_alarm(false))
          elog(FATAL, "could not disable timer for authorization timeout");
      PG_SETMASK(&BlockSig);

--- 2447,2453 ----
       * Done with authentication.  Disable timeout, and prevent
       * SIGTERM/SIGQUIT again until backend startup is complete.
       */
!     if (!disable_sig_alarm(false, false))
          elog(FATAL, "could not disable timer for authorization timeout");
      PG_SETMASK(&BlockSig);

Index: backend/storage/lmgr/proc.c
===================================================================
RCS file: /home/snaga/cvsroot/pgsql/src/backend/storage/lmgr/proc.c,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 proc.c
*** backend/storage/lmgr/proc.c    10 Jun 2004 00:22:29 -0000    1.1.1.1
--- backend/storage/lmgr/proc.c    28 Jun 2004 03:20:10 -0000
***************
*** 52,60 ****
--- 52,68 ----
  #include "storage/sinval.h"
  #include "storage/spin.h"

+ #ifdef LOCKTIMEOUT_DEBUG
+ #define PRINT_TIME(MSG, X) \
+         elog(NOTICE, "%s: %d.%03d", MSG, (X).tv_sec, ((X).tv_usec/1000))
+ #else
+ #define PRINT_TIME(MSG, X) {}
+ #endif
+
  /* GUC variables */
  int            DeadlockTimeout = 1000;
  int            StatementTimeout = 0;
+ int            LockTimeout     = 0;

  /* Pointer to this process's PGPROC struct, if any */
  PGPROC       *MyProc = NULL;
***************
*** 78,92 ****
  /* Mark these volatile because they can be changed by signal handler */
  static volatile bool statement_timeout_active = false;
  static volatile bool deadlock_timeout_active = false;

  /* statement_fin_time is valid only if statement_timeout_active is true */
  static struct timeval statement_fin_time;
!

  static void ProcKill(void);
  static void DummyProcKill(void);
  static bool CheckStatementTimeout(void);


  /*
   * Report number of semaphores needed by InitProcGlobal.
--- 86,102 ----
  /* Mark these volatile because they can be changed by signal handler */
  static volatile bool statement_timeout_active = false;
  static volatile bool deadlock_timeout_active = false;
+ static volatile bool lock_timeout_active = false;

  /* statement_fin_time is valid only if statement_timeout_active is true */
  static struct timeval statement_fin_time;
! static struct timeval lock_fin_time;

  static void ProcKill(void);
  static void DummyProcKill(void);
  static bool CheckStatementTimeout(void);

+ static LOCK *prevWaitLock = NULL;

  /*
   * Report number of semaphores needed by InitProcGlobal.
***************
*** 244,249 ****
--- 254,261 ----
      MyProc->waitHolder = NULL;
      SHMQueueInit(&(MyProc->procHolders));

+     prevWaitLock = NULL;
+
      /*
       * Arrange to clean up at backend exit.
       */
***************
*** 307,312 ****
--- 319,326 ----
      MyProc->waitHolder = NULL;
      SHMQueueInit(&(MyProc->procHolders));

+     prevWaitLock = NULL;
+
      /*
       * Arrange to clean up at process exit.
       */
***************
*** 338,344 ****
      waitingForLock = false;

      /* Turn off the deadlock timer, if it's still running (see ProcSleep) */
!     disable_sig_alarm(false);

      /* Unlink myself from the wait queue, if on it (might not be anymore!) */
      LWLockAcquire(LockMgrLock, LW_EXCLUSIVE);
--- 352,358 ----
      waitingForLock = false;

      /* Turn off the deadlock timer, if it's still running (see ProcSleep) */
!     disable_sig_alarm(false, false);

      /* Unlink myself from the wait queue, if on it (might not be anymore!) */
      LWLockAcquire(LockMgrLock, LW_EXCLUSIVE);
***************
*** 627,632 ****
--- 641,648 ----

      MyProc->errType = STATUS_OK;    /* initialize result for success */

+     prevWaitLock = MyProc->waitLock;
+
      /*
       * If we detected deadlock, give up without waiting.  This must agree
       * with CheckDeadLock's recovery code, except that we shouldn't
***************
*** 661,667 ****
       * By delaying the check until we've waited for a bit, we can avoid
       * running the rather expensive deadlock-check code in most cases.
       */
!     if (!enable_sig_alarm(DeadlockTimeout, false))
          elog(FATAL, "could not set timer for process wakeup");

      /*
--- 677,683 ----
       * By delaying the check until we've waited for a bit, we can avoid
       * running the rather expensive deadlock-check code in most cases.
       */
!     if (!enable_sig_alarm(DeadlockTimeout, false, false))
          elog(FATAL, "could not set timer for process wakeup");

      /*
***************
*** 683,689 ****
      /*
       * Disable the timer, if it's still running
       */
!     if (!disable_sig_alarm(false))
          elog(FATAL, "could not disable timer for process wakeup");

      /*
--- 699,705 ----
      /*
       * Disable the timer, if it's still running
       */
!     if (!disable_sig_alarm(false, false))
          elog(FATAL, "could not disable timer for process wakeup");

      /*
***************
*** 738,743 ****
--- 754,761 ----
      proc->waitHolder = NULL;
      proc->errType = errType;

+     prevWaitLock = NULL;
+
      /* And awaken it */
      PGSemaphoreUnlock(&proc->sem);

***************
*** 885,890 ****
--- 903,911 ----
       * RemoveFromWaitQueue took care of waking up any such processes.
       */
      LWLockRelease(LockMgrLock);
+
+     deadlock_timeout_active = false;
+     prevWaitLock = NULL;
  }


***************
*** 930,936 ****
          PGSemaphoreUnlock(&proc->sem);
  }

-
  /*****************************************************************************
   * SIGALRM interrupt support
   *
--- 951,956 ----
***************
*** 949,955 ****
   * Returns TRUE if okay, FALSE on failure.
   */
  bool
! enable_sig_alarm(int delayms, bool is_statement_timeout)
  {
  #ifdef WIN32
  #warning add Win32 timer
--- 969,975 ----
   * Returns TRUE if okay, FALSE on failure.
   */
  bool
! enable_sig_alarm(int delayms, bool is_statement_timeout, bool is_lock_timeout)
  {
  #ifdef WIN32
  #warning add Win32 timer
***************
*** 963,970 ****
      bigtime_t    time_interval;
  #endif

      /* Compute target timeout time if we will need it */
!     if (is_statement_timeout || statement_timeout_active)
      {
          gettimeofday(&fin_time, NULL);
          fin_time.tv_sec += delayms / 1000;
--- 983,999 ----
      bigtime_t    time_interval;
  #endif

+ #ifdef LOCKTIMEOUT_DEBUG
+     elog(NOTICE, "enable_sig_alarm: is_statement_timeout=%d, "
+          "is_lock_timeout=%d\n",
+          is_statement_timeout,
+          is_lock_timeout);
+     elog(NOTICE, "enable_sig_alarm: LockTimeout=%d", LockTimeout);
+ #endif
+
      /* Compute target timeout time if we will need it */
!     if (is_statement_timeout || statement_timeout_active ||
!         is_lock_timeout || lock_timeout_active )
      {
          gettimeofday(&fin_time, NULL);
          fin_time.tv_sec += delayms / 1000;
***************
*** 974,979 ****
--- 1003,1009 ----
              fin_time.tv_sec++;
              fin_time.tv_usec -= 1000000;
          }
+         PRINT_TIME("enable_sig_alarm", fin_time);
      }

      if (is_statement_timeout)
***************
*** 983,1012 ****
          statement_fin_time = fin_time;
          statement_timeout_active = true;
      }
!     else if (statement_timeout_active)
      {
!         /*
!          * Begin deadlock timeout with statement-level timeout active
!          *
!          * Here, we want to interrupt at the closer of the two timeout times.
!          * If fin_time >= statement_fin_time then we need not touch the
!          * existing timer setting; else set up to interrupt at the
!          * deadlock timeout time.
!          *
!          * NOTE: in this case it is possible that this routine will be
!          * interrupted by the previously-set timer alarm.  This is okay
!          * because the signal handler will do only what it should do
!          * according to the state variables.  The deadlock checker may get
!          * run earlier than normal, but that does no harm.
!          */
!         deadlock_timeout_active = true;
!         if (fin_time.tv_sec > statement_fin_time.tv_sec ||
!             (fin_time.tv_sec == statement_fin_time.tv_sec &&
!              fin_time.tv_usec >= statement_fin_time.tv_usec))
!             return true;
      }
      else
      {
          /* Begin deadlock timeout with no statement-level timeout */
          deadlock_timeout_active = true;
      }
--- 1013,1066 ----
          statement_fin_time = fin_time;
          statement_timeout_active = true;
      }
!     else if ( is_lock_timeout )
      {
!         Assert(!lock_timeout_active);
!         lock_fin_time = fin_time;
!         lock_timeout_active = true;
!         prevWaitLock = MyProc->waitLock;
      }
      else
      {
+         /*
+          * If is_statement_timeout and is_lock_timeout are false,
+          * delayms means a timeout time to detect deadlock.
+          */
+         if (statement_timeout_active)
+         {
+             /*
+              * Begin deadlock timeout with statement-level timeout active
+              *
+              * Here, we want to interrupt at the closer of the two timeout times.
+              * If fin_time >= statement_fin_time then we need not touch the
+              * existing timer setting; else set up to interrupt at the
+              * deadlock timeout time.
+              *
+              * NOTE: in this case it is possible that this routine will be
+              * interrupted by the previously-set timer alarm.  This is okay
+              * because the signal handler will do only what it should do
+              * according to the state variables.  The deadlock checker may get
+              * run earlier than normal, but that does no harm.
+              */
+             deadlock_timeout_active = true;
+             if (fin_time.tv_sec > statement_fin_time.tv_sec ||
+                 (fin_time.tv_sec == statement_fin_time.tv_sec &&
+                  fin_time.tv_usec >= statement_fin_time.tv_usec))
+                 return true;
+         }
+         if (lock_timeout_active)
+         {
+             /*
+              * If the lock timeout has been already set, and it is shorter
+              * than me, there is no need to set another one.
+              * The shortest timer has to be set.
+              */
+             deadlock_timeout_active = true;
+             if (fin_time.tv_sec > lock_fin_time.tv_sec ||
+                 (fin_time.tv_sec == lock_fin_time.tv_sec &&
+                  fin_time.tv_usec >= lock_fin_time.tv_usec))
+                 return true;
+         }
          /* Begin deadlock timeout with no statement-level timeout */
          deadlock_timeout_active = true;
      }
***************
*** 1016,1021 ****
--- 1070,1086 ----
      MemSet(&timeval, 0, sizeof(struct itimerval));
      timeval.it_value.tv_sec = delayms / 1000;
      timeval.it_value.tv_usec = (delayms % 1000) * 1000;
+     {
+         PRINT_TIME("enable_sig_alarm", timeval.it_value);
+ #ifdef LOCKTIMEOUT_DEBUG
+         elog(NOTICE, "enable_sig_alarm: deadlock_timeout_active=%d, "
+              "statement_timeout_active=%d, "
+              "lock_timeout_active=%d\n",
+              deadlock_timeout_active,
+              statement_timeout_active,
+              lock_timeout_active);
+ #endif
+     }
      if (setitimer(ITIMER_REAL, &timeval, NULL))
          return false;
  #else
***************
*** 1025,1030 ****
--- 1090,1098 ----
          return false;
  #endif
  #endif
+
+     PRINT_TIME("enable_sig_alarm: lock_fin_time", lock_fin_time);
+
      return true;
  }

***************
*** 1036,1042 ****
   * Returns TRUE if okay, FALSE on failure.
   */
  bool
! disable_sig_alarm(bool is_statement_timeout)
  {
  #ifdef WIN32
  #warning add Win32 timer
--- 1104,1110 ----
   * Returns TRUE if okay, FALSE on failure.
   */
  bool
! disable_sig_alarm(bool is_statement_timeout, bool is_lock_timeout)
  {
  #ifdef WIN32
  #warning add Win32 timer
***************
*** 1049,1055 ****
       *
       * We will re-enable the interrupt if necessary in CheckStatementTimeout.
       */
!     if (statement_timeout_active || deadlock_timeout_active)
      {
  #ifndef __BEOS__
          struct itimerval timeval;
--- 1117,1124 ----
       *
       * We will re-enable the interrupt if necessary in CheckStatementTimeout.
       */
!     if (statement_timeout_active || deadlock_timeout_active ||
!         lock_timeout_active )
      {
  #ifndef __BEOS__
          struct itimerval timeval;
***************
*** 1057,1070 ****
          MemSet(&timeval, 0, sizeof(struct itimerval));
          if (setitimer(ITIMER_REAL, &timeval, NULL))
          {
!             statement_timeout_active = deadlock_timeout_active = false;
              return false;
          }
  #else
          /* BeOS doesn't have setitimer, but has set_alarm */
          if (set_alarm(B_INFINITE_TIMEOUT, B_PERIODIC_ALARM) < 0)
          {
!             statement_timeout_active = deadlock_timeout_active = false;
              return false;
          }
  #endif
--- 1126,1143 ----
          MemSet(&timeval, 0, sizeof(struct itimerval));
          if (setitimer(ITIMER_REAL, &timeval, NULL))
          {
!             statement_timeout_active = deadlock_timeout_active
!                     = lock_timeout_active = false;
!             prevWaitLock = NULL;
              return false;
          }
  #else
          /* BeOS doesn't have setitimer, but has set_alarm */
          if (set_alarm(B_INFINITE_TIMEOUT, B_PERIODIC_ALARM) < 0)
          {
!             statement_timeout_active = deadlock_timeout_active
!                     = lock_timeout_active = false;
!             prevWaitLock = NULL;
              return false;
          }
  #endif
***************
*** 1076,1081 ****
--- 1149,1159 ----
      /* Cancel or reschedule statement timeout */
      if (is_statement_timeout)
          statement_timeout_active = false;
+     else if (is_lock_timeout)
+     {
+         lock_timeout_active = false;
+         prevWaitLock = NULL;
+     }
      else if (statement_timeout_active)
      {
          if (!CheckStatementTimeout())
***************
*** 1086,1091 ****
--- 1164,1196 ----
  }


+ static struct timeval *
+ GetShorterTimer(struct timeval *r, struct timeval t1, struct timeval t2)
+ {
+     if ( t1.tv_sec<t2.tv_sec )
+     {
+         r->tv_sec = t1.tv_sec;
+         r->tv_usec = t1.tv_usec;
+     }
+     else if ( t1.tv_sec>t2.tv_sec )
+     {
+         r->tv_sec = t2.tv_sec;
+         r->tv_usec = t2.tv_usec;
+     }
+     else if ( t1.tv_usec<t2.tv_usec )
+     {
+         r->tv_sec = t1.tv_sec;
+         r->tv_usec = t1.tv_usec;
+     }
+     else
+     {
+         r->tv_sec = t2.tv_sec;
+         r->tv_usec = t2.tv_usec;
+     }
+     return r;
+ }
+
+
  /*
   * Check for statement timeout.  If the timeout time has come,
   * trigger a query-cancel interrupt; if not, reschedule the SIGALRM
***************
*** 1121,1128 ****
          struct itimerval timeval;

          MemSet(&timeval, 0, sizeof(struct itimerval));
!         timeval.it_value.tv_sec = statement_fin_time.tv_sec - now.tv_sec;
!         timeval.it_value.tv_usec = statement_fin_time.tv_usec - now.tv_usec;
          if (timeval.it_value.tv_usec < 0)
          {
              timeval.it_value.tv_sec--;
--- 1226,1242 ----
          struct itimerval timeval;

          MemSet(&timeval, 0, sizeof(struct itimerval));
!
!         if ( statement_timeout_active && lock_timeout_active )
!             GetShorterTimer(&(timeval.it_value), statement_fin_time,lock_fin_time);
!         else if ( statement_timeout_active )
!             timeval.it_value = statement_fin_time;
!         else if ( lock_timeout_active )
!             timeval.it_value = lock_fin_time;
!
!         timeval.it_value.tv_sec -= now.tv_sec;
!         timeval.it_value.tv_usec -= now.tv_usec;
!
          if (timeval.it_value.tv_usec < 0)
          {
              timeval.it_value.tv_sec--;
***************
*** 1131,1136 ****
--- 1245,1251 ----
          if (setitimer(ITIMER_REAL, &timeval, NULL))
              return false;
  #else
+ #warning *** LOCKTIMEOUT NOT IMPLEMENTED!!! ***
          /* BeOS doesn't have setitimer, but has set_alarm */
          bigtime_t    time_interval;

***************
*** 1146,1151 ****
--- 1261,1346 ----
      return true;
  }

+ static bool
+ CheckLockTimeout(void)
+ {
+     struct timeval now;
+
+     if (!lock_timeout_active)
+         return true;            /* do nothing if not active */
+
+     gettimeofday(&now, NULL);
+
+     PRINT_TIME("CheckLockTimeout: now", now);
+     PRINT_TIME("CheckLockTimeout: lock_fin_time", lock_fin_time);
+
+     if (now.tv_sec > lock_fin_time.tv_sec ||
+         (now.tv_sec == lock_fin_time.tv_sec &&
+          now.tv_usec >= lock_fin_time.tv_usec))
+
+     elog(DEBUG1, "MyProc.lwWaiting=%d, MyProc.lwExclusive=%d", MyProc->lwWaiting,
+          MyProc->lwWaiting);
+     elog(DEBUG1, "MyProc.waitLock=%p", MyProc->waitLock);
+
+     /*
+      * Still waiting a same lock.
+      */
+     if ( MyProc->waitLock && MyProc->waitLock==prevWaitLock )
+     {
+         /* Time to die */
+         lock_timeout_active = false;
+         prevWaitLock = NULL;
+
+         ereport(ERROR, (errcode(ERRCODE_T_R_LOCKTIMEOUT_DETECTED),
+                 errmsg("the current transaction is going to be rolled-back because of lock timeout.")));
+     }
+     else
+     {
+         /* Not time yet, so (re)schedule the interrupt */
+ #ifdef WIN32
+ #warning add win32 timer
+ #else
+ #ifndef __BEOS__
+         struct itimerval timeval;
+
+         MemSet(&timeval, 0, sizeof(struct itimerval));
+
+         if ( statement_timeout_active && lock_timeout_active )
+             GetShorterTimer(&(timeval.it_value), statement_fin_time,lock_fin_time);
+         else if ( statement_timeout_active )
+             timeval.it_value = statement_fin_time;
+         else if ( lock_timeout_active )
+             timeval.it_value = lock_fin_time;
+
+         PRINT_TIME("CheckLockTimeout: timeval.it_value", timeval.it_value);
+         timeval.it_value.tv_sec -= now.tv_sec;
+         timeval.it_value.tv_usec -= now.tv_usec;
+         PRINT_TIME("CheckLockTimeout: timeval.it_value", timeval.it_value);
+
+         if (timeval.it_value.tv_usec < 0)
+         {
+             timeval.it_value.tv_sec--;
+             timeval.it_value.tv_usec += 1000000;
+         }
+         if (setitimer(ITIMER_REAL, &timeval, NULL))
+             return false;
+ #else
+ #warning *** NOT IMPLEMENTED!!! ***
+         /* BeOS doesn't have setitimer, but has set_alarm */
+         bigtime_t    time_interval;
+
+         time_interval =
+             (lock_fin_time.tv_sec - now.tv_sec) * 1000000 +
+             (lock_fin_time.tv_usec - now.tv_usec);
+         if (set_alarm(time_interval, B_ONE_SHOT_RELATIVE_ALARM) < 0)
+             return false;
+ #endif
+ #endif
+     }
+
+     return true;
+ }
+

  /*
   * Signal handler for SIGALRM
***************
*** 1160,1173 ****
  {
      int            save_errno = errno;

      if (deadlock_timeout_active)
      {
-         deadlock_timeout_active = false;
          CheckDeadLock();
      }

      if (statement_timeout_active)
          (void) CheckStatementTimeout();

      errno = save_errno;
  }
--- 1355,1382 ----
  {
      int            save_errno = errno;

+ //    elog(NOTICE, "handle_sig_alarm: deadlock_timeout_active=%d, statement_timeout_active=%d,
lock_timeout_active=%d",deadlock_timeout_active, statement_timeout_active, lock_timeout_active); 
+
+     PRINT_TIME("handle_sig_alarm: lock_fin_time", lock_fin_time);
+
      if (deadlock_timeout_active)
      {
          CheckDeadLock();
      }

+     PRINT_TIME("handle_sig_alarm: lock_fin_time", lock_fin_time);
+
      if (statement_timeout_active)
+     {
          (void) CheckStatementTimeout();
+     }
+
+     PRINT_TIME("handle_sig_alarm: lock_fin_time", lock_fin_time);
+
+     if (lock_timeout_active)
+     {
+         (void) CheckLockTimeout();
+     }

      errno = save_errno;
  }
Index: backend/tcop/postgres.c
===================================================================
RCS file: /home/snaga/cvsroot/pgsql/src/backend/tcop/postgres.c,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 postgres.c
*** backend/tcop/postgres.c    10 Jun 2004 00:22:29 -0000    1.1.1.1
--- backend/tcop/postgres.c    18 Jun 2004 03:16:32 -0000
***************
*** 1734,1740 ****

          /* Set statement timeout running, if any */
          if (StatementTimeout > 0)
!             enable_sig_alarm(StatementTimeout, true);

          xact_started = true;
      }
--- 1734,1742 ----

          /* Set statement timeout running, if any */
          if (StatementTimeout > 0)
!             enable_sig_alarm(StatementTimeout, true, false);
!         if (LockTimeout > 0)
!             enable_sig_alarm(LockTimeout, false, true);

          xact_started = true;
      }
***************
*** 1749,1755 ****
          DeferredTriggerEndQuery();

          /* Cancel any active statement timeout before committing */
!         disable_sig_alarm(true);

          /* Now commit the command */
          ereport(DEBUG3,
--- 1751,1757 ----
          DeferredTriggerEndQuery();

          /* Cancel any active statement timeout before committing */
!         disable_sig_alarm(true, true);

          /* Now commit the command */
          ereport(DEBUG3,
***************
*** 2704,2710 ****
          QueryCancelPending = false;
          InterruptHoldoffCount = 1;
          CritSectionCount = 0;    /* should be unnecessary, but... */
!         disable_sig_alarm(true);
          QueryCancelPending = false;        /* again in case timeout occurred */
          DisableNotifyInterrupt();
          debug_query_string = NULL;
--- 2706,2712 ----
          QueryCancelPending = false;
          InterruptHoldoffCount = 1;
          CritSectionCount = 0;    /* should be unnecessary, but... */
!         disable_sig_alarm(true, true);
          QueryCancelPending = false;        /* again in case timeout occurred */
          DisableNotifyInterrupt();
          debug_query_string = NULL;
Index: backend/utils/misc/guc.c
===================================================================
RCS file: /home/snaga/cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 guc.c
*** backend/utils/misc/guc.c    10 Jun 2004 00:22:33 -0000    1.1.1.1
--- backend/utils/misc/guc.c    28 Jun 2004 01:30:20 -0000
***************
*** 910,917 ****
              NULL
          },
          &DeadlockTimeout,
!         1000, 0, INT_MAX, NULL, NULL
      },

  #ifdef HAVE_SYSLOG
      {
--- 910,927 ----
              NULL
          },
          &DeadlockTimeout,
!         5000, 0, INT_MAX, NULL, NULL
      },
+
+     {
+         {"lock_timeout", PGC_SIGHUP, LOCK_MANAGEMENT,
+             gettext_noop("The time in milliseconds to wait on lock before detecting timeout."),
+             NULL
+         },
+         &LockTimeout,
+         0, 0, INT_MAX, NULL, NULL
+     },
+

  #ifdef HAVE_SYSLOG
      {
Index: include/storage/proc.h
===================================================================
RCS file: /home/snaga/cvsroot/pgsql/src/include/storage/proc.h,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 proc.h
*** include/storage/proc.h    10 Jun 2004 00:22:34 -0000    1.1.1.1
--- include/storage/proc.h    15 Jun 2004 08:20:34 -0000
***************
*** 88,93 ****
--- 88,94 ----

  /* configurable options */
  extern int    DeadlockTimeout;
+ extern int    LockTimeout;
  extern int    StatementTimeout;


***************
*** 111,118 ****
  extern void ProcCancelWaitForSignal(void);
  extern void ProcSendSignal(BackendId procId);

! extern bool enable_sig_alarm(int delayms, bool is_statement_timeout);
! extern bool disable_sig_alarm(bool is_statement_timeout);
  extern void handle_sig_alarm(SIGNAL_ARGS);

  #endif   /* PROC_H */
--- 112,120 ----
  extern void ProcCancelWaitForSignal(void);
  extern void ProcSendSignal(BackendId procId);

! extern bool enable_sig_alarm(int delayms, bool is_statement_timeout,
!                              bool is_lock_timeout);
! extern bool disable_sig_alarm(bool is_statement_timeout, bool is_lock_timeout);
  extern void handle_sig_alarm(SIGNAL_ARGS);

  #endif   /* PROC_H */
Index: include/utils/errcodes.h
===================================================================
RCS file: /home/snaga/cvsroot/pgsql/src/include/utils/errcodes.h,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 errcodes.h
*** include/utils/errcodes.h    10 Jun 2004 00:22:34 -0000    1.1.1.1
--- include/utils/errcodes.h    18 Jun 2004 06:29:09 -0000
***************
*** 203,208 ****
--- 203,211 ----
  #define ERRCODE_T_R_SERIALIZATION_FAILURE    MAKE_SQLSTATE('4','0', '0','0','1')
  #define ERRCODE_T_R_STATEMENT_COMPLETION_UNKNOWN    MAKE_SQLSTATE('4','0', '0','0','3')
  #define ERRCODE_T_R_DEADLOCK_DETECTED        MAKE_SQLSTATE('4','0', 'P','0','1')
+ #ifdef LOCKTIMEOUT
+  #define ERRCODE_T_R_LOCKTIMEOUT_DETECTED        MAKE_SQLSTATE('4','0', 'P','0','2')
+ #endif

  /* Class 42 - Syntax Error or Access Rule Violation */
  #define ERRCODE_SYNTAX_ERROR_OR_ACCESS_RULE_VIOLATION        MAKE_SQLSTATE('4','2', '0','0','0')

Re: lock timeout patch

From
Tom Lane
Date:
Satoshi Nagayasu <nagayasus@nttdata.co.jp> writes:
> When a transaction is blocked by another transaction because of
> waiting a lock, we need a lock timeout in some cases.

Isn't there an existing solution for this problem?
        regards, tom lane


Re: lock timeout patch

From
Satoshi Nagayasu
Date:
Tom,

I guess the transaction cancellation from the client
using PQrequestCancel() is available, but the cancellation
logic must be implemented in the client-application using
signal or thread.

I think detecting such situation on server-side is not
available now, and SQL Server or DB2 have same function.

Tom Lane wrote:
> Satoshi Nagayasu <nagayasus@nttdata.co.jp> writes:
> 
>>When a transaction is blocked by another transaction because of
>>waiting a lock, we need a lock timeout in some cases.
> 
> 
> Isn't there an existing solution for this problem?
> 
>             regards, tom lane
> 

-- 
NAGAYASU Satoshi <nagayasus@nttdata.co.jp>


Re: lock timeout patch

From
Tom Lane
Date:
Satoshi Nagayasu <nagayasus@nttdata.co.jp> writes:
> I guess the transaction cancellation from the client
> using PQrequestCancel() is available, but the cancellation
> logic must be implemented in the client-application using
> signal or thread.

Actually I think the recommended solution involves using statement_timeout.
        regards, tom lane


Re: lock timeout patch

From
Satoshi Nagayasu
Date:
statement_timeout terminates large sort or scan
even if it is running, doesn't it?

statement_timeout doesn't care that
the process is waiting a lock or running.
I don't want to terminate a running query.

So a lock waiting backend shold be killed.

Tom Lane wrote:
> Satoshi Nagayasu <nagayasus@nttdata.co.jp> writes:
> 
>>I guess the transaction cancellation from the client
>>using PQrequestCancel() is available, but the cancellation
>>logic must be implemented in the client-application using
>>signal or thread.
> 
> 
> Actually I think the recommended solution involves using statement_timeout.
> 
>             regards, tom lane
> 


-- 
NAGAYASU Satoshi <nagayasus@nttdata.co.jp>


Re: lock timeout patch

From
Tom Lane
Date:
Satoshi Nagayasu <nagayasus@nttdata.co.jp> writes:
> statement_timeout terminates large sort or scan
> even if it is running, doesn't it?

> statement_timeout doesn't care that
> the process is waiting a lock or running.
> I don't want to terminate a running query.

> So a lock waiting backend shold be killed.

This argument holds no water.  On what will you base your estimate of
a good value for lock_timeout?  It is nothing more than your estimate
of the statement runtime for some other backend that is currently
holding the lock you want ... an estimate which surely has less, not
more, reliability than the estimate you could make of the maximum
runtime of your own statement, because you have less information about
just what that other backend is doing.  (And both you and the other
backend are in turn dependent on waiting for locks held by third
parties, etc etc.)

I'd accept a mechanism to enforce a timeout at the lock level if you
could show me a convincing use-case for lock timeouts instead of
statement timeouts, but I don't believe there is one.  I think this
proposal is a solution in search of a problem.
        regards, tom lane


Re: lock timeout patch

From
Satoshi Nagayasu
Date:
Tom Lane wrote:
> I'd accept a mechanism to enforce a timeout at the lock level if you
> could show me a convincing use-case for lock timeouts instead of
> statement timeouts, but I don't believe there is one.  I think this
> proposal is a solution in search of a problem.

I think statement_timeout and lock_timeout are different.

If I set statement_timeout to 1000 to detect a lock timeout,
I can't run a query which takes over 1 sec.

If a lock wait is occured, I want to detect it immediately,
but I still want to run a long-running query.

-- 
NAGAYASU Satoshi <nagayasus@nttdata.co.jp>


Re: lock timeout patch

From
Dennis Bjorklund
Date:
On Mon, 28 Jun 2004, Satoshi Nagayasu wrote:

> If I set statement_timeout to 1000 to detect a lock timeout,
> I can't run a query which takes over 1 sec.
> 
> If a lock wait is occured, I want to detect it immediately,
> but I still want to run a long-running query.

Why is it important what it is that makes your query not return as fast as
you expect? Maybe it's locking, maybe the computer is swapping, maybe it's
just lack of IO to that disk that holds the table, maybe it does a big
sort and have too little sort_mem to do it fast, ...

What makes locking special?

-- 
/Dennis Björklund



Re: lock timeout patch

From
Robert Treat
Date:
On Mon, 2004-06-28 at 02:16, Satoshi Nagayasu wrote:
> Tom Lane wrote:
> > I'd accept a mechanism to enforce a timeout at the lock level if you
> > could show me a convincing use-case for lock timeouts instead of
> > statement timeouts, but I don't believe there is one.  I think this
> > proposal is a solution in search of a problem.
> 
> I think statement_timeout and lock_timeout are different.
> 
> If I set statement_timeout to 1000 to detect a lock timeout,
> I can't run a query which takes over 1 sec.
> 
> If a lock wait is occured, I want to detect it immediately,
> but I still want to run a long-running query.
> 

How is your problem not solved by NOWAIT?
http://developer.postgresql.org/docs/postgres/sql-lock.html

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: lock timeout patch

From
Satoshi Nagayasu
Date:
Robert Treat wrote:
>>I think statement_timeout and lock_timeout are different.
>>
>>If I set statement_timeout to 1000 to detect a lock timeout,
>>I can't run a query which takes over 1 sec.
>>
>>If a lock wait is occured, I want to detect it immediately,
>>but I still want to run a long-running query.
> 
> How is your problem not solved by NOWAIT?
> http://developer.postgresql.org/docs/postgres/sql-lock.html

I agree that it's one of the solutions when we use LOCK explicitly.
But LOCK does only lock a whole table, doesn't it?

-- 
NAGAYASU Satoshi <nagayasus@nttdata.co.jp>


Re: lock timeout patch

From
Satoshi Nagayasu
Date:
Dennis Bjorklund wrote:
>>If I set statement_timeout to 1000 to detect a lock timeout,
>>I can't run a query which takes over 1 sec.
>>
>>If a lock wait is occured, I want to detect it immediately,
>>but I still want to run a long-running query.
> 
> 
> Why is it important what it is that makes your query not return as fast as
> you expect? Maybe it's locking, maybe the computer is swapping, maybe it's
> just lack of IO to that disk that holds the table, maybe it does a big
> sort and have too little sort_mem to do it fast, ...
> 
> What makes locking special?

Processing slow-down is just a hardware/software sizing issue.
Of course we need to fix it when a problem is occured,
but I think it's a different kind of problem.

In large databases, such as DSS(decision support system),
some queries takes one or more minutes. I think it's okey.

But I don't want to wait one or more minutes just for a lock.
I need to return a message to the user "retry later." or
something like that. It depends on various applications.

So I think detecting a lock waiting is important.

-- 
NAGAYASU Satoshi <nagayasus@nttdata.co.jp>


Re: lock timeout patch

From
Bruno Wolff III
Date:
On Tue, Jun 29, 2004 at 09:25:27 +0900, Satoshi Nagayasu <nagayasus@nttdata.co.jp> wrote:
> 
> But I don't want to wait one or more minutes just for a lock.
> I need to return a message to the user "retry later." or
> something like that. It depends on various applications.

Why not set statement timeout low when you are about to run a query
that you think should return quickly?


Re: lock timeout patch

From
Josh Berkus
Date:
Tom,

> I'd accept a mechanism to enforce a timeout at the lock level if you
> could show me a convincing use-case for lock timeouts instead of
> statement timeouts, but I don't believe there is one.  I think this
> proposal is a solution in search of a problem.

Hmmm ... didn't we argue this out with NOWAIT?   What did we conclude then?  
I'm reluctant to go over old ground repeatedly.

Let me say for myself that I would use this feature if it existed, but would 
not miss it a whole lot if the patch was rejected.    Here's the idea:

I have an OLAP database of regional office evaluations (in SQL Server, sadly) 
which requires that the evaluations, sometimes interlocking, of regions be 
"closed" simultaneously (in one transaction).   This means that during the 
closure process, certain kinds of data entry needs to be frozen out.   I am 
using SQL Server's lock timeout functionality for this; bascially, the data 
entry waits for 30 seconds, and then tells the user to try again in 10 
minutes.

I could do the same thing in PostgreSQL using NOWAIT and a loop on the client 
side.   But the lock timeout is somewhat easier.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Re: lock timeout patch

From
Simon Riggs
Date:
On Tue, 2004-06-29 at 18:36, Josh Berkus wrote:
> Tom,
> 
> > I'd accept a mechanism to enforce a timeout at the lock level if you
> > could show me a convincing use-case for lock timeouts instead of
> > statement timeouts, but I don't believe there is one.  I think this
> > proposal is a solution in search of a problem.
> 
> Hmmm ... didn't we argue this out with NOWAIT?   What did we conclude then?  
> I'm reluctant to go over old ground repeatedly.
> 
> Let me say for myself that I would use this feature if it existed, but would 
> not miss it a whole lot if the patch was rejected.    Here's the idea:
> 
Can't vouch for the patch, but I can say this would get used...

> I have an ... database ...
> which requires that the evaluations, sometimes interlocking, of regions be 
> "closed" simultaneously (in one transaction).   This means that during the 
> closure process, certain kinds of data entry needs to be frozen out.   I am 
> using ... lock timeout functionality for this; bascially, the data 
> entry waits for 30 seconds, and then tells the user to try again in 10 
> minutes.

Just implementing this same scenario, using DB2 (...). Of course, if I
had MVCC on that application, I could argue that this is not
required...is that the basis of the "not required" view?

> 
> I could do the same thing in PostgreSQL using NOWAIT and a loop on the client 
> side.   But the lock timeout is somewhat easier.

SQLServer and DB2 support a lock timeout system wide, simple but not
granular. Oracle supports the NOWAIT option, even though it supports
readers-dont-block locking. I prefer the NOWAIT option as it gives a
more detailed handle on the exact statements that you wish to wait, or
not.

Without NOWAIT, we would need to set lock_timeout = 30 (seconds)

Statement level timeout is a different thing entirely, since there are
very often statements that need to run for 2-3 hours (even more in some
cases), so statement level timeout is set to 10000 (seconds).

Best Regards, Simon Riggs



Re: lock timeout patch

From
"Merlin Moncure"
Date:
> Tom,
>
> > I'd accept a mechanism to enforce a timeout at the lock level if you
> > could show me a convincing use-case for lock timeouts instead of
> > statement timeouts, but I don't believe there is one.  I think this
> > proposal is a solution in search of a problem.
>
> Hmmm ... didn't we argue this out with NOWAIT?   What did we conclude
> then?
> I'm reluctant to go over old ground repeatedly.

The result of this debate was that there was some use for it.  NOWAIT is
now implemented for table locking but not for row locking.

Personally I think there is some use for forcing transactions to abort
as soon as a lock situation is detected (although I probably wouldn't
use it).  For row level locking I would suggest to the original poster
to compare xmin/xmax (check the docs) to pre check the row level lock
condition.  This is inelegant but it mostly works.

FWIW, I think the treatment of locking in the docs could use some
improvement.  Especially wrt MVCC and pessimistic locking and the 'big
picture' issues going on there (especially why the former is better than
the latter).

Merlin


Re: lock timeout patch

From
Tom Lane
Date:
"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
> FWIW, I think the treatment of locking in the docs could use some
> improvement.  Especially wrt MVCC and pessimistic locking and the 'big
> picture' issues going on there (especially why the former is better than
> the latter).

Send a patch ...
        regards, tom lane