Thread: statement_timeout logging

statement_timeout logging

From
Simon Riggs
Date:
Currently, when we set a statement_timeout and a query runs over that
time there is no log message to say that the statement has timed out. We
do get a message which says
    ERROR:  canceling query due to user request
and so in the server log it is impossible to tell the difference between
a query that has been cancelled deliberately by the user and a query
whose time(out) has come.

The timeout causes the query to be cancelled, which is an ERROR, so it
is possible that the query will be logged if log_min_error_statement is
set to ERROR or below. Again, there is no way to tell the difference
between an ERROR statement that has occurred during execution and an
ERROR statement that is generated by a syntax error. So, if you do set
log_min_error_statement=ERROR then you get swamped by syntax errors
rather than being able to see statement timeouts.

Logging these queries ought to be a valuable source of tuning
information, but as a result of the above, this is not currently the
case.

For now, I've written a quick patch to the log the query text with a
suitable error message.

I expect some debate as to whether the line
           if (log_min_error_statement > WARNING)
should be there at all, so comments are sought.

---

Longer term, we might handle this differently. If we differentiate
between two types of ERROR, at least for log_min_error_statement...
- one that prevents the query from ever running (syntax, parse errors)
- one that occurs because the query itself gets into difficulty, but not
so badly that it constitutes a FATAL error. Lets call it RUNERROR for
now.
I expect most people to want to log RUNERRORs, but to want to ignore the
more normal kind of ERRORs.

When/if we can differentiate between levels, I'll reverse out this patch
since we'll be able to handle it simply with the new errorlevel.

Best Regards, Simon Riggs

Attachment

Re: statement_timeout logging

From
Bruce Momjian
Date:
Simon Riggs wrote:
> Currently, when we set a statement_timeout and a query runs over that
> time there is no log message to say that the statement has timed out. We
> do get a message which says 
>     ERROR:  canceling query due to user request
> and so in the server log it is impossible to tell the difference between
> a query that has been cancelled deliberately by the user and a query
> whose time(out) has come.

I have updated the message to:
    errmsg("canceling query due to user request or statement timeout")));

> The timeout causes the query to be cancelled, which is an ERROR, so it
> is possible that the query will be logged if log_min_error_statement is
> set to ERROR or below. Again, there is no way to tell the difference
> between an ERROR statement that has occurred during execution and an
> ERROR statement that is generated by a syntax error. So, if you do set
> log_min_error_statement=ERROR then you get swamped by syntax errors
> rather than being able to see statement timeouts.
> 
> Logging these queries ought to be a valuable source of tuning
> information, but as a result of the above, this is not currently the
> case.
> 
> For now, I've written a quick patch to the log the query text with a
> suitable error message.
> 
> I expect some debate as to whether the line
>            if (log_min_error_statement > WARNING)
> should be there at all, so comments are sought.

I don't see why printing the query cancel from a timeout is any more
special than a user request for cancel or a simple query error.  If
users want statements to be printed on error, they will configure things
that way, if not, we should not print them.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: statement_timeout logging

From
Bruce Momjian
Date:
pgman wrote:
> Simon Riggs wrote:
> > Currently, when we set a statement_timeout and a query runs over that
> > time there is no log message to say that the statement has timed out. We
> > do get a message which says
> >     ERROR:  canceling query due to user request
> > and so in the server log it is impossible to tell the difference between
> > a query that has been cancelled deliberately by the user and a query
> > whose time(out) has come.
>
> I have updated the message to:
>
>      errmsg("canceling query due to user request or statement timeout")));

Oops.  Did we freeze the message strings already for this release?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: statement_timeout logging

From
Alvaro Herrera
Date:
On Fri, Sep 16, 2005 at 03:41:11PM -0400, Bruce Momjian wrote:
> pgman wrote:

> > I have updated the message to:
> >
> >      errmsg("canceling query due to user request or statement timeout")));
>
> Oops.  Did we freeze the message strings already for this release?

Not yet.

I'm not sure I agree with the wording though -- is there no way to tell
the difference?

--
Alvaro Herrera -- Valdivia, Chile         Architect, www.EnterpriseDB.com
"No single strategy is always right (Unless the boss says so)"
                                                  (Larry Wall)

Re: statement_timeout logging

From
"Simon Riggs"
Date:
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> I don't see why printing the query cancel from a timeout is any more
> special than a user request for cancel or a simple query error.  If
> users want statements to be printed on error, they will
> configure things
> that way, if not, we should not print them.

The log is for admins, not users.

If an admin sets statement_timeout, then it is a good idea to have the
SQL logged as a way of gaining performance information  for the *admin*,
who knows less about what users do, yet is still charged with the need
to tune if one user is effecting other users/jobs. That's a different
scenario than a user cancelling their query. The user knows they've
cancelled it, so they already know the SQL and can ask the admin if they
want it faster.  Generally the admin doesn't care if a user cancelled a
query, since it might be for a whole host of reasons, whereas a query
cancelled for statement_timeout has one specific cause.

So, I still want this functionality, so the "bad" query is in the logs.

Best Regards, Simon Riggs



Re: [DOCS] statement_timeout logging

From
"Simon Riggs"
Date:
>Alvaro Herrera
> On Fri, Sep 16, 2005 at 03:41:11PM -0400, Bruce Momjian wrote:
> > > I have updated the message to:
> > >
> > >      errmsg("canceling query due to user request or
> statement timeout")));
> >
> > Oops.  Did we freeze the message strings already for this release?
>
> Not yet.
>
> I'm not sure I agree with the wording though -- is there no
> way to tell
> the difference?

Yes, my patch. :-)

The message is generated following receipt of a SIGINT which is sent by
kill() after the statement_timeout.

My patch was able to generate a different message by logging just before
the kill() is called. (Obviously, there isn't much "after" in this
circumstance).

Best Regards, Simon Riggs


Re: statement_timeout logging

From
Bruce Momjian
Date:
Simon Riggs wrote:
> > From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
> > I don't see why printing the query cancel from a timeout is any more
> > special than a user request for cancel or a simple query error.  If
> > users want statements to be printed on error, they will
> > configure things
> > that way, if not, we should not print them.
> 
> The log is for admins, not users.
> 
> If an admin sets statement_timeout, then it is a good idea to have the
> SQL logged as a way of gaining performance information  for the *admin*,
> who knows less about what users do, yet is still charged with the need
> to tune if one user is effecting other users/jobs. That's a different
> scenario than a user cancelling their query. The user knows they've
> cancelled it, so they already know the SQL and can ask the admin if they
> want it faster.  Generally the admin doesn't care if a user cancelled a
> query, since it might be for a whole host of reasons, whereas a query
> cancelled for statement_timeout has one specific cause.
> 
> So, I still want this functionality, so the "bad" query is in the logs.

Well, the user can use use statement_timeout too, so it isn't just the
admin.  Also, logging queries can be a security issue, so the idea of
logging the query on duration timeout by default seems like a bad idea. 

We can go three ways.  We can add a boolean GUC to control printing of
the query during a timeout, but that seems like overkill.  We can add a
new level for log_min_error_statement that is just above error, but that
seems confusing.  I think the right solution would be to allow
log_min_duration_statement to work for canceled queries.  Right now,
log_min_duration_statement doesn't work for canceled queries because the
query never completes to give a final duration and hit the test code. 
Should that be fixed now or added to the TODO list?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: [DOCS] statement_timeout logging

From
Bruce Momjian
Date:
Simon Riggs wrote:
> >Alvaro Herrera
> > On Fri, Sep 16, 2005 at 03:41:11PM -0400, Bruce Momjian wrote:
> > > > I have updated the message to:
> > > >
> > > >      errmsg("canceling query due to user request or
> > statement timeout")));
> > >
> > > Oops.  Did we freeze the message strings already for this release?
> >
> > Not yet.
> >
> > I'm not sure I agree with the wording though -- is there no
> > way to tell
> > the difference?
>
> Yes, my patch. :-)
>
> The message is generated following receipt of a SIGINT which is sent by
> kill() after the statement_timeout.
>
> My patch was able to generate a different message by logging just before
> the kill() is called. (Obviously, there isn't much "after" in this
> circumstance).

Your patch just prints a statement "before" printing the cancel message.
What we should do is to print the proper cause of the cancel, and I am
working on a patch for that.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: statement_timeout logging

From
Simon Riggs
Date:
On Fri, 2005-09-16 at 20:48 -0400, Bruce Momjian wrote:
> We can go three ways.  We can add a boolean GUC to control printing of
> the query during a timeout, but that seems like overkill.  We can add a
> new level for log_min_error_statement that is just above error, but that
> seems confusing.  I think the right solution would be to allow
> log_min_duration_statement to work for canceled queries.  Right now,
> log_min_duration_statement doesn't work for canceled queries because the
> query never completes to give a final duration and hit the test code. 
> Should that be fixed now or added to the TODO list?

The last one seems the right way to go.  

So, reformat the message at statement_timeout, so that the log looks
exactly like log_min_duration_statement:

e.g.

LOG: statement_timeout has been activated to cancel statement
LOG: duration 1625652ms statement SELECT * from bigOne
LOG: query has been cancelled by user action

Perhaps we should change the message from kill() to be "statement"
rather than "query" also...

I'd vote fix now, but I guess that seems to be becoming a regular
viewpoint from me.

Best Regards, Simon Riggs





Re: statement_timeout logging

From
Bruce Momjian
Date:
Simon Riggs wrote:
> On Fri, 2005-09-16 at 20:48 -0400, Bruce Momjian wrote:
> > We can go three ways.  We can add a boolean GUC to control printing of
> > the query during a timeout, but that seems like overkill.  We can add a
> > new level for log_min_error_statement that is just above error, but that
> > seems confusing.  I think the right solution would be to allow
> > log_min_duration_statement to work for canceled queries.  Right now,
> > log_min_duration_statement doesn't work for canceled queries because the
> > query never completes to give a final duration and hit the test code.
> > Should that be fixed now or added to the TODO list?
>
> The last one seems the right way to go.

OK.  I tested it and it actually works, and I added documentation
suggesting its usage.  I was a little confused above because the
STATEMENT: line is only output to the server logs because of the way
elog.c handles a STATEMENT print for log_min_error_statement.  It does
not output to the client no matter what log_min_messages is set to, and
if someone is concerned about that we can fix it.  Technically STATEMENT
is not a log level message.

So, if I do:

    test=> SET statement_timeout = 1;
    SET
    test=> SET log_min_error_statement = 'ERROR';
    SET
    test=> SELECT * FROM pg_class;
    ERROR:  canceling statement due to statement timeout

in the logs I see:

    ERROR:  canceling statement due to statement timeout
    STATEMENT:  select * from pg_class;

> So, reformat the message at statement_timeout, so that the log looks
> exactly like log_min_duration_statement:
>
> e.g.
>
> LOG: statement_timeout has been activated to cancel statement
> LOG: duration 1625652ms statement SELECT * from bigOne
> LOG: query has been cancelled by user action

Not really.  The problem here is that the last line is wrong --- it was
not cancelled by user action.  The attached, applied patch adds a
"cancel_from_timeout" variable that properly records if the cancel was
because of a timeout or user interaction, and displays the proper log
message.

> Perhaps we should change the message from kill() to be "statement"
> rather than "query" also...

kill() is the wrong place to print the message.  You will see the
attached patch does it in a cleaner way.

> I'd vote fix now, but I guess that seems to be becoming a regular
> viewpoint from me.

OK, fixed.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: doc/src/sgml/config.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.17
diff -c -c -r1.17 config.sgml
*** doc/src/sgml/config.sgml    13 Sep 2005 15:24:56 -0000    1.17
--- doc/src/sgml/config.sgml    19 Sep 2005 16:58:53 -0000
***************
*** 3232,3238 ****
        <listitem>
         <para>
          Abort any statement that takes over the specified number of
!         milliseconds.  A value of zero (the default) turns off the limitation.
         </para>
        </listitem>
       </varlistentry>
--- 3232,3241 ----
        <listitem>
         <para>
          Abort any statement that takes over the specified number of
!         milliseconds.  If <varname>log_min_error_statement</> is set to
!         <literal>ERROR</> or lower, the statement that timed out will also be
!         logged.  A value of zero (the default) turns off the
!         limitation.
         </para>
        </listitem>
       </varlistentry>
Index: src/backend/storage/lmgr/proc.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/storage/lmgr/proc.c,v
retrieving revision 1.163
diff -c -c -r1.163 proc.c
*** src/backend/storage/lmgr/proc.c    20 Aug 2005 23:26:24 -0000    1.163
--- src/backend/storage/lmgr/proc.c    19 Sep 2005 16:58:55 -0000
***************
*** 78,83 ****
--- 78,84 ----
  /* Mark these volatile because they can be changed by signal handler */
  static volatile bool statement_timeout_active = false;
  static volatile bool deadlock_timeout_active = false;
+ volatile bool cancel_from_timeout = false;

  /* statement_fin_time is valid only if statement_timeout_active is true */
  static struct timeval statement_fin_time;
***************
*** 1058,1063 ****
--- 1059,1065 ----
          Assert(!deadlock_timeout_active);
          statement_fin_time = fin_time;
          statement_timeout_active = true;
+         cancel_from_timeout = false;
      }
      else if (statement_timeout_active)
      {
***************
*** 1128,1141 ****
          MemSet(&timeval, 0, sizeof(struct itimerval));
          if (setitimer(ITIMER_REAL, &timeval, NULL))
          {
!             statement_timeout_active = deadlock_timeout_active = false;
              return false;
          }
  #else
          /* BeOS doesn't have setitimer, but has set_alarm */
          if (set_alarm(B_INFINITE_TIMEOUT, B_PERIODIC_ALARM) < 0)
          {
!             statement_timeout_active = deadlock_timeout_active = false;
              return false;
          }
  #endif
--- 1130,1147 ----
          MemSet(&timeval, 0, sizeof(struct itimerval));
          if (setitimer(ITIMER_REAL, &timeval, NULL))
          {
!             statement_timeout_active = false;
!             cancel_from_timeout = false;
!             deadlock_timeout_active = false;
              return false;
          }
  #else
          /* BeOS doesn't have setitimer, but has set_alarm */
          if (set_alarm(B_INFINITE_TIMEOUT, B_PERIODIC_ALARM) < 0)
          {
!             statement_timeout_active = false;
!             cancel_from_timeout = false;
!             deadlock_timeout_active = false;
              return false;
          }
  #endif
***************
*** 1146,1152 ****
--- 1152,1161 ----

      /* Cancel or reschedule statement timeout */
      if (is_statement_timeout)
+     {
          statement_timeout_active = false;
+         cancel_from_timeout = false;
+     }
      else if (statement_timeout_active)
      {
          if (!CheckStatementTimeout())
***************
*** 1179,1184 ****
--- 1188,1194 ----
      {
          /* Time to die */
          statement_timeout_active = false;
+         cancel_from_timeout = true;
          kill(MyProcPid, SIGINT);
      }
      else
Index: src/backend/tcop/postgres.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/tcop/postgres.c,v
retrieving revision 1.459
diff -c -c -r1.459 postgres.c
*** src/backend/tcop/postgres.c    16 Sep 2005 19:31:04 -0000    1.459
--- src/backend/tcop/postgres.c    19 Sep 2005 16:58:58 -0000
***************
*** 1979,1985 ****
          /* Set statement timeout running, if any */
          if (StatementTimeout > 0)
              enable_sig_alarm(StatementTimeout, true);
!
          xact_started = true;
      }
  }
--- 1979,1987 ----
          /* Set statement timeout running, if any */
          if (StatementTimeout > 0)
              enable_sig_alarm(StatementTimeout, true);
!         else
!             cancel_from_timeout = false;
!
          xact_started = true;
      }
  }
***************
*** 2203,2211 ****
          ImmediateInterruptOK = false;    /* not idle anymore */
          DisableNotifyInterrupt();
          DisableCatchupInterrupt();
!         ereport(ERROR,
!                 (errcode(ERRCODE_QUERY_CANCELED),
!                  errmsg("canceling query due to user request or statement timeout")));
      }
      /* If we get here, do nothing (probably, QueryCancelPending was reset) */
  }
--- 2205,2218 ----
          ImmediateInterruptOK = false;    /* not idle anymore */
          DisableNotifyInterrupt();
          DisableCatchupInterrupt();
!         if (cancel_from_timeout)
!             ereport(ERROR,
!                     (errcode(ERRCODE_QUERY_CANCELED),
!                      errmsg("canceling statement due to statement timeout")));
!         else
!             ereport(ERROR,
!                     (errcode(ERRCODE_QUERY_CANCELED),
!                      errmsg("canceling statement due to user request")));
      }
      /* If we get here, do nothing (probably, QueryCancelPending was reset) */
  }
Index: src/include/storage/proc.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/storage/proc.h,v
retrieving revision 1.81
diff -c -c -r1.81 proc.h
*** src/include/storage/proc.h    20 Aug 2005 23:26:34 -0000    1.81
--- src/include/storage/proc.h    19 Sep 2005 16:58:59 -0000
***************
*** 117,122 ****
--- 117,124 ----
  extern int    DeadlockTimeout;
  extern int    StatementTimeout;

+ extern volatile bool cancel_from_timeout;
+

  /*
   * Function Prototypes
Index: src/test/regress/expected/prepared_xacts.out
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/expected/prepared_xacts.out,v
retrieving revision 1.2
diff -c -c -r1.2 prepared_xacts.out
*** src/test/regress/expected/prepared_xacts.out    16 Sep 2005 19:39:02 -0000    1.2
--- src/test/regress/expected/prepared_xacts.out    19 Sep 2005 16:59:00 -0000
***************
*** 159,165 ****
  -- pxtest3 should be locked because of the pending DROP
  set statement_timeout to 1000;
  SELECT * FROM pxtest3;
! ERROR:  canceling query due to user request or statement timeout
  reset statement_timeout;
  -- Disconnect, we will continue testing in a different backend
  \c -
--- 159,165 ----
  -- pxtest3 should be locked because of the pending DROP
  set statement_timeout to 1000;
  SELECT * FROM pxtest3;
! ERROR:  canceling statement due to statement timeout
  reset statement_timeout;
  -- Disconnect, we will continue testing in a different backend
  \c -
***************
*** 174,180 ****
  -- pxtest3 should still be locked because of the pending DROP
  set statement_timeout to 1000;
  SELECT * FROM pxtest3;
! ERROR:  canceling query due to user request or statement timeout
  reset statement_timeout;
  -- Commit table creation
  COMMIT PREPARED 'regress-one';
--- 174,180 ----
  -- pxtest3 should still be locked because of the pending DROP
  set statement_timeout to 1000;
  SELECT * FROM pxtest3;
! ERROR:  canceling statement due to statement timeout
  reset statement_timeout;
  -- Commit table creation
  COMMIT PREPARED 'regress-one';

Re: statement_timeout logging

From
Simon Riggs
Date:
On Mon, 2005-09-19 at 13:21 -0400, Bruce Momjian wrote:

> OK.  I tested it and it actually works

Gosh, that says a lot about my code quality. I will strive to improve
from "actually works" to "works as expected" for future patches.

> and I added documentation
> suggesting its usage. 

> !           If <varname>log_min_error_statement</> is set to
> !         <literal>ERROR</> or lower, the statement that timed out will also be
> 

Might I suggest "Even if" rather than "if"?

Best Regards, Simon Riggs




Re: statement_timeout logging

From
Bruce Momjian
Date:
Simon Riggs wrote:
> On Mon, 2005-09-19 at 13:21 -0400, Bruce Momjian wrote:
> 
> > OK.  I tested it and it actually works
> 
> Gosh, that says a lot about my code quality. I will strive to improve
> from "actually works" to "works as expected" for future patches.
> 
> > and I added documentation
> > suggesting its usage. 
> 
> > !           If <varname>log_min_error_statement</> is set to
> > !         <literal>ERROR</> or lower, the statement that timed out will also be
> > 
> 
> Might I suggest "Even if" rather than "if"?

I don't see why "even if" makes sense here.  There is no expectation
that log_min_error_statement wouldn't work as documented.  I am more
suggesting that it is a nifity way to use the option.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073