Re: statement_timeout affects query results fetching? - Mailing list pgsql-hackers

From Robert Haas
Subject Re: statement_timeout affects query results fetching?
Date
Msg-id CA+TgmoYaGwJWS27h+430WHcZy85=FUuxwvRmJsBk3kizAg+zFg@mail.gmail.com
Whole thread Raw
In response to statement_timeout affects query results fetching?  (Shay Rojansky <roji@roji.org>)
List pgsql-hackers
On Sat, Aug 8, 2015 at 5:31 AM, Shay Rojansky <roji@roji.org> wrote:
> Hi everyone, I'm seeing some strange behavior and wanted to confirm it.
>
> When executing a query that selects a long result set, if the code
> processing the results takes its time (i.e.g more than statement_timeout), a
> timeout occurs. My expectation was that statement_timeout only affects query
> *processing*, and does not cover the frontend actually processing the
> result.
>
> First, I wanted to confirm that this is the case (and not some sort of bug
> in my code).
>
> If this is the case, is this intended? Here are some points:
> * It makes statement_timeout very difficult to use; clients do very diverse
> things with database results, it may be very difficult (or impossible) to
> estimate how much time they should take (e.g. random load factors on the
> client machine or on some other machine receiving results).
> * It makes it impossible to specifically detect problematic *queries* which
> take too long to execute (as opposed to the time taken to process their
> results).
>
> If you do insist that this behavior is correct, a documentation update for
> statement_timeout might make this clearer.

I think the issue here is that we start returning rows to the client
while the query is still executing.

Suppose each row of output takes 100 ms of CPU time to generate and
there are 1,000,000 rows.  Then it's quite conceivable that we could
be under the statement_timeout when we start returning rows to the
client, but over the statement_timeout by the time we finish -- and
the user would probably want statement_timeout to kick in in that
case, because that's a lotta CPU time.

I suppose we could try to toll statement_timeout while we're blocked
waiting for the client, but nobody wrote the code for that yet.  And
it would mean that you can't use statement_timeout to prevent xmin
from lagging, which could be why you set it in the first place.  There
might also be some usability difficulties: pg_stat_activity shows the
time the query started, so if you know what statement_timeout is you
can tell how close it is to being killed.  If some of the time isn't
counted, then you can't tell any more.

Another approach (which I think might be better) is to have GUCs like
statement_cpu_limit and statement_io_limit that kill a query when it
uses more than the configured amount of that resource.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: tap tests remove working directories
Next
From: Robert Haas
Date:
Subject: Re: [DOCS] max_worker_processes on the standby