Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw
Date
Msg-id CAFjFpRe6uf3-nU-orTEAJ0_CKb3MMiPQ5AHJ_SwDguV7Sjs6Ww@mail.gmail.com
In response to Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On Tue, Apr 25, 2017 at 1:31 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>>
>> The logs above show that 34 seconds elapsed between starting to abort
>> the transaction and knowing that the foreign server is unreachable. It
>> looks like it took that much time for the local server to realise that
>> the foreign server is not reachable. Looking at PQcancel code, it
>> seems to be trying to connect to the foreign server to cancel the
>> query. But somehow it doesn't seem to honor connect_timeout setting.
>> Is that expected?
>
> Yes and no. I think PQcancel requires a connection timeout, but I
> think it is not necessarily the same as that of the foreign
> server.

Since connect_timeout is a property of the foreign server, it should be
honored by any connection made to that server from the local server,
including the one made by PQcancel().

>
>> Irrespective of what PQcancel does, it looks like postgres_fdw should
>> just slam the connection if the query is being aborted because of
>> statement_timeout. But then pgfdw_xact_callback() doesn't seem to have
>> a way to know whether this ABORT is because of the user's request to
>> cancel the query, a statement timeout, an abort because of some other
>> error, or a user-requested abort. Except for a statement timeout (and
>> maybe the user's request to cancel the query?), it should try to keep
>> the connection around to avoid any future reconnection. But I am not
>> able to see how we can provide that information to pgfdw_xact_callback().
>
> Expiration of statement_timeout doesn't mean a stall of foreign
> connections. If we are to keep connections on, for example, a
> cancel request from the client, we should also keep them on
> statement_timeout, because it is not necessarily triggered by a
> stall of a foreign connection.

When statement_timeout expires, we don't want to spend more time
trying to cancel queries, especially when there are many foreign
servers, each consuming some "timeout" time, or even trying to send an
ABORT TRANSACTION statement. Instead, we should slam those connections
down. I consider this to be different from query cancellation, since
query cancellation doesn't have a hard bound on time, although we would
like to cancel the running query as fast as possible. On rethinking,
we should probably slam down the connection on query cancel as well.
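
To make the policy I am arguing for concrete, here is a minimal sketch.
It is only an illustration: AbortReason and should_drop_connection are
hypothetical names, and, as noted above, pgfdw_xact_callback() currently
has no way to learn the reason for the abort.

```python
from enum import Enum, auto


class AbortReason(Enum):
    """Hypothetical classification of why the local transaction aborts."""
    STATEMENT_TIMEOUT = auto()
    QUERY_CANCEL = auto()
    ERROR = auto()
    USER_ABORT = auto()


def should_drop_connection(reason: AbortReason) -> bool:
    """Return True when the foreign connection should be slammed down
    rather than cleaned up politely and kept for reuse."""
    # Timeouts and cancels should not spend further time on remote cleanup.
    return reason in {AbortReason.STATEMENT_TIMEOUT, AbortReason.QUERY_CANCEL}


for reason in AbortReason:
    action = "drop" if should_drop_connection(reason) else "keep"
    print(f"{reason.name}: {action} connection")
```
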
>
> I think we can detect a stall of the channel that the foreign
> connections are on by sending a cancel request with a very short
> timeout, although that is not strictly correct.
>
> I reconsidered this problem, and my proposal for this issue is as
> follows.
>
> - Foreign servers have a new option 'stall_detection_threshold'
>   in milliseconds, perhaps defaulting to the connect_timeout
>   setting of the foreign server. For many foreign servers in a local
>   network, it could be lowered to several tens of milliseconds.

A connect_timeout of less than 2 seconds is not recommended; see
https://www.postgresql.org/docs/devel/static/libpq-connect.html. So we
cannot set stall_detection_threshold to less than 2 seconds.
statement_timeout, however, is set in milliseconds, so 2 seconds per
connection would be quite a lot compared to the statement_timeout
setting. Waiting 2 seconds to cancel a query when statement_timeout
itself is 2 seconds would mean the query is cancelled only after 4
seconds, which is kind of funny.
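
The arithmetic can be sketched as below. This is a rough upper bound
under the assumption that the abort path waits for each foreign server
in turn; worst_case_abort_ms is a made-up helper name, not anything in
postgres_fdw.

```python
def worst_case_abort_ms(statement_timeout_ms, per_server_wait_ms, n_servers):
    """Upper bound on the time until the local query is really aborted.

    Assumes (for illustration only) that after statement_timeout fires,
    the abort path waits up to per_server_wait_ms for each foreign
    server, one after another.
    """
    return statement_timeout_ms + per_server_wait_ms * n_servers


# One foreign server, 2 s statement_timeout, 2 s wait for the cancel.
print(worst_case_abort_ms(2000, 2000, 1))  # 4000

# The same settings with five foreign servers.
print(worst_case_abort_ms(2000, 2000, 5))  # 12000
```

With more foreign servers, the gap between the configured
statement_timeout and the observed abort time only grows.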



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


