Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw
Date
Msg-id CAFjFpRdMD25Lia_J0P1eiKzNXAXtH9AGNx2vCZ2dBPNffNOaog@mail.gmail.com
Whole thread Raw
In response to [HACKERS] statement_timeout is not working as expected with postgres_fdw  (Suraj Kharage <suraj.kharage@enterprisedb.com>)
Responses Re: [HACKERS] statement_timeout is not working as expected withpostgres_fdw  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Apr 20, 2017 at 4:08 PM, Suraj Kharage
<suraj.kharage@enterprisedb.com> wrote:
> Hello All,
>
> The query on foreign table hangs due to network down of remote server for
> near about 16 minutes before exiting.
> statement_timeout is expected to work in this case as well but when i tried
> statement_timeout, it is not working as expected.
>
> Below is test case to reproduce the issue: [I am testing this on two
> different systems]
>
> 1: Set the postgres_fdw
>
> postgres=# CREATE SERVER myserver FOREIGN DATA WRAPPER postgres_fdw OPTIONS
> (host '172.16.173.237', dbname 'postgres', port '5432',keepalives '1',
> keepalives_interval '3',keepalives_idle '3', keepalives_count '1');
> CREATE SERVER
> postgres=# CREATE USER MAPPING FOR postgres SERVER myserver OPTIONS (user
> 'postgres', password 'edb');
> CREATE USER MAPPING
> postgres=# CREATE FOREIGN TABLE test(id int) SERVER myserver;
> CREATE FOREIGN TABLE
> postgres=# select * from test;
>  id
> ----
>   1
>  10
>   2
>   1
> (4 rows)
>
> postgres=# \d
>             List of relations
>  Schema | Name |     Type      |  Owner
> --------+------+---------------+----------
>  public | test | foreign table | postgres
> (1 row)
> postgres=#
> postgres=# select * from pg_foreign_server ;
>  srvname  | srvowner | srvfdw | srvtype | srvversion | srvacl |
> srvoptions
>
>
----------+----------+--------+---------+------------+--------+------------------------------------------------------------------------------------------------------------------
> -------
>  myserver |       10 |  16387 |         |            |        |
> {host=172.16.173.237,dbname=postgres,port=5432,keepalives=1,keepalives_interval=3,keepalives_idle=3,keepalives_co
> unt=1}
> (1 row)
>
> 3. Execute the insert command.
> E.g: insert into test values (generate_series(1,1000000));
> * You need to do network down of remote server during insert command.
>
> postgres=# set statement_timeout to 6000;
> SET
> postgres=# insert into test values (generate_series(1,1000000));
> WARNING:  server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
>
> CONTEXT:  Remote SQL command: ABORT TRANSACTION
> ERROR:  could not receive data from server: No route to host
>
> CONTEXT:  Remote SQL command: INSERT INTO public.test(id) VALUES ($1)
> Time: 931427.002 ms
>
> It was in hang state for approx 15-16 minute before exit.
>
> In pg_log, i see below error for above query:
> =========
> 2017-04-20 15:22:02 IST ERROR:  could not receive data from server: No route
> to host
> 2017-04-20 15:22:02 IST CONTEXT:  Remote SQL command: INSERT INTO
> public.test(id) VALUES ($1)
> 2017-04-20 15:22:02 IST STATEMENT:  insert into test values
> (generate_series(1,1000000));
> 2017-04-20 15:22:02 IST WARNING:  server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> 2017-04-20 15:22:02 IST CONTEXT:  Remote SQL command: ABORT TRANSACTION
> ==========
>
> I tested the patch submitted by Ashutosh Bapat on community
> [https://www.postgresql.org/message-id/CAFjFpRdcWw4h0a-zrL-EiaekkPj8O0GR2M1FwZ1useSRfRm3-g%40mail.gmail.com]
> to make the statement_timeout working and i can see statement_timeout is
> working but it is taking some extra time as statement_timeout.
>

Thanks for testing the patch.

> postgres=# set statement_timeout to 6000;
> SET
> postgres=#
> postgres=# \timing on
> Timing is on.
> postgres=# insert into test values (generate_series(1,1000000));
> -- [after executing query immediately disconnect the ethernet on remote
> server ]
> 2017-04-20 07:10:51.588 IST [10467] ERROR:  canceling statement due to
> statement timeout
> 2017-04-20 07:10:51.588 IST [10467] STATEMENT:  insert into test values
> (generate_series(1,1000000));
> 2017-04-20 07:11:25.590 IST [10467] WARNING:  discarding connection 0xfe4260
> because of failed cancel request: PQcancel() -- connect() failed: No route
> to host
> WARNING:  discarding connection 0xfe4260 because of failed cancel request:
> PQcancel() -- connect() failed: No route to host
>
> ERROR:  canceling statement due to statement timeout
> Time: 40001.765 ms (00:40.002)
> postgres=#
> postgres=#
>
> In above case, I got the error related to statement timeout after 6 seconds,
> but it it taking more time (approx 34 sec and it is varing each time if you
> see below) to terminate the connection and to exit from query. As per the
> tcp keepavlies settings for foreign server, it should take max 6 sec to
> terminate the connection.

The logs above show that 34 seconds elapsed between starting to abort
the transaction and knowing that the foreign server is unreachable. It
looks like it took that much time for the local server to realise that
the foreign server is not reachable. Looking at PQcancel code, it
seems to be trying to connect to the foreign server to cancel the
query. But somehow it doesn't seem to honor connect_timeout setting.
Is that expected?

Irrespective of what PQcancel does, it looks like postgres_fdw should
just slam the connection if query is being aborted because of
statement_timeout. But then pgfdw_xact_callback() doesn't seem to have
a way to know whether this ABORT is because of user's request to
cancel the query, statement timeout, an abort because of some other
error or a user requested abort. Except statement timeout (may be
user's request to cancel the query?), it should try to keep the
connection around to avoid any future reconnection. But I am not able
to see how can we provide that information to pgfdw_xact_callback().

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



pgsql-hackers by date:

Previous
From: Petr Jelinek
Date:
Subject: Re: [HACKERS] DROP SUBSCRIPTION, query cancellations and slothandling
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW