Thread: [HACKERS] statement_timeout is not working as expected with postgres_fdw

[HACKERS] statement_timeout is not working as expected with postgres_fdw

From
Suraj Kharage
Date:
Hello All,

A query on a foreign table hangs for roughly 16 minutes before exiting when the network to the remote server goes down.
statement_timeout is expected to work in this case as well, but when I tried it, it did not work as expected.

Below is a test case to reproduce the issue: [I am testing this on two different systems]

1. Set up postgres_fdw:

postgres=# CREATE SERVER myserver FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '172.16.173.237', dbname 'postgres', port '5432',keepalives '1', keepalives_interval '3',keepalives_idle '3', keepalives_count '1');
CREATE SERVER
postgres=# CREATE USER MAPPING FOR postgres SERVER myserver OPTIONS (user 'postgres', password 'edb');
CREATE USER MAPPING
postgres=# CREATE FOREIGN TABLE test(id int) SERVER myserver;
CREATE FOREIGN TABLE
postgres=# select * from test;
 id 
----
  1
 10
  2
  1
(4 rows)

postgres=# \d
            List of relations
 Schema | Name |     Type      |  Owner   
--------+------+---------------+----------
 public | test | foreign table | postgres
(1 row)
postgres=# 
postgres=# select * from pg_foreign_server ;
 srvname  | srvowner | srvfdw | srvtype | srvversion | srvacl |                                                      srvoptions
----------+----------+--------+---------+------------+--------+---------------------------------------------------------------------------------------------------------------------------
 myserver |       10 |  16387 |         |            |        | {host=172.16.173.237,dbname=postgres,port=5432,keepalives=1,keepalives_interval=3,keepalives_idle=3,keepalives_count=1}
(1 row)

3. Execute the insert command.
E.g.: insert into test values (generate_series(1,1000000));
* You need to bring the network to the remote server down while the insert command is running.

postgres=# set statement_timeout to 6000;
SET
postgres=# insert into test values (generate_series(1,1000000));
WARNING:  server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

CONTEXT:  Remote SQL command: ABORT TRANSACTION
ERROR:  could not receive data from server: No route to host

CONTEXT:  Remote SQL command: INSERT INTO public.test(id) VALUES ($1)
Time: 931427.002 ms

It hung for approximately 15-16 minutes before exiting.

In pg_log, I see the below error for the above query:
=========
2017-04-20 15:22:02 IST ERROR:  could not receive data from server: No route to host
2017-04-20 15:22:02 IST CONTEXT:  Remote SQL command: INSERT INTO public.test(id) VALUES ($1)
2017-04-20 15:22:02 IST STATEMENT:  insert into test values (generate_series(1,1000000));
2017-04-20 15:22:02 IST WARNING:  server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2017-04-20 15:22:02 IST CONTEXT:  Remote SQL command: ABORT TRANSACTION
==========

I tested the patch submitted by Ashutosh Bapat on the community list [https://www.postgresql.org/message-id/CAFjFpRdcWw4h0a-zrL-EiaekkPj8O0GR2M1FwZ1useSRfRm3-g%40mail.gmail.com] to make statement_timeout work. I can see that statement_timeout now fires, but the query still takes noticeably longer than the statement_timeout before it actually returns.

postgres=# set statement_timeout to 6000;
SET
postgres=# 
postgres=# \timing on
Timing is on.
postgres=# insert into test values (generate_series(1,1000000));           -- [after executing query immediately disconnect the ethernet on remote server ]
2017-04-20 07:10:51.588 IST [10467] ERROR:  canceling statement due to statement timeout
2017-04-20 07:10:51.588 IST [10467] STATEMENT:  insert into test values (generate_series(1,1000000));
2017-04-20 07:11:25.590 IST [10467] WARNING:  discarding connection 0xfe4260 because of failed cancel request: PQcancel() -- connect() failed: No route to host
WARNING:  discarding connection 0xfe4260 because of failed cancel request: PQcancel() -- connect() failed: No route to host

ERROR:  canceling statement due to statement timeout
Time: 40001.765 ms (00:40.002)
postgres=# 
postgres=# 

In the above case, I got the statement timeout error after 6 seconds,
but it takes more time (approx 34 sec, and it varies each time, as you can see below) to terminate the connection and exit from the query. As per the TCP keepalive settings for the foreign server, it should take at most 6 sec to terminate the connection.


postgres=# 
postgres=# set statement_timeout to 20000;
SET
Time: 0.254 ms
postgres=# 
postgres=# insert into test values (generate_series(1,1000000));
2017-04-20 07:13:25.666 IST [10467] ERROR:  canceling statement due to statement timeout
2017-04-20 07:13:25.666 IST [10467] STATEMENT:  insert into test values (generate_series(1,1000000));
2017-04-20 07:13:43.668 IST [10467] WARNING:  discarding connection 0xfe4260 because of failed cancel request: PQcancel() -- connect() failed: No route to host
WARNING:  discarding connection 0xfe4260 because of failed cancel request: PQcancel() -- connect() failed: No route to host

ERROR:  canceling statement due to statement timeout
Time: 38004.169 ms (00:38.004)
postgres=# 


When I tested this using a local TCP connection, the query exited immediately after the statement timeout.


Suraj Kharage
EnterpriseDB Corporation 
The Postgres Database Company 

Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

From
Ashutosh Bapat
Date:
On Thu, Apr 20, 2017 at 4:08 PM, Suraj Kharage
<suraj.kharage@enterprisedb.com> wrote:
> Hello All,
>
> The query on foreign table hangs due to network down of remote server for
> near about 16 minutes before exiting.
> statement_timeout is expected to work in this case as well but when i tried
> statement_timeout, it is not working as expected.
>
> Below is test case to reproduce the issue: [I am testing this on two
> different systems]
>
> 1: Set the postgres_fdw
>
> postgres=# CREATE SERVER myserver FOREIGN DATA WRAPPER postgres_fdw OPTIONS
> (host '172.16.173.237', dbname 'postgres', port '5432',keepalives '1',
> keepalives_interval '3',keepalives_idle '3', keepalives_count '1');
> CREATE SERVER
> postgres=# CREATE USER MAPPING FOR postgres SERVER myserver OPTIONS (user
> 'postgres', password 'edb');
> CREATE USER MAPPING
> postgres=# CREATE FOREIGN TABLE test(id int) SERVER myserver;
> CREATE FOREIGN TABLE
> postgres=# select * from test;
>  id
> ----
>   1
>  10
>   2
>   1
> (4 rows)
>
> postgres=# \d
>             List of relations
>  Schema | Name |     Type      |  Owner
> --------+------+---------------+----------
>  public | test | foreign table | postgres
> (1 row)
> postgres=#
> postgres=# select * from pg_foreign_server ;
>  srvname  | srvowner | srvfdw | srvtype | srvversion | srvacl |
> srvoptions
>
>
----------+----------+--------+---------+------------+--------+------------------------------------------------------------------------------------------------------------------
> -------
>  myserver |       10 |  16387 |         |            |        |
> {host=172.16.173.237,dbname=postgres,port=5432,keepalives=1,keepalives_interval=3,keepalives_idle=3,keepalives_co
> unt=1}
> (1 row)
>
> 3. Execute the insert command.
> E.g: insert into test values (generate_series(1,1000000));
> * You need to do network down of remote server during insert command.
>
> postgres=# set statement_timeout to 6000;
> SET
> postgres=# insert into test values (generate_series(1,1000000));
> WARNING:  server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
>
> CONTEXT:  Remote SQL command: ABORT TRANSACTION
> ERROR:  could not receive data from server: No route to host
>
> CONTEXT:  Remote SQL command: INSERT INTO public.test(id) VALUES ($1)
> Time: 931427.002 ms
>
> It was in hang state for approx 15-16 minute before exit.
>
> In pg_log, i see below error for above query:
> =========
> 2017-04-20 15:22:02 IST ERROR:  could not receive data from server: No route
> to host
> 2017-04-20 15:22:02 IST CONTEXT:  Remote SQL command: INSERT INTO
> public.test(id) VALUES ($1)
> 2017-04-20 15:22:02 IST STATEMENT:  insert into test values
> (generate_series(1,1000000));
> 2017-04-20 15:22:02 IST WARNING:  server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> 2017-04-20 15:22:02 IST CONTEXT:  Remote SQL command: ABORT TRANSACTION
> ==========
>
> I tested the patch submitted by Ashutosh Bapat on community
> [https://www.postgresql.org/message-id/CAFjFpRdcWw4h0a-zrL-EiaekkPj8O0GR2M1FwZ1useSRfRm3-g%40mail.gmail.com]
> to make the statement_timeout working and i can see statement_timeout is
> working but it is taking some extra time as statement_timeout.
>

Thanks for testing the patch.

> postgres=# set statement_timeout to 6000;
> SET
> postgres=#
> postgres=# \timing on
> Timing is on.
> postgres=# insert into test values (generate_series(1,1000000));
> -- [after executing query immediately disconnect the ethernet on remote
> server ]
> 2017-04-20 07:10:51.588 IST [10467] ERROR:  canceling statement due to
> statement timeout
> 2017-04-20 07:10:51.588 IST [10467] STATEMENT:  insert into test values
> (generate_series(1,1000000));
> 2017-04-20 07:11:25.590 IST [10467] WARNING:  discarding connection 0xfe4260
> because of failed cancel request: PQcancel() -- connect() failed: No route
> to host
> WARNING:  discarding connection 0xfe4260 because of failed cancel request:
> PQcancel() -- connect() failed: No route to host
>
> ERROR:  canceling statement due to statement timeout
> Time: 40001.765 ms (00:40.002)
> postgres=#
> postgres=#
>
> In above case, I got the error related to statement timeout after 6 seconds,
> but it it taking more time (approx 34 sec and it is varing each time if you
> see below) to terminate the connection and to exit from query. As per the
> tcp keepavlies settings for foreign server, it should take max 6 sec to
> terminate the connection.

The logs above show that 34 seconds elapsed between starting to abort
the transaction and knowing that the foreign server is unreachable. It
looks like it took that much time for the local server to realise that
the foreign server is not reachable. Looking at PQcancel code, it
seems to be trying to connect to the foreign server to cancel the
query. But somehow it doesn't seem to honor connect_timeout setting.
Is that expected?

Irrespective of what PQcancel does, it looks like postgres_fdw should
just slam the connection if query is being aborted because of
statement_timeout. But then pgfdw_xact_callback() doesn't seem to have
a way to know whether this ABORT is because of user's request to
cancel the query, statement timeout, an abort because of some other
error or a user requested abort. Except statement timeout (may be
user's request to cancel the query?), it should try to keep the
connection around to avoid any future reconnection. But I am not able
to see how can we provide that information to pgfdw_xact_callback().

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] statement_timeout is not working as expected withpostgres_fdw

From
Kyotaro HORIGUCHI
Date:
Hello,

At Thu, 20 Apr 2017 19:57:30 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRdMD25Lia_J0P1eiKzNXAXtH9AGNx2vCZ2dBPNffNOaog@mail.gmail.com>
> On Thu, Apr 20, 2017 at 4:08 PM, Suraj Kharage
> <suraj.kharage@enterprisedb.com> wrote:
> > Hello All,
> >
> > The query on foreign table hangs due to network down of remote server for
> > near about 16 minutes before exiting.
> > statement_timeout is expected to work in this case as well but when i tried
> > statement_timeout, it is not working as expected.
> >
> > Below is test case to reproduce the issue: [I am testing this on two
> > different systems]
> >
> > 1: Set the postgres_fdw
> >
> > postgres=# CREATE SERVER myserver FOREIGN DATA WRAPPER postgres_fdw OPTIONS
> > (host '172.16.173.237', dbname 'postgres', port '5432',keepalives '1',
> > keepalives_interval '3',keepalives_idle '3', keepalives_count '1');
> > CREATE SERVER
> > postgres=# CREATE USER MAPPING FOR postgres SERVER myserver OPTIONS (user
> > 'postgres', password 'edb');
> > CREATE USER MAPPING
> > postgres=# CREATE FOREIGN TABLE test(id int) SERVER myserver;
> > CREATE FOREIGN TABLE
> > postgres=# select * from test;
> >  id
> > ----
> >   1
> >  10
> >   2
> >   1
> > (4 rows)
> >
> > postgres=# \d
> >             List of relations
> >  Schema | Name |     Type      |  Owner
> > --------+------+---------------+----------
> >  public | test | foreign table | postgres
> > (1 row)
> > postgres=#
> > postgres=# select * from pg_foreign_server ;
> >  srvname  | srvowner | srvfdw | srvtype | srvversion | srvacl |
> > srvoptions
> >
> >
----------+----------+--------+---------+------------+--------+------------------------------------------------------------------------------------------------------------------
> > -------
> >  myserver |       10 |  16387 |         |            |        |
> > {host=172.16.173.237,dbname=postgres,port=5432,keepalives=1,keepalives_interval=3,keepalives_idle=3,keepalives_co
> > unt=1}
> > (1 row)
> >
> > 3. Execute the insert command.
> > E.g: insert into test values (generate_series(1,1000000));
> > * You need to do network down of remote server during insert command.
> >
> > postgres=# set statement_timeout to 6000;
> > SET
> > postgres=# insert into test values (generate_series(1,1000000));
> > WARNING:  server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> >
> > CONTEXT:  Remote SQL command: ABORT TRANSACTION
> > ERROR:  could not receive data from server: No route to host
> >
> > CONTEXT:  Remote SQL command: INSERT INTO public.test(id) VALUES ($1)
> > Time: 931427.002 ms
> >
> > It was in hang state for approx 15-16 minute before exit.
> >
> > In pg_log, i see below error for above query:
> > =========
> > 2017-04-20 15:22:02 IST ERROR:  could not receive data from server: No route
> > to host
> > 2017-04-20 15:22:02 IST CONTEXT:  Remote SQL command: INSERT INTO
> > public.test(id) VALUES ($1)
> > 2017-04-20 15:22:02 IST STATEMENT:  insert into test values
> > (generate_series(1,1000000));
> > 2017-04-20 15:22:02 IST WARNING:  server closed the connection unexpectedly
> > This probably means the server terminated abnormally
> > before or while processing the request.
> > 2017-04-20 15:22:02 IST CONTEXT:  Remote SQL command: ABORT TRANSACTION
> > ==========
> >
> > I tested the patch submitted by Ashutosh Bapat on community
> > [https://www.postgresql.org/message-id/CAFjFpRdcWw4h0a-zrL-EiaekkPj8O0GR2M1FwZ1useSRfRm3-g%40mail.gmail.com]
> > to make the statement_timeout working and i can see statement_timeout is
> > working but it is taking some extra time as statement_timeout.
> >
> 
> Thanks for testing the patch.
> 
> > postgres=# set statement_timeout to 6000;
> > SET
> > postgres=#
> > postgres=# \timing on
> > Timing is on.
> > postgres=# insert into test values (generate_series(1,1000000));
> > -- [after executing query immediately disconnect the ethernet on remote
> > server ]
> > 2017-04-20 07:10:51.588 IST [10467] ERROR:  canceling statement due to
> > statement timeout
> > 2017-04-20 07:10:51.588 IST [10467] STATEMENT:  insert into test values
> > (generate_series(1,1000000));
> > 2017-04-20 07:11:25.590 IST [10467] WARNING:  discarding connection 0xfe4260
> > because of failed cancel request: PQcancel() -- connect() failed: No route
> > to host
> > WARNING:  discarding connection 0xfe4260 because of failed cancel request:
> > PQcancel() -- connect() failed: No route to host
> >
> > ERROR:  canceling statement due to statement timeout
> > Time: 40001.765 ms (00:40.002)
> > postgres=#
> > postgres=#
> >
> > In above case, I got the error related to statement timeout after 6 seconds,
> > but it it taking more time (approx 34 sec and it is varing each time if you
> > see below) to terminate the connection and to exit from query. As per the
> > tcp keepavlies settings for foreign server, it should take max 6 sec to
> > terminate the connection.
> 
> The logs above show that 34 seconds elapsed between starting to abort
> the transaction and knowing that the foreign server is unreachable. It
> looks like it took that much time for the local server to realise that
> the foreign server is not reachable. Looking at PQcancel code, it
> seems to be trying to connect to the foreign server to cancel the
> query. But somehow it doesn't seem to honor connect_timeout setting.
> Is that expected?

Yes, and no. I think PQcancel requires a connection timeout, but I
think it is not necessarily the same as that of the foreign
server.

> Irrespective of what PQcancel does, it looks like postgres_fdw should
> just slam the connection if query is being aborted because of
> statement_timeout. But then pgfdw_xact_callback() doesn't seem to have
> a way to know whether this ABORT is because of user's request to
> cancel the query, statement timeout, an abort because of some other
> error or a user requested abort. Except statement timeout (may be
> user's request to cancel the query?), it should try to keep the
> connection around to avoid any future reconnection. But I am not able
> to see how can we provide that information to pgfdw_xact_callback().

Expiration of statement_timeout doesn't mean a stall of the foreign
connections. If we are to keep connections on, for example, a
cancel request from the client, we also should keep them on
statement_timeout, because it is not necessarily triggered by a
stall of the foreign connection.

I think we can detect a stall of the channel that the foreign
connections are on by issuing a cancel request with a very short timeout,
although it is a bit incorrect.

I have reconsidered this problem, and my proposal for this issue is as
follows.

- Foreign servers have a new option 'stall_detection_threshold'
  in milliseconds, maybe defaulting to the connect_timeout of the
  foreign server setting (illustrated below). For many foreign
  servers in a local network, it could be lowered to several tens
  of milliseconds.

- Invoke PQcancel with the stall_detection_threshold. (It
  doesn't accept such a parameter so far.)
  - If PQcancel returns with a timeout, slam the connection down.
  - Else we continue to send ABORT TRANSACTION to the connection.
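
As a purely hypothetical illustration (no such option exists today), the
threshold could be attached per foreign server much like the existing
keepalive options:

ALTER SERVER myserver
    OPTIONS (ADD stall_detection_threshold '100');  -- milliseconds; hypothetical option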


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

From
Ashutosh Bapat
Date:
On Tue, Apr 25, 2017 at 1:31 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>>
>> The logs above show that 34 seconds elapsed between starting to abort
>> the transaction and knowing that the foreign server is unreachable. It
>> looks like it took that much time for the local server to realise that
>> the foreign server is not reachable. Looking at PQcancel code, it
>> seems to be trying to connect to the foreign server to cancel the
>> query. But somehow it doesn't seem to honor connect_timeout setting.
>> Is that expected?
>
> Yes, and No. I think PQcancel requires connection timeout, but I
> think it is not necessariry the same with that of a foreign
> server.

Since connect_timeout is a property of the foreign server, it should be
honored by any connection made to that server from the local server,
including the one made by PQcancel().

>
>> Irrespective of what PQcancel does, it looks like postgres_fdw should
>> just slam the connection if query is being aborted because of
>> statement_timeout. But then pgfdw_xact_callback() doesn't seem to have
>> a way to know whether this ABORT is because of user's request to
>> cancel the query, statement timeout, an abort because of some other
>> error or a user requested abort. Except statement timeout (may be
>> user's request to cancel the query?), it should try to keep the
>> connection around to avoid any future reconnection. But I am not able
>> to see how can we provide that information to pgfdw_xact_callback().
>
> Expiration of statement_timeout doesn't mean a stall of foreign
> connections. If we are to keep connections by, for example, a
> cancel request from client, we also should keep them on
> statememt_timeout because it is not necessariry triggered by a
> stall of foreign connection.

When statement_timeout expires, we don't want to spend more time
trying to cancel queries, especially when there are many foreign servers, each
consuming some "timeout" time, or even trying to send an ABORT TRANSACTION
statement. Instead, we should slam those connections down. I consider this to be
different from query cancellation, since query cancellation doesn't
have a hard bound on time, although we would like to cancel the
running query as fast as possible. Rethinking about it, probably we
should slam down the connection in the case of query cancel as well.
>
> I think we can detect a stall of the channel where the foreign
> connections are on by a cancel request with a very short timeout,
> although it is a bit incorrect.
>
> I reconsider this problem and my proposal for this issue is as
> the follows.
>
> - Foreign servers have a new options 'stall_detection_threshold'
>   in milliseconds, maybe defaults to connect_timeout of the
>   foreign server setting. For many foreign servers in a local
>   network, it could be lowered to several tens of milliseconds.

A connect_timeout of less than 2 seconds is not encouraged
(https://www.postgresql.org/docs/devel/static/libpq-connect.html), so
we cannot set stall_detection_threshold to be smaller than 2 seconds.
statement_timeout, however, is set in milliseconds, so 2 seconds per
connection would be quite a lot compared to the statement_timeout setting.
Waiting 2 seconds to cancel the query when the statement_timeout
itself is 2 seconds would mean the query would be cancelled after 4
seconds, which is kind of funny.



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] statement_timeout is not working as expected withpostgres_fdw

From
Kyotaro HORIGUCHI
Date:
At Tue, 25 Apr 2017 15:43:59 +0530, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote in
<CAFjFpRe6uf3-nU-orTEAJ0_CKb3MMiPQ5AHJ_SwDguV7Sjs6Ww@mail.gmail.com>
> On Tue, Apr 25, 2017 at 1:31 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> >>
> >> The logs above show that 34 seconds elapsed between starting to abort
> >> the transaction and knowing that the foreign server is unreachable. It
> >> looks like it took that much time for the local server to realise that
> >> the foreign server is not reachable. Looking at PQcancel code, it
> >> seems to be trying to connect to the foreign server to cancel the
> >> query. But somehow it doesn't seem to honor connect_timeout setting.
> >> Is that expected?
> >
> > Yes, and No. I think PQcancel requires connection timeout, but I
> > think it is not necessariry the same with that of a foreign
> > server.
> 
> Since connect_timeout is property of foreign server, it should be
> honored by any connection made to that server from local server
> including the one by PQcancel().

The two connections have different characteristics. A query
(client) connection should eventually be established. On the other hand,
we may give up on establishing a cancel connection in certain cases.
For example, when a network stall is anticipated, we can rather
choose to trash the session on it than wait for the timeout.

> >> Irrespective of what PQcancel does, it looks like postgres_fdw should
> >> just slam the connection if query is being aborted because of
> >> statement_timeout. But then pgfdw_xact_callback() doesn't seem to have
> >> a way to know whether this ABORT is because of user's request to
> >> cancel the query, statement timeout, an abort because of some other
> >> error or a user requested abort. Except statement timeout (may be
> >> user's request to cancel the query?), it should try to keep the
> >> connection around to avoid any future reconnection. But I am not able
> >> to see how can we provide that information to pgfdw_xact_callback().
> >
> > Expiration of statement_timeout doesn't mean a stall of foreign
> > connections. If we are to keep connections by, for example, a
> > cancel request from client, we also should keep them on
> > statememt_timeout because it is not necessariry triggered by a
> > stall of foreign connection.
> 
> When statement_timeout completes, we don't want to spend more time in
> trying to cancel queries: esp when there are many foreign server, each
> consuming some "timeout" time OR even trying to send Abort transaction
> statement. Instead, we should slam those down. I consider this to be
> different from query cancellation since query cancellation doesn't
> have a hard bound on time, although we would like to cancel the
> running query as fast as possible. Rethinking about it, probably we
> should slam down the connection in case of query cancel as well.

Yeah, both are rather unusual; in other words, we don't expect them
to happen twice or more in a short interval. With that
prerequisite, I think slamming them all down is optimal. But
definitely not for the commanded-rollback case.

> > I think we can detect a stall of the channel where the foreign
> > connections are on by a cancel request with a very short timeout,
> > although it is a bit incorrect.
> >
> > I reconsider this problem and my proposal for this issue is as
> > the follows.
> >
> > - Foreign servers have a new options 'stall_detection_threshold'
> >   in milliseconds, maybe defaults to connect_timeout of the
> >   foreign server setting. For many foreign servers in a local
> >   network, it could be lowered to several tens of milliseconds.
> 
> A connect_timeout less than 2 seconds is not encouraged.
> https://www.postgresql.org/docs/devel/static/libpq-connect.html. So,
> we can not set stall_detection_threshold to be smaller than 2 seconds.

That limit seems to come from the DNS lookup timeout, so users can set a
shorter timeout. Anyway, the several-tens-or-hundreds-of-milliseconds figure is
an extreme example. But 2 seconds for every connection would be
too long. Anyway, the resolution is 1 second.

> statement_timeout however is set in milliseconds, so 2 seconds per
> connection would be quite a lot compared to statement_timeout setting.
> Waiting to cancel query for 2 seconds when the statement_timeout
> itself is 2 seconds would mean the query would be cancelled after 4
> seconds, which is kind of funny.

Yes, we can keep connections in those cases, but in turn it can
compel a user to wait an incredibly long time for a cancel
request. We could reduce the time to wait with some complex stuff,
but I don't think we can cancel queries on hundreds of foreign
servers in a second without asynchronicity or parallelism. It
seems to me unreasonably complex.

In short, I'd like to just discard all connections for both
kinds of triggers.

I suppose that we can pass the cause to the transaction
callback as an event, but I haven't figured out how
AbortCurrentTransaction can detect the cause. Letting
ProcessInterrupts set an additional global variable, such as
requested_immediate_abort, could work.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




On Thu, Apr 20, 2017 at 10:27 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> The logs above show that 34 seconds elapsed between starting to abort
> the transaction and knowing that the foreign server is unreachable. It
> looks like it took that much time for the local server to realise that
> the foreign server is not reachable. Looking at PQcancel code, it
> seems to be trying to connect to the foreign server to cancel the
> query. But somehow it doesn't seem to honor connect_timeout setting.
> Is that expected?

Well, there's no code to do anything else.  Regular connections go
through connectDBComplete(), which uses non-blocking mode and timed
waits to respect connect_timeout.  PQcancel() has no such handling.
internal_cancel just opens a socket and, without setting any options
on it, calls connect().  No loop, no timed waits, nada.  So it's going
to take as long as it takes for the operating system to notice.
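
For illustration, here is a minimal, hedged sketch (not libpq code; the
function name and the timeout_secs parameter are invented) of the kind of
non-blocking, timed connect that regular connections get and that a cancel
request currently does not:

#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

static int
connect_with_timeout(int sock, const struct sockaddr *addr,
                     socklen_t addrlen, int timeout_secs)
{
    int         flags = fcntl(sock, F_GETFL, 0);
    fd_set      wfds;
    struct timeval tv;
    int         err = 0;
    socklen_t   errlen = sizeof(err);

    /* switch to non-blocking mode before connecting */
    fcntl(sock, F_SETFL, flags | O_NONBLOCK);

    if (connect(sock, addr, addrlen) == 0)
        return 0;                   /* connected immediately */
    if (errno != EINPROGRESS)
        return -1;                  /* hard failure, e.g. no route to host */

    /* wait for the connect to finish, but only up to the timeout */
    FD_ZERO(&wfds);
    FD_SET(sock, &wfds);
    tv.tv_sec = timeout_secs;
    tv.tv_usec = 0;
    if (select(sock + 1, NULL, &wfds, NULL, &tv) <= 0)
        return -1;                  /* timed out, or select() failed */

    /* check whether the deferred connect actually succeeded */
    if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0 || err != 0)
        return -1;
    return 0;
}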

> Irrespective of what PQcancel does, it looks like postgres_fdw should
> just slam the connection if query is being aborted because of
> statement_timeout. But then pgfdw_xact_callback() doesn't seem to have
> a way to know whether this ABORT is because of user's request to
> cancel the query, statement timeout, an abort because of some other
> error or a user requested abort. Except statement timeout (may be
> user's request to cancel the query?), it should try to keep the
> connection around to avoid any future reconnection. But I am not able
> to see how can we provide that information to pgfdw_xact_callback().

I don't think we can.  In general, PostgreSQL has no facility for
telling error cleanup handlers why the abort happened, and it wouldn't
really be the right thing anyway.  The fact that statement_timeout
fired doesn't necessarily mean that the connection is dead; it could
equally well mean that the query ran for a long time.  I think we
basically have two choices.  One is to bound the amount of time we
spend performing error cleanup, and the other is just to always drop
the connection.  A problem with the latter is that it might do the
wrong thing if we're aborting a subtransaction but not the whole
transaction.  In that case, we need to undo changes since the relevant
savepoint, but not the ones made before that; closing the connection
amounts to a full rollback.

Therefore, I think the patch you proposed in
https://www.postgresql.org/message-id/CAFjFpRdcWw4h0a-zrL-EiaekkPj8O0GR2M1FwZ1useSRfRm3-g%40mail.gmail.com
isn't correct, because if the cancel fails it will slam the connection
shut regardless of whether we're in pgfdw_xact_callback or
pgfdw_subxact_callback.

It looks to me like there's a fairly lengthy list of conceptually
separate but practically related problems here:

1. PQcancel() ignores the keepalive parameters, because it makes no
attempt to set the relevant socket options before connecting (see the
sketch after this list).

2. PQcancel() ignores connect_timeout, because it doesn't use
non-blocking mode and has no wait loop.

3. There is no asynchronous equivalent of PQcancel(), so we can't use
a loop to allow responding to interrupts while the cancel is
outstanding (see pgfdw_get_result for an example).

4. pgfdw_xact_callback() and pgfdw_subxact_callback() use PQexec()
rather than the asynchronous interfaces for sending queries and
checking for results, so the SQL commands they send are not
interruptible.

5. pgfdw_xact_callback() and pgfdw_subxact_callback() try to get the
connection back to a good state by using ABORT TRANSACTION for the
former and ROLLBACK TO SAVEPOINT and RELEASE SAVEPOINT for the latter.
But if it doesn't work, then they just emit a WARNING and continue on
as if they'd succeeded.  That seems highly likely to make the next use
of that connection fail in unpredictable ways.

6. postgres_fdw doesn't use PQsetnonblocking() anywhere, so even if we
converted pgfdw_xact_callback() and pgfdw_subxact_callback() to use
pgfdw_exec_query() or something like it rather than PQexec() to submit
queries, they might still block if we fail to send the query, as the
comments for pgfdw_exec_query() explain.  This is not impossible but not
particularly likely to happen for queries being sent out of error
handling paths, because odds are good that the connection was used
just before.  However, if the user is pressing ^C because the remote
server isn't responding, it's quite probable that we'll run into this
exact issue.

7. postgres_fdw never considers slamming the connection shut as a
response to trouble.  It seems pretty clear that this is only a safe
response if we're aborting the toplevel transaction.  If we did it for
a subtransaction, we'd end up reconnecting if the server were accessed
again, which would at the very least change the snapshot (note that we
use at least REPEATABLE READ on the remote side regardless of the
local isolation level) and might discard changes made on the remote
side at outer transaction levels.  Even for a top-level transaction,
it might not always be the right way to proceed, as it incurs some
overhead to reconnect, but it seems worth considering at least in the
case where we've already got some indication that the connection is
unhealthy (e.g. a previous attempt to clean up the connection state
failed, as in #5, or PQcancel failed, as in Ashutosh's proposed
patch).

8. Before 9.6, PQexec() is used in many more places, so many more
queries are entirely non-interruptible.
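
Regarding item 1, the socket options in question are roughly the ones
below; this is an illustrative sketch of what the keepalives_* parameters
map to on Linux, not the libpq implementation:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: apply keepalives_idle/interval/count to a raw socket.
 * PQcancel() currently sets none of these, so OS defaults apply instead. */
static void
apply_keepalive_settings(int sock, int idle, int interval, int count)
{
    int on = 1;

    setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
    setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
    setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval));
    setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count));
}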

It seems pretty clear to me that we are not going to fix all of these
issues in one patch.  Here's a sketch of an idea for how to start
making things better:

- Add an in_abort_cleanup flag to ConnCacheEntry.  (A simplified code
sketch of this flag handling appears after this list.)

- Before pgfdw_xact_callback() or pgfdw_subxact_callback() begin abort
cleanup, check whether the flag is set.  If so, slam the connection
shut unless that's already been done; furthermore, if the flag is set
and we're in pgfdw_xact_callback (i.e. this is a toplevel abort),
forget about the connection entry entirely.  On the other hand, if the
flag is not set, set the flag and attempt abort cleanup.  If we
succeed, clear the flag.

- Before pgfdw_xact_callback() or pgfdw_subxact_callback() begin
pre-commit processing, check whether the flag is set.  If so, throw an
ERROR, so that we switch over to abort processing.

- Change uses of PQexec() in the abort path to use pgfdw_exec_query()
instead.  If we exit pgfdw_exec_query() with an error, we'll re-enter
the abort path, but now in_abort_cleanup will be set, so we'll just
drop the connection (and force any outer transaction levels to abort
as well).

- For bonus points, give pgfdw_exec_query() an optional timeout
argument, and set it to 30 seconds or so when we're doing abort
cleanup.  If the timeout succeeds, it errors out (which again
re-enters the abort path, dropping the connection and forcing outer
transaction levels to abort as well).
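
To make that concrete, here is a much-simplified, hypothetical C sketch of
the flag handling described above.  It is not the actual postgres_fdw code:
ConnCacheEntry is reduced to the two fields the sketch needs, and the
cleanup helper omits the proposed timeout.

#include <stdbool.h>
#include <libpq-fe.h>

typedef struct ConnCacheEntry
{
    PGconn *conn;               /* connection to the foreign server */
    bool    in_abort_cleanup;   /* abort cleanup started but not finished */
} ConnCacheEntry;

/* Simplified stand-in for a pgfdw_exec_query() variant used during abort;
 * the real thing would also enforce a timeout. */
static bool
exec_cleanup_query(PGconn *conn, const char *sql)
{
    PGresult   *res = PQexec(conn, sql);
    bool        ok = (PQresultStatus(res) == PGRES_COMMAND_OK);

    PQclear(res);
    return ok;
}

/* Toplevel abort path for one cached connection. */
static void
abort_cleanup(ConnCacheEntry *entry)
{
    if (entry->in_abort_cleanup)
    {
        /* a previous cleanup attempt never finished: slam the connection shut */
        PQfinish(entry->conn);
        entry->conn = NULL;
        return;
    }

    entry->in_abort_cleanup = true;
    if (exec_cleanup_query(entry->conn, "ABORT TRANSACTION"))
        entry->in_abort_cleanup = false;    /* cleanup succeeded; keep the connection */
}

/* Pre-commit check: a lingering flag means we must error out and abort. */
static bool
ready_to_commit(ConnCacheEntry *entry)
{
    return !entry->in_abort_cleanup;
}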

Assuming the above works out, I propose to back-patch
f039eaac7131ef2a4cf63a10cf98486f8bcd09d2,
1b812afb0eafe125b820cc3b95e7ca03821aa675, and the changes above to all
supported releases.  That would fix problems 4, 5, 7, and 8 from the
above list for all branches.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Wed, May 3, 2017 at 3:49 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> It seems pretty clear to me that we are not going to fix all of these
> issues in one patch.  Here's a sketch of an idea for how to start
> making things better:

Patch attached.

> - Add an in_abort_cleanup flag to ConnCacheEntry.

Ended up renaming this to abort_cleanup_incomplete.

> - Before pgfdw_xact_callback() or pgfdw_subxact_callback() begin abort
> cleanup, check whether the flag is set.  If so, slam the connection
> shut unless that's already been done; furthermore, if the flag is set
> and we're in pgfdw_xact_callback (i.e. this is a toplevel abort),
> forget about the connection entry entirely.  On the other hand, if the
> flag is not set, set it flag and attempt abort cleanup.  If we
> succeed, clear the flag.

Did approximately this.  It turned out not to be necessary to add any
new calls to PQfinish(); the existing one was adequate.

> - Before pgfdw_xact_callback() or pgfdw_subxact_callback() begin
> pre-commit processing, check whether the flag is set.  If so, throw an
> ERROR, so that we switch over to abort processing.

Did this.

> - Change uses of PQexec() in the abort path to use pgfdw_exec_query()
> instead.  If we exit pgfdw_exec_query() with an error, we'll re-enter
> the abort path, but now in_abort_cleanup will be set, so we'll just
> drop the connection (and force any outer transaction levels to abort
> as well).

Created a new function pgfdw_exec_cleanup_query() for this, also
incorporating a timeout.

Also fixed things so that after issuing PQcancel() we actually throw
away the pending result from the cancelled query, and added some error
recursion protection.
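
For readers less familiar with that corner of libpq, the pattern involved
looks roughly like the sketch below (illustrative only, not the patch
itself): send the cancel request, then discard any pending results so the
connection returns to an idle state.

#include <stdio.h>
#include <libpq-fe.h>

static void
cancel_and_drain(PGconn *conn)
{
    PGcancel   *cancel = PQgetCancel(conn);
    char        errbuf[256];
    PGresult   *res;

    if (cancel != NULL)
    {
        if (!PQcancel(cancel, errbuf, sizeof(errbuf)))
            fprintf(stderr, "could not send cancel request: %s\n", errbuf);
        PQfreeCancel(cancel);
    }

    /* the cancelled query still produces results; throw them all away */
    while ((res = PQgetResult(conn)) != NULL)
        PQclear(res);
}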

Review would be appreciated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On 05/04/2017 08:01 AM, Robert Haas wrote:
> Patch attached.

I tried it at my end after applying the patch against PG HEAD.

Case 1 - without setting statement_timeout i.e default

X machine -
create table test1(a int);

Y machine -
CREATE SERVER myserver_ppas FOREIGN DATA WRAPPER postgres_fdw OPTIONS 
(host 'X', dbname 'postgres', port '5432', connect_timeout '3');
CREATE USER MAPPING FOR centos SERVER myserver_ppas OPTIONS (user 
'centos', password 'adminedb');
create foreign table ft_test_ppas (a  int ) server myserver_ppas options (table_name 'test1');
statement_timeout =0;
\timing
insert  into ft_test_ppas  values  (generate_series(1,10000000));

X machine-
disconnect network

Y machine -
postgres=# insert  into ft_test_ppas  values (generate_series(1,10000000));
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
^CCancel request sent
WARNING:  could not send cancel request: PQcancel() -- connect() failed: 
Connection timed out

ERROR:  canceling statement due to user request
Time: 81073.872 ms (01:21.074)

Case 2- when statement_timeout=6000

Y machine -
CREATE SERVER myserver FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 
'X', dbname 'postgres', port '5432',keepalives '1', keepalives_interval 
'3',keepalives_idle '3', keepalives_count '1');
CREATE USER MAPPING FOR centos SERVER myserver OPTIONS (user 'centos', 
password 'adminedb');
create foreign table ft_test_ppas1 (a  int ) server myserver options 
(table_name 'test1');
set statement_timeout=6000;
\timing
insert  into ft_test_ppas1  values  (generate_series(1,10000000));

X machine-
disconnect network

Y machine
postgres=# insert  into ft_test_ppas1  values 
(generate_series(1,10000000));
WARNING:  could not send cancel request: PQcancel() -- connect() failed: 
Connection timed out

ERROR:  canceling statement due to statement timeout
Time: 69009.875 ms (01:09.010)
postgres=#

Case 3-when statement_timeout=20000

Y machine -
CREATE SERVER myserver FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 
'X', dbname 'postgres', port '5432',keepalives '1', keepalives_interval 
'3',keepalives_idle '3', keepalives_count '1');
CREATE USER MAPPING FOR centos SERVER myserver OPTIONS (user 'centos', 
password 'adminedb');
create foreign table ft_test_ppas1 (a  int ) server myserver options 
(table_name 'test1');
set statement_timeout=20000;
\timing
insert  into ft_test_ppas1  values  (generate_series(1,10000000));

X machine-
disconnect network

Y machine -
postgres=# insert  into ft_test_ppas1  values 
(generate_series(1,10000000));
WARNING:  could not send cancel request: PQcancel() -- connect() failed: 
Connection timed out
ERROR:  canceling statement due to statement timeout
Time: 83014.503 ms (01:23.015)

We can see statement_timeout is working, but it is taking some extra
time; I'm not sure whether this is expected behavior in the above case or not.

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




On 05/04/2017 03:53 PM, tushar wrote:
> We can see statement_timeout is working but it is taking some extra 
> time,not sure this is an expected behavior in above case or not. 
This happens only when a remote server is involved. When both
servers are on the same machine, this works as expected.

d1=# CREATE SERVER myserver_ppas FOREIGN DATA WRAPPER postgres_fdw 
OPTIONS (host 'localhost', dbname 'postgres', port '5432', 
connect_timeout '3');
CREATE SERVER
d1=# CREATE USER MAPPING FOR centos SERVER myserver_ppas OPTIONS (user 
'centos', password 'adminedb');
CREATE USER MAPPING
d1=# create foreign table ft_test_ppas (a  int ) server myserver_ppas 
options (table_name 'test1');
CREATE FOREIGN TABLE
d1=#
d1=# insert into ft_test_ppas values (1);
INSERT 0 1
Case 1-
d1=# \timing
Timing is on.
d1=# set statement_timeout =6000;
SET
Time: 0.360 ms
d1=# insert  into ft_test_ppas  values (generate_series(1,10000000));
ERROR:  canceling statement due to statement timeout
Time: 6002.509 ms (00:06.003)
d1=#
Case 2 -
d1=# set statement_timeout =20000;
SET
Time: 0.693 ms
d1=# insert  into ft_test_ppas  values (generate_series(1,10000000));
ERROR:  canceling statement due to statement timeout
Time: 20001.741 ms (00:20.002)
d1=#

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company




On Thu, May 4, 2017 at 6:23 AM, tushar <tushar.ahuja@enterprisedb.com> wrote:
> We can see statement_timeout is working but it is taking some extra time,not
> sure this is an expected behavior in above case or not.

Yeah, that's expected.  To fix that, we'd need libpq to have an async
equivalent of PQcancel(), which doesn't currently exist.  I think it
would be a good idea to add that, but that's another discussion.  With
this patch:

1. If postgres_fdw is waiting for one of the cleanup queries to
execute, you can cancel it, and things like statement_timeout will
also work.

2. If the cancel fails or any of the cleanup queries fail,
postgres_fdw will forcibly close the connection and force the current
transaction and all subtransactions to abort.  This isn't wonderful
behavior, but if we can't talk to the remote server any more there
doesn't seem to be any other real alternative.  (We could improve
this, I think, by tracking whether the connection had ever been used by
an outer transaction level, and if not, allowing the transaction to
commit if it never again tried to access the failed connection, but I
left the design and installation of such a mechanism to a future
patch.)

3. But if you're stuck trying to send the cancel request itself, you
still have to wait for that to fail before anything else happens.
Once it does, we'll proceed as outlined in point #2.  This stinks, but
it's the inevitable consequence of having no async equivalent of
PQcancel().

My goal is to make things noticeably better with this patch, not to
fix them completely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Thu, May 4, 2017 at 1:19 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Apr 20, 2017 at 10:27 AM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> The logs above show that 34 seconds elapsed between starting to abort
>> the transaction and knowing that the foreign server is unreachable. It
>> looks like it took that much time for the local server to realise that
>> the foreign server is not reachable. Looking at PQcancel code, it
>> seems to be trying to connect to the foreign server to cancel the
>> query. But somehow it doesn't seem to honor connect_timeout setting.
>> Is that expected?
>
> Well, there's no code to do anything else.  Regular connections go
> through connectDBComplete() which uses non-blocking mode and timed
> waits to respect connection_timeout.  PQcancel() has no such handling.
> internal_cancel just opens a socket and, without setting any options
> on it, calls connect().  No loop, no timed waits, nada.  So it's going
> to take as long as it takes for the operating system to notice.
>
>> Irrespective of what PQcancel does, it looks like postgres_fdw should
>> just slam the connection if query is being aborted because of
>> statement_timeout. But then pgfdw_xact_callback() doesn't seem to have
>> a way to know whether this ABORT is because of user's request to
>> cancel the query, statement timeout, an abort because of some other
>> error or a user requested abort. Except statement timeout (may be
>> user's request to cancel the query?), it should try to keep the
>> connection around to avoid any future reconnection. But I am not able
>> to see how can we provide that information to pgfdw_xact_callback().
>
> I don't think we can.  In general, PostgreSQL has no facility for
> telling error cleanup handlers why the abort happened, and it wouldn't
> really be the right thing anyway.  The fact that statement_timeout
> fired doesn't necessarily mean that the connection is dead; it could
> equally well mean that the query ran for a long time.  I think we
> basically have two choices.  One is to bound the amount of time we
> spend performing error cleanup, and the other is just to always drop
> the connection.  A problem with the latter is that it might do the
> wrong thing if we're aborting a subtransaction but not the whole
> transaction.  In that case, we need to undo changes since the relevant
> savepoint, but not the ones made before that; closing the connection
> amounts to a full rollback.
>
> Therefore, I think the patch you proposed in
> https://www.postgresql.org/message-id/CAFjFpRdcWw4h0a-zrL-EiaekkPj8O0GR2M1FwZ1useSRfRm3-g%40mail.gmail.com
> isn't correct, because if the cancel fails it will slam the connection
> shut regardless of whether we're in pgfdw_xact_callback or
> pgfdw_subxact_callback.
>
> It looks to me like there's a fairly lengthy list of conceptually
> separate but practically related problems here:
>
> 1. PQcancel() ignores the keepalive parameters, because it makes no
> attempt to set the relevant socket options before connecting.
>
> 2. PQcancel() ignores connection_timeout, because it doesn't use
> non-blocking mode and has no wait loop.
>
> 3. There is no asynchronous equivalent of PQcancel(), so we can't use
> a loop to allow responding to interrupts while the cancel is
> outstanding (see pgfdw_get_result for an example).
>
> 4. pgfdw_xact_callback() and pgfdw_subxact_callback() use PQexec()
> rather than the asynchronous interfaces for sending queries and
> checking for results, so the SQL commands they send are not
> interruptible.
>
> 5. pgfdw_xact_callback() and pgfdw_subxact_callback() try to get the
> connection back to a good state by using ABORT TRANSACTION for the
> former and ROLLBACK TO SAVEPOINT and RELEASE SAVEPOINT for the latter.
> But if it doesn't work, then they just emit a WARNING and continue on
> as if they'd succeeded.  That seems highly likely to make the next use
> of that connection fail in unpredictable ways.
>

In pgfdw_xact_callback, if the execution of ABORT TRANSACTION fails
for any reason, then I think it will close the connection.  The
relevant code is:
if (PQstatus(entry->conn) != CONNECTION_OK ||
PQtransactionStatus(entry->conn) != PQTRANS_IDLE)

Basically, if abort transaction fails then transaction status won't be
PQTRANS_IDLE.  Also, does it matter if pgfdw_subxact_callback fails
during execution of "ROLLBACK TO SAVEPOINT"?  Anyway, the user needs
to execute a rollback-to-savepoint command before proceeding, and that
should be good enough?

About patch:

+ /* Rollback all remote subtransactions during abort */
+ snprintf(sql, sizeof(sql),
+ "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
+ curlevel, curlevel);
+ if (!pgfdw_exec_cleanup_query(entry->conn, sql))
+ abort_cleanup_failure = true;

If the above execution fails for any reason, then it won't allow
even committing the main transaction, which I am not sure is the right
thing to do.  I have tried it by manually editing the above command in the
debugger, and the result is as below:

Create a foreign table and try the below scenario.

postgres=# begin;
BEGIN
postgres=# savepoint s1;
SAVEPOINT
postgres=# insert into foreign_test values(8);
INSERT 0 1
postgres=# savepoint s2;
SAVEPOINT
postgres=# insert into foreign_test values(generate_series(1,1000000));
Cancel request sent
WARNING:  syntax error at or near "OOLLBACK"
ERROR:  canceling statement due to user request

-- In the debugger, I have changed ROLLBACK to mimic the failure.

postgres=# Rollback to savepoint s2;
ROLLBACK
postgres=# commit;
ERROR:  connection to server "foreign_server" was lost

Considering the above, I am not sure if we should treat all failures
during abort of a subtransaction as indicating a pending-cleanup state and
then disallow further commits.

#include "utils/memutils.h"
-
+#include "utils/syscache.h"

Looks like spurious line removal

> 6. postgres_fdw doesn't use PQsetnonblocking() anywhere, so even if we
> converted pgfdw_xact_callback() and pgfdw_subxact_callback() to use
> pgfdw_exec_query() or something like it rather than PQexec() to submit
> queries, they might still block if we fail to send the query, as the
> comments for pgfdw_exec_query() explain.  This is not possibly but not
> particularly likely to happen for queries being sent out of error
> handling paths, because odds are good that the connection was sent
> just before.  However, if the user is pressing ^C because the remote
> server isn't responding, it's quite probable that we'll run into this
> exact issue.
>
> 7. postgres_fdw never considers slamming the connection shut as a
> response to trouble.  It seems pretty clear that this is only a safe
> response if we're aborting the toplevel transaction.  If we did it for
> a subtransaction, we'd end up reconnecting if the server were accessed
> again, which would at the very least change the snapshot (note that we
> use at least REPEATABLE READ on the remote side regardless of the
> local isolation level) and might discard changes made on the remote
> side at outer transaction levels.  Even for a top-level transaction,
> it might not always be the right way to proceed, as it incurs some
> overhead to reconnect, but it seems worth considering at least in the
> case where we've already got some indication that the connection is
> unhealthy (e.g. a previous attempt to clean up the connection state
> failed, as in #5, or PQcancel failed, as in Ashutosh's proposed
> patch).
>




> 8. Before 9.6, PQexec() is used in many more places, so many more
> queries are entirely non-interruptible.
>
> It seems pretty clear to me that we are not going to fix all of these
> issues in one patch.  Here's a sketch of an idea for how to start
> making things better:
>
> - Add an in_abort_cleanup flag to ConnCacheEntry.
>
> - Before pgfdw_xact_callback() or pgfdw_subxact_callback() begin abort
> cleanup, check whether the flag is set.  If so, slam the connection
> shut unless that's already been done; furthermore, if the flag is set
> and we're in pgfdw_xact_callback (i.e. this is a toplevel abort),
> forget about the connection entry entirely.  On the other hand, if the
> flag is not set, set it flag and attempt abort cleanup.  If we
> succeed, clear the flag.
>
> - Before pgfdw_xact_callback() or pgfdw_subxact_callback() begin
> pre-commit processing, check whether the flag is set.  If so, throw an
> ERROR, so that we switch over to abort processing.
>
> - Change uses of PQexec() in the abort path to use pgfdw_exec_query()
> instead.  If we exit pgfdw_exec_query() with an error, we'll re-enter
> the abort path, but now in_abort_cleanup will be set, so we'll just
> drop the connection (and force any outer transaction levels to abort
> as well).
>
> - For bonus points, give pgfdw_exec_query() an optional timeout
> argument, and set it to 30 seconds or so when we're doing abort
> cleanup.  If the timeout succeeds, it errors out (which again
> re-enters the abort path, dropping the connection and forcing outer
> transaction levels to abort as well).
>

Why do we need this 30 seconds timeout for abort cleanup failures?
Isn't it sufficient if we propagate the error only on failures?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Thu, May 4, 2017 at 7:13 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> In pgfdw_xact_callback, if the execution of ABORT TRANSACTION fails
> due to any reason then I think it will close the connection.  The
> relavant code is:
> if (PQstatus(entry->conn) != CONNECTION_OK ||
> PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
>
> Basically, if abort transaction fails then transaction status won't be
> PQTRANS_IDLE.

I don't think that's necessarily true.  Look at this code:
            /* Rollback all remote subtransactions during abort */
            snprintf(sql, sizeof(sql),
                     "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
                     curlevel, curlevel);
            res = PQexec(entry->conn, sql);
            if (PQresultStatus(res) != PGRES_COMMAND_OK)
                pgfdw_report_error(WARNING, res, entry->conn, true, sql);
            else
                PQclear(res);

If that sql command somehow returns a result status other than
PGRES_COMMAND_OK, like say due to a local OOM failure or whatever, the
connection is PQTRANS_IDLE but the rollback wasn't actually done.

> Also, does it matter if pgfdw_subxact_callback fails
> during execution of "ROLLBACK TO SAVEPOINT", because anyway user needs
> to execute rollback to savepoint command before proceeding and that
> should be good enough?

I don't understand this.  If pgfdw_subxact_callback doesn't roll the
remote side back to the savepoint, how is it going to get done later?
There's no code in postgres_fdw other than that code to issue the
rollback.

> About patch:
>
> + /* Rollback all remote subtransactions during abort */
> + snprintf(sql, sizeof(sql),
> + "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
> + curlevel, curlevel);
> + if (!pgfdw_exec_cleanup_query(entry->conn, sql))
> + abort_cleanup_failure = true;
>
> If the above execution fails due to any reason, then it won't allow
> even committing the mail transaction which I am not sure is the right
> thing to do.

Well, as I said in my reply to Tushar, I think it's the only correct
thing to do.  Suppose you do the following:

(1) make a change to the foreign table
(2) enter a subtransaction
(3) make another change to the foreign table
(4) roll back the subtransaction
(5) commit the transaction

If the remote rollback gets skipped in step 4, it is dead wrong for
step 5 to commit the remote transaction anyway.  Both the change made
in step (1) and the change made in step (3) would get made on the
remote side, which is 100% wrong.  Given the failure of the rollback
in step 4, there is no way to achieve the desired outcome of
committing the step (1) change and rolling back the step (3) change.
The only way forward is to roll back everything - i.e. force a
toplevel abort.
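
Written out as a psql session against the foreign_test table from the
earlier example (illustrative only), the sequence is:

postgres=# BEGIN;
postgres=# INSERT INTO foreign_test VALUES (1);   -- (1) change via the foreign table
postgres=# SAVEPOINT s1;                          -- (2) enter a subtransaction
postgres=# INSERT INTO foreign_test VALUES (2);   -- (3) another change
postgres=# ROLLBACK TO SAVEPOINT s1;              -- (4) must also roll back remotely
postgres=# COMMIT;                                -- (5) only the step-(1) change may commit

If the remote ROLLBACK TO SAVEPOINT in step (4) is silently skipped, the
COMMIT in step (5) persists both inserts on the remote side.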

>  #include "utils/memutils.h"
> -
> +#include "utils/syscache.h"
>
> Looks like spurious line removal

OK.

>> - For bonus points, give pgfdw_exec_query() an optional timeout
>> argument, and set it to 30 seconds or so when we're doing abort
>> cleanup.  If the timeout succeeds, it errors out (which again
>> re-enters the abort path, dropping the connection and forcing outer
>> transaction levels to abort as well).
>
> Why do we need this 30 seconds timeout for abort cleanup failures?
> Isn't it sufficient if we propagate the error only on failures?

Well, the problem is that right now we're waiting for vast amounts of
time when the remote connection dies.  First, the user notices that
the command is running for too long and hits ^C.  At that point, the
connection is probably already dead, and the user's been waiting for a
while already.  Then, we wait for PQcancel() to time out.  Then, we
wait for PQexec() to time out.  The result of that, as Suraj mentioned
in the original email, is that it takes about 16 minutes for the
session to recover when the connection dies, even with keepalive
settings and connection timeout set all the way to the minimum.  The
goal here is to make it take a more reasonable time to recover.
Without modifying libpq, we can't do anything about the need to wait
for PQcancel() to time out, but we can improve the rest of it.  If we
removed that 30-second timeout, we would win in the case where either
ABORT TRANSACTION or ROLLBACK; RELEASE eventually succeeds but takes
more than 30 seconds.  In that case, the change you are proposing
would allow us to reuse the connection instead of dropping it, and
would prevent a forced toplevel abort when subtransactions are in use.
However, the cost would be that when one of those commands never
succeeds, we would wait a lot longer before finishing a transaction
abort.  I argue that if the remote server fails to respond to one of
those commands within 30 seconds, it's probably dead or nearly dead,
and we're probably better off just treating it as dead rather than
waiting longer.  That's a judgement call, of course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Thu, May 4, 2017 at 8:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, May 4, 2017 at 7:13 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>> - For bonus points, give pgfdw_exec_query() an optional timeout
>>> argument, and set it to 30 seconds or so when we're doing abort
>>> cleanup.  If the timeout succeeds, it errors out (which again
>>> re-enters the abort path, dropping the connection and forcing outer
>>> transaction levels to abort as well).
>>
>> Why do we need this 30 seconds timeout for abort cleanup failures?
>> Isn't it sufficient if we propagate the error only on failures?
>
> Well, the problem is that right now we're waiting for vast amounts of
> time when the remote connection dies.  First, the user notices that
> the command is running for too long and hits ^C.  At that point, the
> connection is probably already dead, and the user's been waiting for a
> while already.  Then, we wait for PQcancel() to time out.  Then, we
> wait for PQexec() to time out.  The result of that, as Suraj mentioned
> in the original email, is that it takes about 16 minutes for the
> session to recover when the connection dies, even with keepalive
> settings and connection timeout set all the way to the minimum.  The
> goal here is to make it take a more reasonable time to recover.
> Without modifying libpq, we can't do anything about the need to wait
> for PQcancel() to time out, but we can improve the rest of it.  If we
> removed that 30-second timeout, we would win in the case where either
> ABORT TRANSACTION or ROLLBACK; RELEASE eventually succeeds but takes
> more than 30 seconds.  In that case, the change you are proposing
> would allow us to reuse the connection instead of dropping it, and
> would prevent a forced toplevel abort when subtransactions are in use.
> However, the cost would be that when one of those commands never
> succeeds, we would wait a lot longer before finishing a transaction
> abort.
>

As soon as the first command fails due to timeout, we will set
'abort_cleanup_failure', which will cause the toplevel transaction to
abort and also won't allow other statements to execute.  The patch is
trying to enforce a 30-second timeout after statement execution, so it
has to wait until the command execution completes anyway (irrespective
of whether the command succeeds or fails).  It is quite possible I am
missing some point; could you explain in somewhat more detail in which
exact case the 30-second timeout will allow us to abort the toplevel
transaction, given that we already do that on statement failure?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Thu, May 4, 2017 at 12:18 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> As soon as the first command fails due to timeout, we will set
> 'abort_cleanup_failure' which will make a toplevel transaction to
> abort and also won't allow other statements to execute.  The patch is
> trying to enforce a 30-second timeout after statement execution, so it
> has to anyway wait till the command execution completes (irrespective
> of whether the command succeeds or fail).

I don't understand what you mean by this.  If the command doesn't
finish within 30 seconds, we *don't* wait for it to finish.  To avoid
getting inconsistent results in consequence, we set
abort_cleanup_failure.

> It is quite possible I am
> missing some point, is it possible for you to tell in somewhat more
> detail in which exact case 30-second timeout will allow us to abort
> the toplevel transaction if we already do that in the case of
> statement failure case?

Suppose the user's original connection is stuck for some reason but
the postmaster is still accepting new connections - e.g. somebody
attached gdb to the remote server process and it is therefore stopped.
The cancel request gets sent OK, but without the 30-second timeout,
we'll hang forever (or until gdb continues or detaches the process)
waiting for it to take effect.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Thu, May 4, 2017 at 10:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, May 4, 2017 at 12:18 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> As soon as the first command fails due to timeout, we will set
>> 'abort_cleanup_failure' which will make a toplevel transaction to
>> abort and also won't allow other statements to execute.  The patch is
>> trying to enforce a 30-second timeout after statement execution, so it
>> has to anyway wait till the command execution completes (irrespective
>> of whether the command succeeds or fail).
>
> I don't understand what you mean by this.  If the command doesn't
> finish within 30 seconds, we *don't* wait for it to finish.
>

+ /*
+ * Submit a query.  Since we don't use non-blocking mode, this also can
+ * block.  But its risk is relatively small, so we ignore that for now.
+ */
+ if (!PQsendQuery(conn, query))
+ {
+ pgfdw_report_error(WARNING, NULL, conn, false, query);
+ return false;
+ }
+
+ /* Get the result of the query. */
+ if (pgfdw_get_cleanup_result(conn, endtime, &result))
+ return false;

The 30-second logic is in the function pgfdw_get_cleanup_result; can't
the command hang in PQsendQuery?



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Thu, May 4, 2017 at 1:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, May 4, 2017 at 10:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Thu, May 4, 2017 at 12:18 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> As soon as the first command fails due to timeout, we will set
>>> 'abort_cleanup_failure' which will make a toplevel transaction to
>>> abort and also won't allow other statements to execute.  The patch is
>>> trying to enforce a 30-second timeout after statement execution, so it
>>> has to anyway wait till the command execution completes (irrespective
>>> of whether the command succeeds or fail).
>>
>> I don't understand what you mean by this.  If the command doesn't
>> finish within 30 seconds, we *don't* wait for it to finish.
>>
>
> + /*
> + * Submit a query.  Since we don't use non-blocking mode, this also can
> + * block.  But its risk is relatively small, so we ignore that for now.
> + */
> + if (!PQsendQuery(conn, query))
> + {
> + pgfdw_report_error(WARNING, NULL, conn, false, query);
> + return false;
> + }
> +
> + /* Get the result of the query. */
> + if (pgfdw_get_cleanup_result(conn, endtime, &result))
> + return false;
>
> The 30 seconds logic is in function pgfdw_get_cleanup_result, can't
> the command hang in PQsendQuery?

Sure.  I thought about trying to fix that too, by using
PQsetnonblocking(), but I thought the patch was doing enough already.
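
For the record, a non-blocking submit along those lines could look
something like the sketch below; the names are illustrative, and the
flush loop would need the same kind of deadline as the result wait:

#include <libpq-fe.h>
#include <poll.h>
#include <stdbool.h>

/*
 * Submit a query without risking an indefinite block inside
 * PQsendQuery()/PQflush().  Returns false if the query could not be
 * sent within the given per-wait timeout.
 */
static bool
send_query_nonblocking(PGconn *conn, const char *query, int timeout_ms)
{
    if (PQsetnonblocking(conn, 1) != 0)
        return false;

    if (!PQsendQuery(conn, query))
        return false;

    for (;;)
    {
        struct pollfd pfd = { .fd = PQsocket(conn), .events = POLLOUT };
        int     rc = PQflush(conn);

        if (rc == 0)
            return true;                /* everything handed to the kernel */
        if (rc < 0)
            return false;               /* error while sending */

        /* Output is blocked; wait (bounded) for the socket to be writable. */
        if (poll(&pfd, 1, timeout_ms) <= 0)
            return false;               /* timed out or poll() failed */
    }
}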

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Thu, May 4, 2017 at 8:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, May 4, 2017 at 7:13 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> In pgfdw_xact_callback, if the execution of ABORT TRANSACTION fails
>> due to any reason then I think it will close the connection.  The
>> relevant code is:
>> if (PQstatus(entry->conn) != CONNECTION_OK ||
>> PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
>>
>> Basically, if abort transaction fails then transaction status won't be
>> PQTRANS_IDLE.
>
> I don't think that's necessarily true.  Look at this code:
>
>              /* Rollback all remote subtransactions during abort */
>              snprintf(sql, sizeof(sql),
>                       "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
>                       curlevel, curlevel);
>              res = PQexec(entry->conn, sql);
>              if (PQresultStatus(res) != PGRES_COMMAND_OK)
>                  pgfdw_report_error(WARNING, res, entry->conn, true, sql);
>              else
>                  PQclear(res);
>
> If that sql command somehow returns a result status other than
> PGRES_COMMAND_OK, like say due to a local OOM failure or whatever, the
> connection is PQTRANS_IDLE but the rollback wasn't actually done.
>

I have tried to verify the above point, and it seems to me that in such
a situation the transaction status will be either PQTRANS_INTRANS or
PQTRANS_INERROR.  I tried to mimic an "out of memory" failure by making
PQmakeEmptyPGresult return NULL (malloc failure).  AFAIU, libpq sets the
state to PQTRANS_IDLE only when the statement is executed successfully.
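
As a quick client-side illustration (an ordinary command error rather
than the malloc hack above, but the reporting is the same), something
like this shows that libpq does not report PQTRANS_IDLE once a command
has failed inside a transaction:

#include <libpq-fe.h>
#include <stdio.h>

/* Illustrative probe only; the savepoint name is deliberately bogus. */
static void
probe_xact_state(PGconn *conn)
{
    PGresult   *res;

    PQclear(PQexec(conn, "BEGIN"));

    res = PQexec(conn, "ROLLBACK TO SAVEPOINT nosuch");   /* fails */
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        printf("idle after failure? %s\n",
               PQtransactionStatus(conn) == PQTRANS_IDLE ? "yes" : "no");
    PQclear(res);

    PQclear(PQexec(conn, "ROLLBACK"));
}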

>> Also, does it matter if pgfdw_subxact_callback fails
>> during execution of "ROLLBACK TO SAVEPOINT", because anyway user needs
>> to execute rollback to savepoint command before proceeding and that
>> should be good enough?
>
> I don't understand this.  If pgfdw_subxact_callback doesn't roll the
> remote side back to the savepoint, how is it going to get done later?
> There's no code in postgres_fdw other than that code to issue the
> rollback.
>

It is done via the toplevel transaction ABORT.  I think the way it works
is that after a "ROLLBACK TO SAVEPOINT" failure, any subsequent
execution is going to fail.  Say we issue COMMIT after the failure: it
will try to execute RELEASE SAVEPOINT, which will fail and lead to a
toplevel transaction ABORT.  Now if the toplevel transaction ABORT also
fails, it will drop the connection.

>> About patch:
>>
>> + /* Rollback all remote subtransactions during abort */
>> + snprintf(sql, sizeof(sql),
>> + "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
>> + curlevel, curlevel);
>> + if (!pgfdw_exec_cleanup_query(entry->conn, sql))
>> + abort_cleanup_failure = true;
>>
>> If the above execution fails due to any reason, then it won't allow
>> even committing the main transaction which I am not sure is the right
>> thing to do.
>
> Well, as I said in my reply to Tushar, I think it's the only correct
> thing to do.  Suppose you do the following:
>
> (1) make a change to the foreign table
> (2) enter a subtransaction
> (3) make another change to the foreign table
> (4) roll back the subtransaction
> (5) commit the transaction
>
> If the remote rollback gets skipped in step 4, it is dead wrong for
> step 5 to commit the remote transaction anyway.  Both the change made
> in step (1) and the change made in step (3) would get made on the
> remote side, which is 100% wrong.  Given the failure of the rollback
> in step 4, there is no way to achieve the desired outcome of
> committing the step (1) change and rolling back the step (3) change.
> The only way forward is to roll back everything - i.e. force a
> toplevel abort.
>

Agreed, in such a situation a toplevel transaction abort is the only way
out.  However, it seems to me that will happen anyway in the current
code, even if we don't try to force it via abort_cleanup_failure.  I
understand that it might be better to force it as you have done in the
patch, but I am not sure it is good to drop the connection where
previously we could have done without that (for example, if the first
statement, ROLLBACK TO SAVEPOINT, fails but ABORT succeeds).  I feel it
is worth considering whether such a behavior difference is okay, given
that you have proposed to back-patch it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Thu, May 4, 2017 at 10:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, May 4, 2017 at 1:04 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Thu, May 4, 2017 at 10:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Thu, May 4, 2017 at 12:18 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>> As soon as the first command fails due to timeout, we will set
>>>> 'abort_cleanup_failure' which will make a toplevel transaction to
>>>> abort and also won't allow other statements to execute.  The patch is
>>>> trying to enforce a 30-second timeout after statement execution, so it
>>>> has to anyway wait till the command execution completes (irrespective
>>>> of whether the command succeeds or fail).
>>>
>>> I don't understand what you mean by this.  If the command doesn't
>>> finish within 30 seconds, we *don't* wait for it to finish.
>>>
>>
>> + /*
>> + * Submit a query.  Since we don't use non-blocking mode, this also can
>> + * block.  But its risk is relatively small, so we ignore that for now.
>> + */
>> + if (!PQsendQuery(conn, query))
>> + {
>> + pgfdw_report_error(WARNING, NULL, conn, false, query);
>> + return false;
>> + }
>> +
>> + /* Get the result of the query. */
>> + if (pgfdw_get_cleanup_result(conn, endtime, &result))
>> + return false;
>>
>> The 30 seconds logic is in function pgfdw_get_cleanup_result, can't
>> the command hang in PQsendQuery?
>
> Sure.  I thought about trying to fix that too, by using
> PQsetnonblocking(), but I thought the patch was doing enough already.
>

Okay, it seems better to deal with it in a separate patch.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Fri, May 5, 2017 at 1:01 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> I have tried to verify above point and it seems to me in such a
> situation the transaction status will be either PQTRANS_INTRANS or
> PQTRANS_INERROR.  I have tried to mimic "out of memory" failure by
> making PQmakeEmptyPGresult return NULL (malloc failure).  AFAIU, it
> will make the state as PQTRANS_IDLE only when the statement is
> executed successfully.

Hrm.  OK, you might be right, but still I don't see why having the 30
second timeout is bad.  People don't want a transaction rollback to
hang for a long period of time waiting for ABORT TRANSACTION to run if
we could've just dropped the connection to get the same effect.  Sure,
the next transaction will then have to reconnect, but that takes a few
tens of milliseconds; if you wait for one extra second to avoid that
outcome, you come out behind even if you do manage to avoid the
reconnect.  Now for a subtransaction you could argue that it's worth
being more patient, because forcing a toplevel abort sucks.  But what
are the chances that, after failing to respond to a ROLLBACK; RELEASE
SAVEPOINT within 30 seconds, the remote side is going to wake up again
and start behaving normally?  I'd argue that it's not terribly likely
and most people aren't going to want to wait for multiple minutes to
see whether it does, but I suppose we could impose the 30 second
timeout only at the toplevel and let subtransactions wait however long
the operating system takes to time out.

>>> Also, does it matter if pgfdw_subxact_callback fails
>>> during execution of "ROLLBACK TO SAVEPOINT", because anyway user needs
>>> to execute rollback to savepoint command before proceeding and that
>>> should be good enough?
>>
>> I don't understand this.  If pgfdw_subxact_callback doesn't roll the
>> remote side back to the savepoint, how is it going to get done later?
>> There's no code in postgres_fdw other than that code to issue the
>> rollback.
>>
> It is done via toplevel transaction ABORT.  I think how it works is
> after "ROLLBACK TO SAVEPOINT" failure, any next execution is going to
> fail.  Let us say if we issue Commit after failure, it will try to
> execute RELEASE SAVEPOINT which will fail and lead to toplevel
> transaction ABORT.  Now if the toplevel transaction ABORT also fails,
> it will drop the connection.

Mmmph.  I guess that would work, but I think it's better to track it
explicitly.  I tried this (bar is a foreign table):

begin;
select * from bar;
savepoint s1;
select * from bar;
savepoint s2;
select * from bar;
savepoint s3;
select * from bar;
commit;

On the remote side, the commit statement produces this:

2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: RELEASE SAVEPOINT s4
2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: RELEASE SAVEPOINT s3
2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: RELEASE SAVEPOINT s2
2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: COMMIT TRANSACTION

But that's obviously silly.  Somebody could easily come along and
optimize that by getting rid of the RELEASE SAVEPOINT commands which
are completely unnecessary if we're going to commit the toplevel
transaction anyway, and then the argument that you're making wouldn't
be true any more.  It seems smart to me to explicitly track whether
the remote side is known to be in a bad state rather than relying on
the future sequence of commands to be such that we'll get the right
result anyway.

> Agreed, in such a situation toplevel transaction abort is the only way
> out.  However, it seems to me that will anyway happen in current code
> even if we don't try to force it via abort_cleanup_failure.  I
> understand that it might be better to force it as you have done in the
> patch, but not sure if it is good to drop the connection where
> previously it could have done without it (for ex. if the first
> statement Rollback To Savepoint fails, but ABORT succeeds).  I feel it
> is worth considering whether such a behavior difference is okay as you
> have proposed to backpatch it.

Well, I think the whole point of this patch is that if one command on
a connection fails, the chances of future commands succeeding are
pretty poor.  It's not impossible, but it's not very likely either.
To recap, the problem at present is that if network connectivity is
interrupted and the statement is then killed by statement_timeout or a
local cancel request, we try to send a cancel request to the remote
side, which takes a while to time out.  Even if that fails, we then
send an ABORT TRANSACTION, hoping against hope that it will succeed
despite the failure of the cancel request.  The user ends up waiting
for the cancel request to time out and then for the ABORT TRANSACTION
to time out, which takes 15-16 minutes no matter how you set the
keepalive settings and no matter how many times you hit ^C on the
local side.

Now, if you want to allow for the possibility that some future
operation might succeed never mind past failures, then you're
basically saying that trying the ABORT TRANSACTION even though the
query cancel has failed is the correct behavior.  And if you don't
want the ABORT TRANSACTION to have a 30 second timeout either, then
you're saying we should wait however long it takes the operating
system to detect the failure, which again is the current behavior.  If
we rip those things out of the patch, then there's not much left of
the patch.  The only thing that would be left would be making the wait
for ABORT TRANSACTION interruptible, so that if you hit ^C *again*
after you've already canceled the query, we then give up on waiting
for the ABORT TRANSACTION.  I guess that would be an improvement over
the status quo, but it's not really a solution.  People don't expect
to have to send multiple cancel requests to kill the same query in a
timely fashion, and if the commands are being sent by an application
rather than by a human being it probably won't happen.

Do you want to propose a competing patch to fix this problem?  I'm
happy to listen to suggestions.  But I think if you're determined to
say "let's rely entirely on the OS timeouts rather than having any of
our own" and you're also determined to say "let's not assume that a
failure of one operation means that the connection is dead", then
you're basically arguing that there's nothing really broken here, and
I don't agree with that.  Aborting a transaction shouldn't take a
quarter of an hour.

You could perhaps argue that it's better to make only one of the two
changes I'm proposing and not both, but I don't really see it.  If you
make the abort_cleanup_failure changes but don't apply the 30 second
timeout, then suppose the cancel request works but the connection is
dead and the ABORT TRANSACTION command takes five minutes to time out.
Well, if you'd had the timeout, you would have just closed the
connection after 30 seconds and the only thing it would have cost you
is a reconnect the next time that foreign server was used.  Instead,
you waited 5 minutes in the hope of saving a reconnect cycle that
probably would have taken tens of milliseconds or less.  Blech.  On
the other hand, if you make the 30 second timeout changes but not the
abort_cleanup_failure changes, then suppose network connectivity is
entirely interrupted.  Then you'll wait however long it takes for the
cancel to succeed plus another 30 seconds for the ABORT TRANSACTION,
whereas with the patch as proposed you only wait for the first one.
Also blech.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Fri, May 5, 2017 at 7:44 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, May 5, 2017 at 1:01 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> I have tried to verify above point and it seems to me in such a
>> situation the transaction status will be either PQTRANS_INTRANS or
>> PQTRANS_INERROR.  I have tried to mimic "out of memory" failure by
>> making PQmakeEmptyPGresult return NULL (malloc failure).  AFAIU, it
>> will make the state as PQTRANS_IDLE only when the statement is
>> executed successfully.
>
> Hrm.  OK, you might be right, but still I don't see why having the 30
> second timeout is bad.  People don't want a transaction rollback to
> hang for a long period of time waiting for ABORT TRANSACTION to run if
> we could've just dropped the connection to get the same effect.  Sure,
> the next transaction will then have to reconnect, but that takes a few
> tens of milliseconds; if you wait for one extra second to avoid that
> outcome, you come out behind even if you do manage to avoid the
> reconnect.  Now for a subtransaction you could argue that it's worth
> being more patient, because forcing a toplevel abort sucks.  But what
> are the chances that, after failing to respond to a ROLLBACK; RELEASE
> SAVEPOINT within 30 seconds, the remote side is going to wake up again
> and start behaving normally?  I'd argue that it's not terribly likely
> and most people aren't going to want to wait for multiple minutes to
> see whether it does, but I suppose we could impose the 30 second
> timeout only at the toplevel and let subtransactions wait however long
> the operating system takes to time out.
>
>>>> Also, does it matter if pgfdw_subxact_callback fails
>>>> during execution of "ROLLBACK TO SAVEPOINT", because anyway user needs
>>>> to execute rollback to savepoint command before proceeding and that
>>>> should be good enough?
>>>
>>> I don't understand this.  If pgfdw_subxact_callback doesn't roll the
>>> remote side back to the savepoint, how is it going to get done later?
>>> There's no code in postgres_fdw other than that code to issue the
>>> rollback.
>>>
>> It is done via toplevel transaction ABORT.  I think how it works is
>> after "ROLLBACK TO SAVEPOINT" failure, any next execution is going to
>> fail.  Let us say if we issue Commit after failure, it will try to
>> execute RELEASE SAVEPOINT which will fail and lead to toplevel
>> transaction ABORT.  Now if the toplevel transaction ABORT also fails,
>> it will drop the connection.
>
> Mmmph.  I guess that would work, but I think it's better to track it
> explicitly.  I tried this (bar is a foreign table):
>
> begin;
> select * from bar;
> savepoint s1;
> select * from bar;
> savepoint s2;
> select * from bar;
> savepoint s3;
> select * from bar;
> commit;
>
> On the remote side, the commit statement produces this:
>
> 2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: RELEASE SAVEPOINT s4
> 2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: RELEASE SAVEPOINT s3
> 2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: RELEASE SAVEPOINT s2
> 2017-05-05 09:28:52.597 EDT [80263] LOG:  statement: COMMIT TRANSACTION
>
> But that's obviously silly.  Somebody could easily come along and
> optimize that by getting rid of the RELEASE SAVEPOINT commands which
> are completely unnecessary if we're going to commit the toplevel
> transaction anyway, and then the argument that you're making wouldn't
> be true any more.  It seems smart to me to explicitly track whether
> the remote side is known to be in a bad state rather than relying on
> the future sequence of commands to be such that we'll get the right
> result anyway.
>

Sure, tracking it explicitly might be a better way to deal with it, but
always dropping the connection in case of error might not be the best
thing to do.

>> Agreed, in such a situation toplevel transaction abort is the only way
>> out.  However, it seems to me that will anyway happen in current code
>> even if we don't try to force it via abort_cleanup_failure.  I
>> understand that it might be better to force it as you have done in the
>> patch, but not sure if it is good to drop the connection where
>> previously it could have done without it (for ex. if the first
>> statement Rollback To Savepoint fails, but ABORT succeeds).  I feel it
>> is worth considering whether such a behavior difference is okay as you
>> have proposed to backpatch it.
>
> Well, I think the whole point of this patch is that if one command on
> a connection fails, the chances of future commands succeeding are
> pretty poor.  It's not impossible, but it's not very likely either.
> To recap, the problem at present is that if network connectivity is
> interrupted and the statement is then killed by statement_timeout or a
> local cancel request, we try to send a cancel request to the remote
> side, which takes a while to time out.  Even if that fails, we then
> send an ABORT TRANSACTION, hoping against hope that it will succeed
> despite the failure of the cancel request.  The user ends up waiting
> for the cancel request to time out and then for the ABORT TRANSACTION
> to time out, which takes 15-16 minutes no matter how you set the
> keepalive settings and no matter how many times you hit ^C on the
> local side.
>
> Now, if you want to allow for the possibility that some future
> operation might succeed never mind past failures, then you're
> basically saying that trying the ABORT TRANSACTION even though the
> query cancel has failed is the correct behavior.  And if you don't
> want the ABORT TRANSACTION to have a 30 second timeout either, then
> you're saying we should wait however long it takes the operating
> system to detect the failure, which again is the current behavior.  If
> we rip those things out of the patch, then there's not much left of
> the patch.
>

I am not saying we should rip out those changes.  Most of your mail
stresses the 30-second timeout you have added in the patch to detect
timeouts during failures (rollback processing).  I have no objection to
that part of the patch, except that there is still a race condition
around PQsendQuery, and you indicated that it is better to deal with
that in a separate patch, to which I agree.  The point on which I am not
in total agreement is about failures other than the timeout.

+pgfdw_exec_cleanup_query(PGconn *conn, const char *query)
{
..
+ /* Get the result of the query. */
+ if (pgfdw_get_cleanup_result(conn, endtime, &result))
+ return false;
+
+ /* Issue a warning if not successful. */
+ if (PQresultStatus(result) != PGRES_COMMAND_OK)
+ {
+ pgfdw_report_error(WARNING, result, conn, true, query);
+ return false;
+ }
..
}

In the above code, if there is a failure due to timeout then it will
return from the first statement (pgfdw_get_cleanup_result); however, if
there is any other failure, it will issue a warning and return false.  I
am not sure there is a need to return false in the second case, because
even without that it will work fine (and there is a chance that it won't
drop the connection), but maybe your point is that it is better to be
consistent for all kinds of error handling in this path.

>  The only thing that would be left would be making the wait
> for ABORT TRANSACTION interruptible, so that if you hit ^C *again*
> after you've already canceled the query, we then give up on waiting
> for the ABORT TRANSACTION.  I guess that would be an improvement over
> the status quo, but it's not really a solution.  People don't expect
> to have to send multiple cancel requests to kill the same query in a
> timely fashion, and if the commands are being sent by an application
> rather than by a human being it probably won't happen.
>
> Do you want to propose a competing patch to fix this problem?  I'm
> happy to listen to suggestions.  But I think if you're determined to
> say "let's rely entirely on the OS timeouts rather than having any of
> our own" and you're also determined to say "let's not assume that a
> failure of one operation means that the connection is dead", then
> you're basically arguing that there's nothing really broken here, and
> I don't agree with that.
>

No, I am not saying any of those.  I hope the previous point clarifies
what I want to say.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Fri, May 5, 2017 at 11:15 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> I am not saying to rip those changes.  Your most of the mail stresses
> about the 30-second timeout which you have added in the patch to
> detect timeouts during failures (rollback processing).  I have no
> objection to that part of the patch except that still there is a race
> condition around PQsendQuery and you indicate that it is better to
> deal it as a separate patch to which I agree.

OK.

> The point of which I am
> not in total agreement is about the failure other than the time out.
>
> +pgfdw_exec_cleanup_query(PGconn *conn, const char *query)
> {
> ..
> + /* Get the result of the query. */
> + if (pgfdw_get_cleanup_result(conn, endtime, &result))
> + return false;
> +
> + /* Issue a warning if not successful. */
> + if (PQresultStatus(result) != PGRES_COMMAND_OK)
> + {
> + pgfdw_report_error(WARNING, result, conn, true, query);
> + return false;
> + }
> ..
> }
>
> In the above code, if there is a failure due to timeout then it will
> return from first statement (pgfdw_get_cleanup_result), however, if
> there is any other failure, then it will issue a warning and return
> false.  I am not sure if there is a need to return false in the second
> case, because even without that it will work fine (and there is a
> chance that it won't drop the connection), but maybe your point is
> better to be consistent for all kind of error handling in this path.

There are three possible queries that can be executed by
pgfdw_exec_cleanup_query; let's consider them individually.

1. ABORT TRANSACTION.  If ABORT TRANSACTION fails, I think that we
have no alternative but to close the connection.  If we don't, then
the next local connection that tries to use this connection will
probably also fail, because it will find the connection inside an
aborted transaction rather than not in a transaction.  If we do close
the connection, the next transaction will try to reconnect and
everything will be fine.

2. DEALLOCATE ALL.  If DEALLOCATE ALL fails, we do not have to close
the connection, but the reason why we're running that statement in the
first place is because we've potentially lost track of which
statements are prepared on the remote side.  If we don't drop the
connection, we'll still be confused about that.  Reconnecting will fix
it.

3. ROLLBACK TO SAVEPOINT %d; RELEASE SAVEPOINT %d.  If this fails, the
local toplevel transaction is doomed.  It would be wrong to try to
commit on the remote side, because there could be remote changes made
by the aborted subtransaction which must not survive, and the ROLLBACK
TO SAVEPOINT %d command is essential in order for us to get rid of
them, and that has already failed.  So we must eventually abort the
remote transaction, which we can do either by closing the connection
or by trying to execute ABORT TRANSACTION.  The former, however, is
quick, and the latter is very possibly going to involve a long hang,
so closing the connection seems better.

In all of these cases, the failure of one of these statements seems to
me to *probably* mean that the remote side is just dead, in which case
dropping the connection is the only option.  But even if the remote
side is still alive and the statement failed for some other reason --
say somebody changed kwlist.h so that ABORT now has to be spelled
ABURT -- returning true rather than false just causes us to end up
with a remote connection whose state is out of step with the local
server state, and such a connection is not going to be usable.  I
think part of your point is that in case 3, we might still get back to
a usable connection if ABORT TRANSACTION is successfully executed
later, but that's not very likely and, even if it would have happened,
reconnecting instead doesn't really leave us any worse off.
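
Put together, the toplevel abort path amounts to something like the
sketch below (pgfdw_exec_cleanup_query is the patch's helper; the body
here is only illustrative, and the subtransaction path issues the
ROLLBACK TO SAVEPOINT / RELEASE pair the same way):

#include <libpq-fe.h>
#include <stdbool.h>

/* As in the patch being discussed; declared here just to keep the
 * sketch self-contained. */
extern bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query);

/*
 * Toplevel abort cleanup: any failure leaves the remote state unknown,
 * so the caller drops the connection (and reconnects next time).
 */
static bool
abort_cleanup_sketch(PGconn *conn)
{
    /* Get out of the remote transaction, whatever state it is in. */
    if (!pgfdw_exec_cleanup_query(conn, "ABORT TRANSACTION"))
        return false;               /* case 1: caller closes the connection */

    /* Forget remote prepared statements we may have lost track of (case 2). */
    if (!pgfdw_exec_cleanup_query(conn, "DEALLOCATE ALL"))
        return false;

    return true;                    /* connection stays usable */
}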

> No, I am not saying any of those.  I hope the previous point clarifies
> what I want to say.

Apologies, but I still cannot quite see what you are getting at.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Fri, May 5, 2017 at 9:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, May 5, 2017 at 11:15 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> I am not saying to rip those changes.  Your most of the mail stresses
>> about the 30-second timeout which you have added in the patch to
>> detect timeouts during failures (rollback processing).  I have no
>> objection to that part of the patch except that still there is a race
>> condition around PQsendQuery and you indicate that it is better to
>> deal it as a separate patch to which I agree.
>
> OK.
>
>> The point of which I am
>> not in total agreement is about the failure other than the time out.
>>
>> +pgfdw_exec_cleanup_query(PGconn *conn, const char *query)
>> {
>> ..
>> + /* Get the result of the query. */
>> + if (pgfdw_get_cleanup_result(conn, endtime, &result))
>> + return false;
>> +
>> + /* Issue a warning if not successful. */
>> + if (PQresultStatus(result) != PGRES_COMMAND_OK)
>> + {
>> + pgfdw_report_error(WARNING, result, conn, true, query);
>> + return false;
>> + }
>> ..
>> }
>>
>> In the above code, if there is a failure due to timeout then it will
>> return from first statement (pgfdw_get_cleanup_result), however, if
>> there is any other failure, then it will issue a warning and return
>> false.  I am not sure if there is a need to return false in the second
>> case, because even without that it will work fine (and there is a
>> chance that it won't drop the connection), but maybe your point is
>> better to be consistent for all kind of error handling in this path.
>
> There are three possible queries that can be executed by
> pgfdw_exec_cleanup_query; let's consider them individually.
>
> 1. ABORT TRANSACTION.  If ABORT TRANSACTION fails, I think that we
> have no alternative but to close the connection.  If we don't, then
> the next local connection that tries to use this connection will
> probably also fail, because it will find the connection inside an
> aborted transaction rather than not in a transaction.  If we do close
> the connection, the next transaction will try to reconnect and
> everything will be fine.
>
> 2. DEALLOCATE ALL.  If DEALLOCATE ALL fails, we do not have to close
> the connection, but the reason why we're running that statement in the
> first place is because we've potentially lost track of which
> statements are prepared on the remote side.  If we don't drop the
> connection, we'll still be confused about that.  Reconnecting will fix
> it.
>

There is a comment in the code which explains why currently we ignore
errors from DEALLOCATE ALL.

* DEALLOCATE ALL only exists in 8.3 and later, so this
* constrains how old a server postgres_fdw can
* communicate with.  We intentionally ignore errors in
* the DEALLOCATE, so that we can hobble along to some
* extent with older servers (leaking prepared statements
* as we go; but we don't really support update operations
* pre-8.3 anyway).

It is not entirely clear to me whether that applies only to the commit
case or in general.  From the code, it appears that the comment holds
true for both commit and abort.  If it is for both, then after this
patch the above comment will no longer be valid, and you might want to
update it if we proceed with your proposed change for the DEALLOCATE ALL
statement.

> 3. ROLLBACK TO SAVEPOINT %d; RELEASE SAVEPOINT %d.  If this fails, the
> local toplevel transaction is doomed.  It would be wrong to try to
> commit on the remote side, because there could be remote changes made
> by the aborted subtransaction which must not survive, and the ROLLBACK
> TO SAVEPOINT %d command is essential in order for us to get rid of
> them, and that has already failed.  So we must eventually abort the
> remote transaction, which we can do either by closing the connection
> or by trying to execute ABORT TRANSACTION.  The former, however, is
> quick, and the latter is very possibly going to involve a long hang,
> so closing the connection seems better.
>
> In all of these cases, the failure of one of these statements seems to
> me to *probably* mean that the remote side is just dead, in which case
> dropping the connection is the only option.  But even if the remote
> side is still alive and the statement failed for some other reason --
> say somebody changed kwlist.h so that ABORT now has to be spelled
> ABURT -- returning true rather than false just causes us to end up
> with a remote connection whose state is out of step with the local
> server state, and such a connection is not going to be usable.  I
> think part of your point is that in case 3, we might still get back to
> a usable connection if ABORT TRANSACTION is successfully executed
> later,
>

Yes.

> but that's not very likely and, even if it would have happened,
> reconnecting instead doesn't really leave us any worse off.
>

I see your point.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Sat, May 6, 2017 at 4:41 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, May 5, 2017 at 9:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, May 5, 2017 at 11:15 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> I am not saying to rip those changes.  Your most of the mail stresses
>>> about the 30-second timeout which you have added in the patch to
>>> detect timeouts during failures (rollback processing).  I have no
>>> objection to that part of the patch except that still there is a race
>>> condition around PQsendQuery and you indicate that it is better to
>>> deal it as a separate patch to which I agree.
>>
>> OK.
>>
>>> The point of which I am
>>> not in total agreement is about the failure other than the time out.
>>>
>>> +pgfdw_exec_cleanup_query(PGconn *conn, const char *query)
>>> {
>>> ..
>>> + /* Get the result of the query. */
>>> + if (pgfdw_get_cleanup_result(conn, endtime, &result))
>>> + return false;
>>> +
>>> + /* Issue a warning if not successful. */
>>> + if (PQresultStatus(result) != PGRES_COMMAND_OK)
>>> + {
>>> + pgfdw_report_error(WARNING, result, conn, true, query);
>>> + return false;
>>> + }
>>> ..
>>> }
>>>
>>> In the above code, if there is a failure due to timeout then it will
>>> return from first statement (pgfdw_get_cleanup_result), however, if
>>> there is any other failure, then it will issue a warning and return
>>> false.  I am not sure if there is a need to return false in the second
>>> case, because even without that it will work fine (and there is a
>>> chance that it won't drop the connection), but maybe your point is
>>> better to be consistent for all kind of error handling in this path.
>>
>> There are three possible queries that can be executed by
>> pgfdw_exec_cleanup_query; let's consider them individually.
>>
>> 1. ABORT TRANSACTION.  If ABORT TRANSACTION fails, I think that we
>> have no alternative but to close the connection.  If we don't, then
>> the next local connection that tries to use this connection will
>> probably also fail, because it will find the connection inside an
>> aborted transaction rather than not in a transaction.  If we do close
>> the connection, the next transaction will try to reconnect and
>> everything will be fine.
>>
>> 2. DEALLOCATE ALL.  If DEALLOCATE ALL fails, we do not have to close
>> the connection, but the reason why we're running that statement in the
>> first place is because we've potentially lost track of which
>> statements are prepared on the remote side.  If we don't drop the
>> connection, we'll still be confused about that.  Reconnecting will fix
>> it.
>>
>
> There is a comment in the code which explains why currently we ignore
> errors from DEALLOCATE ALL.
>
> * DEALLOCATE ALL only exists in 8.3 and later, so this
> * constrains how old a server postgres_fdw can
> * communicate with.  We intentionally ignore errors in
> * the DEALLOCATE, so that we can hobble along to some
> * extent with older servers (leaking prepared statements
> * as we go; but we don't really support update operations
> * pre-8.3 anyway).
>
> It is not entirely clear to me whether it is only for Commit case or
> in general.  From the code, it appears that the comment holds true for
> both commit and abort.  If it is for both, then after this patch, the
> above comment will not be valid and you might want to update it in
> case we want to proceed with your proposed changes for DEALLOCATE ALL
> statement.
>

Apart from this, I don't see any other problem with the patch.
Additionally, I have run the regression tests with the patch and
everything passed.

As a side note, why do we want the 30-second timeout only for
rollback-related commands and not for COMMIT and DEALLOCATE ALL as
well?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Fri, May 5, 2017 at 7:11 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> There is a comment in the code which explains why currently we ignore
> errors from DEALLOCATE ALL.
>
> * DEALLOCATE ALL only exists in 8.3 and later, so this
> * constrains how old a server postgres_fdw can
> * communicate with.  We intentionally ignore errors in
> * the DEALLOCATE, so that we can hobble along to some
> * extent with older servers (leaking prepared statements
> * as we go; but we don't really support update operations
> * pre-8.3 anyway).
>
> It is not entirely clear to me whether it is only for Commit case or
> in general.  From the code, it appears that the comment holds true for
> both commit and abort.  If it is for both, then after this patch, the
> above comment will not be valid and you might want to update it in
> case we want to proceed with your proposed changes for DEALLOCATE ALL
> statement.

Oh!  Good catch.  Given that the behavior in question is intentional
there and intended to provide backward compatibility, changing it in
back-branches doesn't seem like a good plan.  I'll adjust the patch so
that it continues to ignore errors in that case.
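
The obvious shape for that adjustment is an extra parameter on the
cleanup helper saying whether a command-level failure should be treated
as fatal; roughly like this (parameter name and body are illustrative
only):

#include <libpq-fe.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Sketch: run a cleanup command, optionally ignoring command-level
 * errors (as the existing code does for DEALLOCATE ALL).  The real
 * helper also enforces the deadline discussed upthread; that part is
 * omitted here.
 */
static bool
exec_cleanup_query_sketch(PGconn *conn, const char *query, bool ignore_errors)
{
    PGresult   *res = PQexec(conn, query);
    bool        ok = (PQresultStatus(res) == PGRES_COMMAND_OK);

    if (!ok)
        fprintf(stderr, "WARNING: %s failed: %s", query, PQerrorMessage(conn));

    PQclear(res);
    return ok || ignore_errors;
}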

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Sat, May 6, 2017 at 12:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Oh!  Good catch.  Given that the behavior in question is intentional
> there and intended to provide backward compatibility, changing it in
> back-branches doesn't seem like a good plan.  I'll adjust the patch so
> that it continues to ignore errors in that case.

Updated patch attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

On Sat, May 6, 2017 at 10:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, May 6, 2017 at 12:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Oh!  Good catch.  Given that the behavior in question is intentional
>> there and intended to provide backward compatibility, changing it in
>> back-branches doesn't seem like a good plan.  I'll adjust the patch so
>> that it continues to ignore errors in that case.
>
> Updated patch attached.
>

Patch looks good to me.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Sun, May 7, 2017 at 12:42 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, May 6, 2017 at 10:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Sat, May 6, 2017 at 12:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> Oh!  Good catch.  Given that the behavior in question is intentional
>>> there and intended to provide backward compatibility, changing it in
>>> back-branches doesn't seem like a good plan.  I'll adjust the patch so
>>> that it continues to ignore errors in that case.
>>
>> Updated patch attached.
>
> Patch looks good to me.

I'm having second thoughts based on some more experimentation I did
this morning.  I'll update again once I've had a bit more time to poke
at it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Sun, May 7, 2017 at 11:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I'm having second thoughts based on some more experimentation I did
> this morning.  I'll update again once I've had a bit more time to poke
> at it.

So the issue that I noticed here is that this problem really isn't
confined to abort processing.  If we ROLLBACK TO SAVEPOINT or ABORT
TRANSACTION on an open connection, we really do not know what the
state of that connection is until we get an acknowledgement that the
command completed successfully.  The patch handles that.  However, the
same is true if we're sending a SAVEPOINT or RELEASE SAVEPOINT
command, and the patch that I posted doesn't do anything about those
cases.  I think it would be best to fix all transaction control
commands in a symmetric manner.

Concretely, I think we should replace the abort_cleanup_incomplete
flag from my previous patch with a changing_xact_state flag and set
that flag around all transaction state changes, clearing it when such
changes have succeeded.  On error, the flag remains set, so we know
that the state of that connection is unknown and that we must close it
(killing outer transaction levels as needed).
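
In rough outline (the struct and helpers below are only a sketch to show
the shape, not the eventual patch):

#include <libpq-fe.h>
#include <stdbool.h>

/* Hypothetical slice of the per-server connection cache entry. */
typedef struct ConnCacheEntrySketch
{
    PGconn     *conn;
    bool        changing_xact_state;    /* mid state change => state unknown */
} ConnCacheEntrySketch;

/* Stand-in for do_sql_command(); failure is modeled as a false return. */
static bool
do_sql_command_sketch(PGconn *conn, const char *sql)
{
    PGresult   *res = PQexec(conn, sql);
    bool        ok = (PQresultStatus(res) == PGRES_COMMAND_OK);

    PQclear(res);
    return ok;
}

/* Every remote transaction-control command is bracketed by the flag. */
static bool
exec_xact_command(ConnCacheEntrySketch *entry, const char *sql)
{
    entry->changing_xact_state = true;
    if (!do_sql_command_sketch(entry->conn, sql))
        return false;                   /* flag stays set: state unknown */
    entry->changing_xact_state = false; /* cleared only on success */
    return true;
}

/* At (sub)transaction end, a still-set flag means: drop the connection. */
static bool
must_drop_connection(const ConnCacheEntrySketch *entry)
{
    return entry->changing_xact_state ||
           PQstatus(entry->conn) != CONNECTION_OK;
}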

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> Concretely, I think we should replace the abort_cleanup_incomplete
> flag from my previous patch with a changing_xact_state flag and set
> that flag around all transaction state changes, clearing it when such
> changes have succeeded.  On error, the flag remains set, so we know
> that the state of that connection is unknown and that we must close it
> (killing outer transaction levels as needed).

> Thoughts?

Sounds plausible.
        regards, tom lane



Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

From
Kyotaro HORIGUCHI
Date:
Hello,

At Tue, 16 May 2017 12:45:39 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in <22556.1494953139@sss.pgh.pa.us>
> Robert Haas <robertmhaas@gmail.com> writes:
> > Concretely, I think we should replace the abort_cleanup_incomplete
> > flag from my previous patch with a changing_xact_state flag and set
> > that flag around all transaction state changes, clearing it when such
> > changes have succeeded.  On error, the flag remains set, so we know
> > that the state of that connection is unknown and that we must close it
> > (killing outer transaction levels as needed).
> 
> > Thoughts?
> 
> Sounds plausible.

I think that the current issue is the usability of the current
connection.  Even if any other command fails, we can continue using the
connection as long as ABORT TRANSACTION succeeds.  Failure of ABORT
immediately means that the connection is no longer available.  If this
reasoning is sound, changing_xact_state might be too much.

By the way, if an FDW connection is stalled in do_sql_command waiting
for a result, a cancel request doesn't work (in other words,
pgfdw_xact_callback is not called), since EINTR is just ignored in
libpq.  Maybe we should teach libpq to error out on EINTR under some
additional conditions.  PQsetEINTRcallback() or something?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw

From
Ashutosh Bapat
Date:
On Wed, May 17, 2017 at 7:24 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello,
>
> At Tue, 16 May 2017 12:45:39 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in <22556.1494953139@sss.pgh.pa.us>
>> Robert Haas <robertmhaas@gmail.com> writes:
>> > Concretely, I think we should replace the abort_cleanup_incomplete
>> > flag from my previous patch with a changing_xact_state flag and set
>> > that flag around all transaction state changes, clearing it when such
>> > changes have succeeded.  On error, the flag remains set, so we know
>> > that the state of that connection is unknown and that we must close it
>> > (killing outer transaction levels as needed).
>>
>> > Thoughts?
>>
>> Sounds plausible.
>
> I think that the current issue is the usability of the current
> connection. Even if any other command should fail, we can
> continue using the connection if ABORT TRANSACTION
> succeeds. Failure of ABORT immediately means that the connection
> is no longer available. If this discuttion is reasonable,
> changing_xact_state might be too-much.

pgfdw_get_result() raises an ERROR, which will end up in
pgfdw_xact_callback with ABORT.  There, if we find the connection in a
bad state, we will slam it down.  The problem seems to be that the waits
for the transaction-related commands are not interruptible, either while
sending the command or while receiving its result.  Robert's suggestion
is to fix the latter for transaction commands, so it looks good to me.
Can you please elaborate on what is too much about changing_xact_state?
>
> By the way if an fdw connection is stalled amid do_sql_command
> waiting a result, a cancel request doesn't work (in other words
> pgsql_xact_callback is not called) since EINTR is just ignored in
> libpq. Maybe we should teach libpq to error out with EINTR with
> some additional reasons. PQsetEINTRcallback() or something?

Right, do_sql_command() is uninterruptible, and so is the PQexec() used
for sending "ABORT TRANSACTION" and "ROLLBACK TO SAVEPOINT".  Robert's
suggestion seems to be that we make them interruptible along with
tracking changing_xact_state, so that we can notice a previously failed
transaction-related command in pgfdw_xact_callback().  So his approach
scores on that front as well.



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



On Tue, May 16, 2017 at 9:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, May 7, 2017 at 11:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I'm having second thoughts based on some more experimentation I did
>> this morning.  I'll update again once I've had a bit more time to poke
>> at it.
>
> So the issue that I noticed here is that this problem really isn't
> confined to abort processing.  If we ROLLBACK TO SAVEPOINT or ABORT
> TRANSACTION on an open connection, we really do not know what the
> state of that connection is until we get an acknowledgement that the
> command completed successfully.  The patch handles that.  However, the
> same is true if we're sending a SAVEPOINT or RELEASE SAVEPOINT
> command, and the patch that I posted doesn't do anything about those
> cases.  I think it would be best to fix all transaction control
> commands in a symmetric manner.
>

+1.  Why not have similar behavior for any other statements executed in
this module by do_sql_command?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Wed, May 17, 2017 at 6:57 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> +1.  Why not similar behavior for any other statements executed in
> this module by do_sql_command?

The other cases are not quite the same situation.  It would be good to
accept interrupts in all cases, but there's no problem with a session
continuing to be used after a failure in configure_remote_session()
because the connection hasn't been entered in the hash table at that
point yet, and the TRY/CATCH block in connect_pg_server() ensures that
the connection also gets closed.  So we don't need to worry about
those statements leaving behind messed-up sessions; they won't; only
the transaction control commands have that part of the problem.
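
For clarity, the pattern being relied on is roughly the following (a
plain-libpq sketch: the backend code uses PG_TRY/PG_CATCH rather than
gotos, and the particular SET is just an example of session setup):

#include <libpq-fe.h>
#include <stddef.h>

/*
 * Connect and configure a session; on any failure, close the connection
 * before anyone else can see it, so no half-configured session leaks.
 */
static PGconn *
connect_and_configure_sketch(const char *conninfo)
{
    PGconn     *conn = PQconnectdb(conninfo);
    PGresult   *res;

    if (PQstatus(conn) != CONNECTION_OK)
        goto fail;

    res = PQexec(conn, "SET search_path = pg_catalog");
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        PQclear(res);
        goto fail;
    }
    PQclear(res);

    return conn;        /* only now would it be entered in the hash table */

fail:
    PQfinish(conn);     /* nothing else holds a reference; just close it */
    return NULL;
}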

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Thu, May 18, 2017 at 7:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, May 17, 2017 at 6:57 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> +1.  Why not similar behavior for any other statements executed in
>> this module by do_sql_command?
>
> The other cases are not quite the same situation.  It would be good to
> accept interrupts in all cases,
>

Yes, that's one of the main reasons I was suggesting making the
behavior the same, but again, that could be done as a separate patch.

> but there's no problem with a session
> continuing to be used after a failure in configure_remote_session()
> because the connection hasn't been entered in the hash table at that
> point yet, and the TRY/CATCH block in connect_pg_server() ensures that
> the connection also gets closed.  So we don't need to worry about
> those statements leaving behind messed-up sessions; they won't; only
> the transaction control commands have that part of the problem.
>

Agreed.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Tue, May 16, 2017 at 9:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, May 7, 2017 at 11:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> I'm having second thoughts based on some more experimentation I did
>> this morning.  I'll update again once I've had a bit more time to poke
>> at it.
>
> So the issue that I noticed here is that this problem really isn't
> confined to abort processing.  If we ROLLBACK TO SAVEPOINT or ABORT
> TRANSACTION on an open connection, we really do not know what the
> state of that connection is until we get an acknowledgement that the
> command completed successfully.  The patch handles that.  However, the
> same is true if we're sending a SAVEPOINT or RELEASE SAVEPOINT
> command, and the patch that I posted doesn't do anything about those
> cases.  I think it would be best to fix all transaction control
> commands in a symmetric manner.
>
> Concretely, I think we should replace the abort_cleanup_incomplete
> flag from my previous patch with a changing_xact_state flag and set
> that flag around all transaction state changes, clearing it when such
> changes have succeeded.  On error, the flag remains set, so we know
> that the state of that connection is unknown and that we must close it
> (killing outer transaction levels as needed).
>
> Thoughts?

I tried following the sketch stated above, and here's what I added to
v2 of the patch.  In begin_remote_xact, when a savepoint is sent to the
remote server through do_sql_command, I set the changing_xact_state
flag, and when that returns successfully the flag is unset.  The same is
done in pgfdw_subxact_callback for RELEASE SAVEPOINT.  Additionally, I
added this flag set/unset around the START TRANSACTION command in
begin_remote_xact.  Enlighten me if I've missed/misunderstood something
here.  I ran the regression tests for postgres_fdw and they passed.

I have a couple of concerns here.  Since only one flag is required for
the purpose, its name, changing_xact_state, doesn't fit well in some
places, particularly in pgfdw_subxact_callback; I am not sure whether to
change the comment or the variable name:

/* Disarm abort_cleanup_incomplete if it all worked. */
+ entry->changing_xact_state = abort_cleanup_failure;

Also, should we perhaps add a test case for this?

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

On Fri, Jun 2, 2017 at 3:48 AM, Rafia Sabih
<rafia.sabih@enterprisedb.com> wrote:
> I tried following the sketch stated above, and here's what I added to
> v2 of the patch. In begin_remote_xact when savepoint is send to the
> remote server through do_sql_command I set the changing_xact_state
> flag and when it successfully returns from there the flag is unset.
> Similar behaviour is followed in pgfdw_subxact_callback for release
> savepoint. Additionally, I added this flag set-unset for start
> transaction command in begin_remote_xact. Enlighten me if I've
> msised/misunderstood something here.

You didn't catch all of the places where do_sql_command() is used to
change the transaction state, and you didn't update the comments to
match the code changes.

> I ran the regress test for
> postgres_fdw and it went on fine.

Well, that's good as far as it goes, but it doesn't prove much, since
this patch mostly changes the behavior when there are errors or
interrupts involved, and that's not going to happen in the regression
tests.

> I have a couple of concerns here, since there is only flag required
> for the purpose, it's  name to be changing_xact_state doesn't suit at
> some places particularly in pgfdw_subxact_callback, not sure should
> change comment or variable name
>
> /* Disarm abort_cleanup_incomplete if it all worked. */
> + entry->changing_xact_state = abort_cleanup_failure;

Seems pretty obvious that would need to be updated, at least insofar
as making the comment match the new structure member name.  I mean,
there's often some question regarding the degree to which existing
comments should be updated, but it hardly makes sense to add a new
comment that isn't consistent with the rest of the patch, right?

> Also, by any chance should we add a test-case for this?

I don't see how to do that.  It could possibly be done with the TAP
framework, but that exceeds my abilities.

Here's an updated patch with a bunch of cosmetic fixes, plus I
adjusted things so that do_sql_command() is more interruptible.  I
tested it manually and it seems to work OK.  I'll commit and
back-patch this version, barring objections or further suggestions for
improvement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

On Sat, Jun 3, 2017 at 1:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jun 2, 2017 at 3:48 AM, Rafia Sabih
> <rafia.sabih@enterprisedb.com> wrote:
>
> I don't see how to do that.  It could possibly be done with the TAP
> framework, but that exceeds my abilities.
>
> Here's an updated patch with a bunch of cosmetic fixes, plus I
> adjusted things so that do_sql_command() is more interruptible.  I
> tested it manually and it seems to work OK.  I'll commit and
> back-patch this version, barring objections or further suggestions for
> improvement.
>

No objections from me, updated patch looks good.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Wed, Jun 7, 2017 at 11:20 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Jun 3, 2017 at 1:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Jun 2, 2017 at 3:48 AM, Rafia Sabih
>> <rafia.sabih@enterprisedb.com> wrote:
>> I don't see how to do that.  It could possibly be done with the TAP
>> framework, but that exceeds my abilities.
>>
>> Here's an updated patch with a bunch of cosmetic fixes, plus I
>> adjusted things so that do_sql_command() is more interruptible.  I
>> tested it manually and it seems to work OK.  I'll commit and
>> back-patch this version, barring objections or further suggestions for
>> improvement.
>
> No objections from me, updated patch looks good.

OK, done.  Further testing nevertheless appreciated, if anyone feels the urge.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company