Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw
Date
Msg-id CAA4eK1JDQrauthynyN9HBFtu+qsskN2eQK=s7wrb3hDPC2poPg@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, May 4, 2017 at 8:08 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, May 4, 2017 at 7:13 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> In pgfdw_xact_callback, if the execution of ABORT TRANSACTION fails
>> due to any reason then I think it will close the connection.  The
>> relavant code is:
>> if (PQstatus(entry->conn) != CONNECTION_OK ||
>> PQtransactionStatus(entry->conn) != PQTRANS_IDLE)
>>
>> Basically, if abort transaction fails then transaction status won't be
>> PQTRANS_IDLE.
>
> I don't think that's necessarily true.  Look at this code:
>
>              /* Rollback all remote subtransactions during abort */
>              snprintf(sql, sizeof(sql),
>                       "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
>                       curlevel, curlevel);
>              res = PQexec(entry->conn, sql);
>              if (PQresultStatus(res) != PGRES_COMMAND_OK)
>                  pgfdw_report_error(WARNING, res, entry->conn, true, sql);
>              else
>                  PQclear(res);
>
> If that sql command somehow returns a result status other than
> PGRES_COMMAND_OK, like say due to a local OOM failure or whatever, the
> connection is PQTRANS_IDLE but the rollback wasn't actually done.
>

I have tried to verify above point and it seems to me in such a
situation the transaction status will be either PQTRANS_INTRANS or
PQTRANS_INERROR.  I have tried to mimic "out of memory" failure by
making PQmakeEmptyPGresult return NULL (malloc failure).  AFAIU, it
will make the state as PQTRANS_IDLE only when the statement is
executed successfully.

>> Also, does it matter if pgfdw_subxact_callback fails
>> during execution of "ROLLBACK TO SAVEPOINT", because anyway user needs
>> to execute rollback to savepoint command before proceeding and that
>> should be good enough?
>
> I don't understand this.  If pgfdw_subxact_callback doesn't roll the
> remote side back to the savepoint, how is it going to get done later?
> There's no code in postgres_fdw other than that code to issue the
> rollback.
>

It is done via toplevel transaction ABORT.  I think how it works is
after "ROLLBACK TO SAVEPOINT" failure, any next execution is going to
fail.  Let us say if we issue Commit after failure, it will try to
execute RELEASE SAVEPOINT which will fail and lead to toplevel
transaction ABORT.  Now if the toplevel transaction ABORT also fails,
it will drop the connection.

>> About patch:
>>
>> + /* Rollback all remote subtransactions during abort */
>> + snprintf(sql, sizeof(sql),
>> + "ROLLBACK TO SAVEPOINT s%d; RELEASE SAVEPOINT s%d",
>> + curlevel, curlevel);
>> + if (!pgfdw_exec_cleanup_query(entry->conn, sql))
>> + abort_cleanup_failure = true;
>>
>> If the above execution fails due to any reason, then it won't allow
>> even committing the mail transaction which I am not sure is the right
>> thing to do.
>
> Well, as I said in my reply to Tushar, I think it's the only correct
> thing to do.  Suppose you do the following:
>
> (1) make a change to the foreign table
> (2) enter a subtransaction
> (3) make another change to the foreign table
> (4) roll back the subtransaction
> (5) commit the transaction
>
> If the remote rollback gets skipped in step 4, it is dead wrong for
> step 5 to commit the remote transaction anyway.  Both the change made
> in step (1) and the change made in step (3) would get made on the
> remote side, which is 100% wrong.  Given the failure of the rollback
> in step 4, there is no way to achieve the desired outcome of
> committing the step (1) change and rolling back the step (3) change.
> The only way forward is to roll back everything - i.e. force a
> toplevel abort.
>

Agreed, in such a situation toplevel transaction abort is the only way
out.  However, it seems to me that will anyway happen in current code
even if we don't try to force it via abort_cleanup_failure.  I
understand that it might be better to force it as you have done in the
patch, but not sure if it is good to drop the connection where
previously it could have done without it (for ex. if the first
statement Rollback To Savepoint fails, but ABORT succeeds).  I feel it
is worth considering whether such a behavior difference is okay as you
have proposed to backpatch it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] compiler warning with VS 2017
Next
From: Amit Kapila
Date:
Subject: Re: [HACKERS] statement_timeout is not working as expected with postgres_fdw