Thread: Cancel race condition

Cancel race condition

From
Shay Rojansky
Date:
Hi everyone.

I'm working on Npgsql and have run into a race condition when cancelling. The issue is described in the following 10-year-old thread, and I'd like to make sure things are still the same: http://www.postgresql.org/message-id/27126.1126649920@sss.pgh.pa.us

My questions/comments:
  • Does PostgreSQL *guarantee* that once the connection used to send the cancellation request is closed by the server, the cancellation has been delivered (as mentioned by Tom)? In other words, should I be designing a .NET driver around this behavior?
  • If so, may I suggest updating the protocol docs to reflect this (http://www.postgresql.org/docs/current/static/protocol-flow.html#AEN103033)?
  • I'm not sure if there's some sort of feature/request list for protocol 4, but it may make sense to provide a simpler solution for this problem. One example would be for the client to send some sort of numeric ID identifying each command (some autoincrement), and to include that ID when cancelling. I'm not sure the benefits are worth the extra payload, but it may be useful for other functionality as well (tracking/logging)? Just a thought.
Thanks,

Shay
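
The per-command ID idea from the last bullet above could be sketched roughly as follows. This is only a toy model under stated assumptions: the class SimulatedBackend and its methods are hypothetical names invented for illustration, and nothing like this exists in the current PostgreSQL protocol. It only shows how a cancel request that names a command ID could be ignored once that command has already finished.

```python
class SimulatedBackend:
    """Hypothetical model of a backend that tracks a client-supplied command ID."""

    def __init__(self):
        self.current_id = None   # ID of the command currently running, if any
        self.cancelled = set()   # IDs of commands that were actually cancelled

    def start_command(self, command_id):
        self.current_id = command_id

    def finish_command(self):
        self.current_id = None

    def cancel(self, command_id):
        # Act only if the targeted command is still the one running;
        # a late cancel for an earlier command is dropped.
        if self.current_id == command_id:
            self.cancelled.add(command_id)
            return True
        return False


backend = SimulatedBackend()
backend.start_command(1)
backend.finish_command()            # command 1 completes before the cancel arrives
backend.start_command(2)            # the client has already moved on
late_cancel = backend.cancel(1)     # stale cancel aimed at command 1
targeted_cancel = backend.cancel(2) # cancel aimed at the running command
```

In this model the stale cancel cannot hit command 2, which is the property the proposal is after.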

Re: Cancel race condition

From
Tom Lane
Date:
Shay Rojansky <roji@roji.org> writes:
> My questions/comments:
>    - Does PostgreSQL *guarantee* that once the connection used to send the
>    cancellation request is closed by the server, the cancellation has been
>    delivered (as mentioned by Tom)? In other words, should I be designing a
>    .NET driver around this behavior?

The signal's been *sent*.  Whether it's been *delivered* is something
you'd have to ask your local kernel hacker.  The POSIX standard appears
to specifically disclaim any such guarantee; in fact, it doesn't even
entirely promise that a self-signal is synchronous.  There are also
issues like what if the target process currently has signals blocked;
does "delivery" mean that the signal handler's been entered, or something
weaker?

In any case, Postgres has always considered that query cancel is a "best
effort" thing, so even if I thought this was 100% portably reliable,
I would not be in favor of promising anything in the docs.
        regards, tom lane



Re: Cancel race condition

From
Shay Rojansky
Date:
Ah, OK - I wasn't aware that cancellation was actually delivered as a regular POSIX signal... You're right about the lack of guarantees then.

In that case, I'm guessing not much could be done to guarantee sane cancellation behavior... I do understand the "best effort" idea around cancellations. However, it seems different to say "we'll try our best and the cancellation may not be delivered" (no bad consequences except maybe performance), and to say "we'll try our best but the cancellation may be delivered randomly to any query you send from the moment you send the cancellation". The second makes it very difficult to design a 100% sane, deterministic application... Any plans to address this in protocol 4?

Would you have any further recommendations or guidelines to make the situation as sane as possible? I guess I could block any new SQL queries while a cancellation on that connection is still outstanding (meaning that the cancellation connection hasn't yet been closed). As you mentioned this wouldn't be a 100% solution since it would only cover signal sending, but better than nothing?
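
A minimal sketch of the "block new queries while a cancellation is outstanding" idea, assuming a driver that can tell when the server has closed the cancel connection. CancelGate and its method names are hypothetical, not Npgsql or libpq APIs. As noted in the thread, this only covers the signal having been sent, not delivered.

```python
import threading


class CancelGate:
    """Hypothetical helper that holds back new queries while a cancel is in flight."""

    def __init__(self):
        self._cond = threading.Condition()
        self._outstanding = 0  # cancel connections not yet closed by the server

    def cancel_started(self):
        # Call just before opening the separate cancel connection.
        with self._cond:
            self._outstanding += 1

    def cancel_finished(self):
        # Call once the server has closed the cancel connection.
        with self._cond:
            self._outstanding -= 1
            self._cond.notify_all()

    def wait_until_clear(self, timeout=None):
        # Call before sending a new query; returns False if the wait timed out.
        with self._cond:
            return self._cond.wait_for(lambda: self._outstanding == 0, timeout)


gate = CancelGate()
gate.cancel_started()
clear_during_cancel = gate.wait_until_clear(timeout=0.05)  # cancel still outstanding
gate.cancel_finished()
clear_after_cancel = gate.wait_until_clear(timeout=0.05)   # gate is open again
```

A driver would call wait_until_clear() on the query path, so a query issued right after a cancel is not sent until the cancel connection has been closed.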



Re: Cancel race condition

From
Robert Haas
Date:
On Tue, Jun 9, 2015 at 4:42 AM, Shay Rojansky <roji@roji.org> wrote:
> Ah, OK - I wasn't aware that cancellation was actually delivered as a
> regular POSIX signal... You're right about the lack of guarantees then.
>
> In that case, I'm guessing not much could be done to guarantee sane
> cancellation behavior... I do understand the "best effort" idea around
> cancellations. However, it seems different to say "we'll try our best and
> the cancellation may not be delivered" (no bad consequences except maybe
> performance), and to say "we'll try our best but the cancellation may be
> delivered randomly to any query you send from the moment you send the
> cancellation". The second makes it very difficult to design a 100% sane,
> deterministic application... Any plans to address this in protocol 4?
>
> Would you have any further recommendations or guidelines to make the
> situation as sane as possible? I guess I could block any new SQL queries
> while a cancellation on that connection is still outstanding (meaning that
> the cancellation connection hasn't yet been closed). As you mentioned this
> wouldn't be a 100% solution since it would only cover signal sending, but
> better than nothing?

Blocking new queries seems like a good idea.  Note that the entire
transaction (whether single-statement or multi-statement) will be
aborted, or at least the currently-active subtransaction, not just the
current query.  If you're using single-statement transactions I guess
there is not much practical difference, but if you are using
multi-statement transactions the application kind of needs to be aware
of this, since it needs to know that any work it did got rolled back,
and everything's going to fail up until the current (sub)transaction
is rolled back.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
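
The point about multi-statement transactions can be illustrated with a toy model of a session's transaction status. This is a sketch only: SessionModel is a hypothetical class, not a driver API, and it ignores subtransactions and savepoints. It assumes the usual PostgreSQL behavior that, after an error inside a transaction block, later commands fail until the transaction is rolled back.

```python
class SessionModel:
    """Toy model of a session's transaction status; not a real driver API."""

    def __init__(self):
        self.status = "idle"  # "idle", "in_transaction", or "failed"

    def execute(self, sql, cancelled=False):
        command = sql.strip().upper()
        if command == "BEGIN":
            self.status = "in_transaction"
            return "ok"
        if command == "ROLLBACK":
            self.status = "idle"
            return "ok"
        if self.status == "failed":
            # Commands are rejected until the transaction is rolled back.
            return "error: transaction aborted"
        if cancelled:
            # A cancel that hits this statement aborts the enclosing transaction.
            if self.status == "in_transaction":
                self.status = "failed"
            return "error: canceled"
        return "ok"


session = SessionModel()
results = [
    session.execute("BEGIN"),
    session.execute("INSERT 1"),                     # work that will be lost
    session.execute("SELECT slow", cancelled=True),  # the cancel lands here
    session.execute("SELECT 1"),                     # fails: transaction is aborted
    session.execute("ROLLBACK"),
    session.execute("SELECT 1"),                     # works again
]
```

The application has to notice the failure, roll back, and redo any earlier work from the aborted transaction.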



Re: Cancel race condition

From
Shay Rojansky
Date:
Thanks for the extra consideration, Robert.

Since I'm implementing a generic driver, users can send either single-statement transactions or actual multi-statement transactions. However, whether we're in a transaction or not doesn't seem to affect Npgsql logic (unless I'm missing something): if the cancellation does hit a query, the transaction will be aborted, and it's up to the user to roll it back, as is required in PostgreSQL...

