Re: Don't use the deprecated and insecure PQcancel in our frontend tools anymore - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Don't use the deprecated and insecure PQcancel in our frontend tools anymore
Date
Msg-id 9d7ba3ac-d660-483e-8f68-9096a2464e90@iki.fi
In response to Re: Don't use the deprecated and insecure PQcancel in our frontend tools anymore  ("Jelte Fennema-Nio" <postgres@jeltef.nl>)
List pgsql-hackers
On 06/03/2026 04:12, Jelte Fennema-Nio wrote:
> On Thu Mar 5, 2026 at 7:30 PM CET, Heikki Linnakangas wrote:
>> It took me a while to get the big picture of how this works. cancel.c 
>> could use some high-level comments explaining how to use the facility; 
>> it's a real mixed bag right now.
> 
> Attached is a version with a bunch more comments. I agree this cancel
> logic is hard to understand without them. It took me quite a while to
> understand it myself. (I don't think the code got any harder to
> understand with these changes though, the exact same complexity was
> already there for Windows. But I agree more comments are good.)

Thanks. I agree it was complicated before these patches.

>> This is racy, if the cancellation thread doesn't immediately process 
>> the wakeup. For example, because it's still busy processing a previous 
>> wakeup, or because there's a network hiccup or something. By the time the 
>> cancellation thread runs, the main thread might already be running a 
>> different query than it was when the user hit CTRL-C.
> 
> I now noted this in one of the new comments. I don't think there's a way
> around this race condition entirely. It's simply a limitation of our
> cancel protocol (because it's impossible to specify which query on a
> connection should be cancelled).

That's true, but I still wonder if this could make it much worse.

> In theory we could reduce the window for the race, by having all
> frontend tools use async connections and have the main thread wait for
> either the self-pipe or a cancel. That way it would be more similar to
> the previous signal code in behaviour. That's a much bigger lift though,
> i.e. all PQexec and PQgetResult calls would need to be modified. My
> proposed change doesn't require changing the callsites at all.

Yeah, it does have that advantage.

One simple thing we could do is to remember the "generation" in the signal 
handler, and store it in another global variable ("cancelledGeneration" 
or such). In the cancel thread, check that the generation matches; 
otherwise the thread is about to send a cancellation to a query that 
already finished, and should not send it.
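
To make that concrete, here is a minimal sketch of the generation check, 
assuming C11 atomics are available. Apart from "cancelledGeneration", all 
names (queryGeneration, begin_query, handle_sigint, cancel_still_applies) 
are made up for illustration and are not taken from the patch:

```c
#include <assert.h>     /* static_assert */
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names for illustration; none of this is from the patch. */

/* Lock-free atomics are safe to touch from a signal handler. */
static_assert(ATOMIC_INT_LOCK_FREE == 2, "atomic_int must be lock-free");

/* Bumped by the main thread each time it starts a new query. */
static atomic_int queryGeneration = 0;

/* Generation that was current when the user hit CTRL-C; -1 means none yet. */
static atomic_int cancelledGeneration = -1;

/* Main thread: call right before sending a new query. */
static void
begin_query(void)
{
    atomic_fetch_add(&queryGeneration, 1);
}

/* Signal handler: remember which query the user wanted to cancel. */
static void
handle_sigint(int signo)
{
    (void) signo;
    atomic_store(&cancelledGeneration, atomic_load(&queryGeneration));
    /* ... then wake up the cancel thread, e.g. via the self-pipe ... */
}

/* Cancel thread: check this before sending the cancel request. */
static bool
cancel_still_applies(void)
{
    return atomic_load(&cancelledGeneration) == atomic_load(&queryGeneration);
}
```

This only narrows the window: the main thread can still move on to the 
next query between the check and the cancel request reaching the server.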

I worry about how this behaves if establishing the cancel connection gets 
stuck for a long time, for example because of a network hiccup. That's 
also not a new problem though; it's perhaps even worse today, when the 
signal handler itself can get stuck for a long time trying to establish 
the connection. Still, it would be good to do some testing with a bad network.

- Heikki



