Re: Make query cancellation keys longer - Mailing list pgsql-hackers

From Jelte Fennema-Nio
Subject Re: Make query cancellation keys longer
Date
Msg-id CAGECzQQ2osT_XMpByXELiYwxMmd4EV5SPF+n0azCEiVvaRVcjw@mail.gmail.com
Whole thread Raw
In response to Re: Make query cancellation keys longer  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Make query cancellation keys longer
List pgsql-hackers
On Thu, 4 Jul 2024 at 12:32, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> We currently don't do any locking on the ProcSignal array. For query
> cancellations, that's good because a query cancel packet is processed
> without having a PGPROC entry, so we cannot take LWLocks. We could use
> spinlocks though. In this patch, I used memory barriers to ensure that
> we load/store the pid and the cancellation key in a sensible order, so
> that you cannot e.g. send a cancellation signal to a backend that's just
> starting up and hasn't advertised its cancellation key in ProcSignal
> yet. But I think this might be simpler and less error-prone by just
> adding a spinlock to each ProcSignal slot. That would also fix the
> existing race condition where we might set the pss_signalFlags flag for
> a slot, when the process concurrently terminates and the slot is reused
> for a different process. Because of that, we currently have this caveat:
> "... all the signals are such that no harm is done if they're mistakenly
> fired". With a spinlock, we could eliminate that race.

I think a spinlock would make this thing a whole concurrency stuff a
lot easier to reason about.

+   slot->pss_cancel_key_valid = false;
+   slot->pss_cancel_key = 0;

If no spinlock is added, I think these accesses should still be made
atomic writes. Otherwise data-races on those fields are still
possible, resulting in undefined behaviour. The memory barriers you
added don't prevent that afaict. With atomic operations there are
still race conditions, but no data-races.

Actually it seems like that same argument applies to the already
existing reading/writing of pss_pid: it's written/read using
non-atomic operations so data-races can occur and thus undefined
behaviour too.

-    volatile pid_t pss_pid;
+    pid_t        pss_pid;

Why remove the volatile modifier?



pgsql-hackers by date:

Previous
From: Dagfinn Ilmari Mannsåker
Date:
Subject: Re: iso-8859-1 annotation '-cim' in source code
Next
From: Aleksander Alekseev
Date:
Subject: Re: [PATCH] Add min/max aggregate functions to BYTEA