
awkward cancellation of parallel queries on standby. - Mailing list pgsql-hackers

From	Jeff Janes
Subject	awkward cancellation of parallel queries on standby.
Date	March 26, 2023 15:12:48
Msg-id	CAMkU=1zCi_8E3aqi2iKBa8HOA8MFJ=7GrQ5AS_uiw2NGagS4oA@mail.gmail.com
Responses	Re: awkward cancellation of parallel queries on standby.
List	pgsql-hackers


When a parallel query gets cancelled on a standby due to max_standby_streaming_delay, it happens rather awkwardly. I get two errors stacked up, a query cancellation followed by a connection termination.

I use `pgbench -R 1 -T3600 -P5` on the master to generate a light but steady stream of HOT pruning records. On the standby, outside of a transaction block, I then run `select sum(a.abalance*b.abalance) from pgbench_accounts a join pgbench_accounts b using (bid);` as a long-running parallel query (scale factor of 20).

I also set max_standby_streaming_delay = 0. That isn't necessary, but it saves wear and tear on my patience.

ERROR: canceling statement due to conflict with recovery
DETAIL: User query might have needed to see row versions that must be removed.
FATAL: terminating connection due to conflict with recovery
DETAIL: User query might have needed to see row versions that must be removed.

This happens quite reliably. In psql, these sometimes both show up immediately, and sometimes only the first one shows up immediately and then the second one appears upon the next communication to the backend.

I don't know if this is actually a problem. It isn't for me as I don't do this kind of thing outside of testing, but it seems untidy and I can see it being frustrating from a catch-and-retry perspective and from a log-spam perspective.
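To make the catch-and-retry concern concrete, here is a minimal client-side sketch in Python. It is not tied to any particular driver: the names `ServerMessage`, `parse_messages`, and `plan_retry` are hypothetical, and it assumes the client can see the server messages as text lines in the form shown above. The point is only that a retry loop has to handle both the lone ERROR and the ERROR-plus-FATAL pair, and cannot assume the session is still usable after the ERROR, since the FATAL may only surface on the next round trip.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Substring shared by both messages quoted in this post.
RECOVERY_CONFLICT = "due to conflict with recovery"


@dataclass(frozen=True)
class ServerMessage:
    severity: str  # "ERROR" or "FATAL"
    text: str


def parse_messages(lines: Iterable[str]) -> List[ServerMessage]:
    """Keep only ERROR/FATAL lines; DETAIL and other lines are skipped."""
    messages = []
    for line in lines:
        severity, sep, text = line.partition(":")
        if sep and severity.strip() in ("ERROR", "FATAL"):
            messages.append(ServerMessage(severity.strip(), text.strip()))
    return messages


def plan_retry(messages: List[ServerMessage]) -> str:
    """Decide what a retrying client should do next (hypothetical policy)."""
    conflicts = [m for m in messages if RECOVERY_CONFLICT in m.text]
    if not conflicts:
        return "no_retry"
    if any(m.severity == "FATAL" for m in conflicts):
        # The session is gone; a new connection is needed before retrying.
        return "reconnect_and_retry"
    # Only the cancellation was seen so far. The termination may still
    # arrive on the next communication, so check the connection first.
    return "retry_after_connection_check"
```

With the four lines of output quoted above as input, `plan_retry(parse_messages(...))` yields `"reconnect_and_retry"`; with only the first ERROR and its DETAIL line, it yields `"retry_after_connection_check"`.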

It looks like the backend gets signalled by the startup process, and then it signals the postmaster to signal the parallel workers, and then they ignore it for quite a long time (tens to hundreds of ms). By the time they get around to responding, someone has decided to escalate things. That doesn't seem to be useful, because no one can do anything until the workers respond anyway.

This behavior seems to go back a long way, but the propensity for both messages to show up at the same time vs. in different round-trips changes from version to version.

Is this something we should do something about?

Cheers,

Jeff
