Thread: Inexplicable UPDATE...RETURNING behaviour
Hello all, We are seeing an inexplicable behaviour when issuing an "UPDATE..RETURNING" statement. I am unsure if it is a Postgres bug.Additional eyes-on would be much appreicated. When issuing the following statement we are seeing multiple rows UPDATE'd despite the use of LIMIT 1 and despite the "uid"column in the "some_queue" table having a PRIMARY KEY constraint on it: UPDATE queue.some_queue AS q SET (state, awaiting) = ('executing', FALSE) FROM (SELECT uid FROM queue.some_queue WHERE awaiting AND process_after <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC' ORDER BY process_after ASC FOR UPDATE SKIP LOCKED LIMIT 1) AS dq(uid) WHERE q.uid = dq.uid RETURNING q.uid; However, when using the following statement, which (AFAIK) is semantically equivalent, we see only a single row being updated/dequeued: UPDATE queue.some_queue AS q SET (state, awaiting) = ('executing', FALSE) WHERE uid = (SELECT uid FROM queue.some_queue WHERE awaiting AND process_after <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC' ORDER BY process_after ASC FOR UPDATE SKIP LOCKED LIMIT 1) RETURNING uid; IMO the two statements should yield the same result. But, we see the first one updating multiple rows and therefore dequeingmultiple uids, yet the second one functions as intended (ie. single item is dequeued). We can replicate this locally in tests but I can't explain it. Is this a bug, or am I overlooking something? Cheers, -Joe PG. Postgres 10.6 in production, and the same behaviour with 10.5 + 11.2 in dev.
Joe Wildish <joe-postgresql.org@elusive.cx> writes: > We are seeing an inexplicable behaviour when issuing an "UPDATE..RETURNING" statement. I am unsure if it is a Postgresbug. Additional eyes-on would be much appreicated. > When issuing the following statement we are seeing multiple rows UPDATE'd despite the use of LIMIT 1 and despite the "uid"column in the "some_queue" table having a PRIMARY KEY constraint on it: > UPDATE queue.some_queue AS q > SET (state, awaiting) = ('executing', FALSE) > FROM (SELECT uid > FROM queue.some_queue > WHERE awaiting > AND process_after <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC' > ORDER BY process_after ASC > FOR UPDATE SKIP LOCKED > LIMIT 1) > AS dq(uid) > WHERE q.uid = dq.uid > RETURNING q.uid; Yeah, there was another similar complaint a few weeks ago --- has this suddenly gotten to be a popular coding idea? The basic problem with what you have here is that FOR UPDATE (especially with SKIP LOCKED) makes the sub-select's output unstable by definition. If it's executed more than once then you might get different rows back, allowing the outer UPDATE's join to potentially match multiple rows from the outer instance of queue.some_queue. Typically, since it's LIMIT 1, I'd think that the planner would put dq on the outside of a nestloop plan and you'd escape seeing any problem --- but if it gets put on the inside of a nestloop, it's going to misbehave. There are (at least) two different ways that the sub-select's output might change when re-executed, even though it's still using the same snapshot as before: 1. If some other transaction releases a row lock in between, SKIP LOCKED might not skip that row any more. 2. The row returned the first time will absolutely not get chosen the second time, given this particular query formulation, because its latest updated version will have awaiting = false thanks to the action of the outer UPDATE, so it'll fail the inner WHERE test. The way I'd recommend fixing it is to put the FOR UPDATE into a WITH to guarantee single execution: WITH dq(uid) AS (SELECT uid ... LIMIT 1) UPDATE queue.some_queue q SET ... FROM dq WHERE q.uid = dq.uid RETURNING q.uid; regards, tom lane
Hi Tom, > On 16 Apr 2019, at 00:58, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Joe Wildish <joe-postgresql.org@elusive.cx> writes: >> We are seeing an inexplicable behaviour when issuing an "UPDATE..RETURNING" statement. I am unsure if it is a Postgresbug. Additional eyes-on would be much appreicated. > >> When issuing the following statement we are seeing multiple rows UPDATE'd despite the use of LIMIT 1 and despite the "uid"column in the "some_queue" table having a PRIMARY KEY constraint on it: > >> UPDATE queue.some_queue AS q >> SET (state, awaiting) = ('executing', FALSE) >> FROM (SELECT uid >> FROM queue.some_queue >> WHERE awaiting >> AND process_after <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC' >> ORDER BY process_after ASC >> FOR UPDATE SKIP LOCKED >> LIMIT 1) >> AS dq(uid) >> WHERE q.uid = dq.uid >> RETURNING q.uid; > > Yeah, there was another similar complaint a few weeks ago --- has this > suddenly gotten to be a popular coding idea? I can't comment on that specifically. However, in my case, the reason this came up was simply that the original code wasissuing a SELECT to grab a UID and then immediately issuing an UPDATE against the same UID. This isn't necessarily a problem,of course, given the semantics of FOR UPDATE SKIP LOCKED LIMIT 1. It just seemed neater to have a single statementhandle the dequeue operation. An added bonus was that tracking statements via pg_stat_activity for queue-relatedoperations became easier to comprehend, as the dequeue was no longer split between two statements. > The basic problem with what you have here is that FOR UPDATE (especially > with SKIP LOCKED) makes the sub-select's output unstable by definition. > If it's executed more than once then you might get different rows back, > allowing the outer UPDATE's join to potentially match multiple rows from > the outer instance of queue.some_queue. Typically, since it's LIMIT 1, > I'd think that the planner would put dq on the outside of a nestloop plan > and you'd escape seeing any problem --- but if it gets put on the inside > of a nestloop, it's going to misbehave. > ... 8< ... Unfortunately I can't remember exactly what the plans were doing exactly --- we did take a look at EXPLAIN to see if we couldfigure out an explanation for the behaviour, but I foolishly didn't keep a copy of the outputs. > The way I'd recommend fixing it is to put the FOR UPDATE into a WITH > to guarantee single execution: > > WITH dq(uid) AS (SELECT uid ... LIMIT 1) > UPDATE queue.some_queue q SET ... > FROM dq > WHERE q.uid = dq.uid > RETURNING q.uid; Thanks. We are actually now running a version of the code now that does "UPDATE .. WHERE q.uid = (SELECT .. LIMIT 1) RETURNINGq.uid" and the multiple dequeues have gone away. The subselect therefore appears to only being executed once inthis scenario too. Ironically my original version of this code used the WITH construct. I switched to using the subselect instead, purely tomake it easier to write a query over pg_stat_activity that classified each statement on the basis of if they were INSERTing,UPDATEing or DELETEing from the queues :-) Thanks for you help. Cheers, -Joe