Hi Tom,
> On 16 Apr 2019, at 00:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Joe Wildish <joe-postgresql.org@elusive.cx> writes:
>> We are seeing an inexplicable behaviour when issuing an "UPDATE..RETURNING" statement. I am unsure if it is a
Postgresbug. Additional eyes-on would be much appreicated.
>
>> When issuing the following statement we are seeing multiple rows UPDATE'd despite the use of LIMIT 1 and despite the
"uid"column in the "some_queue" table having a PRIMARY KEY constraint on it:
>
>> UPDATE queue.some_queue AS q
>> SET (state, awaiting) = ('executing', FALSE)
>> FROM (SELECT uid
>> FROM queue.some_queue
>> WHERE awaiting
>> AND process_after <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC'
>> ORDER BY process_after ASC
>> FOR UPDATE SKIP LOCKED
>> LIMIT 1)
>> AS dq(uid)
>> WHERE q.uid = dq.uid
>> RETURNING q.uid;
>
> Yeah, there was another similar complaint a few weeks ago --- has this
> suddenly gotten to be a popular coding idea?
I can't comment on that specifically. However, in my case, the reason this came up was simply that the original code
wasissuing a SELECT to grab a UID and then immediately issuing an UPDATE against the same UID. This isn't necessarily a
problem,of course, given the semantics of FOR UPDATE SKIP LOCKED LIMIT 1. It just seemed neater to have a single
statementhandle the dequeue operation. An added bonus was that tracking statements via pg_stat_activity for
queue-relatedoperations became easier to comprehend, as the dequeue was no longer split between two statements.
> The basic problem with what you have here is that FOR UPDATE (especially
> with SKIP LOCKED) makes the sub-select's output unstable by definition.
> If it's executed more than once then you might get different rows back,
> allowing the outer UPDATE's join to potentially match multiple rows from
> the outer instance of queue.some_queue. Typically, since it's LIMIT 1,
> I'd think that the planner would put dq on the outside of a nestloop plan
> and you'd escape seeing any problem --- but if it gets put on the inside
> of a nestloop, it's going to misbehave.
> ... 8< ...
Unfortunately I can't remember exactly what the plans were doing exactly --- we did take a look at EXPLAIN to see if we
couldfigure out an explanation for the behaviour, but I foolishly didn't keep a copy of the outputs.
> The way I'd recommend fixing it is to put the FOR UPDATE into a WITH
> to guarantee single execution:
>
> WITH dq(uid) AS (SELECT uid ... LIMIT 1)
> UPDATE queue.some_queue q SET ...
> FROM dq
> WHERE q.uid = dq.uid
> RETURNING q.uid;
Thanks. We are actually now running a version of the code now that does "UPDATE .. WHERE q.uid = (SELECT .. LIMIT 1)
RETURNINGq.uid" and the multiple dequeues have gone away. The subselect therefore appears to only being executed once
inthis scenario too.
Ironically my original version of this code used the WITH construct. I switched to using the subselect instead, purely
tomake it easier to write a query over pg_stat_activity that classified each statement on the basis of if they were
INSERTing,UPDATEing or DELETEing from the queues :-)
Thanks for you help.
Cheers,
-Joe