Győző Papp <gyozo.papp@meltwater.com> writes:
> A periodic task moves the jobs that have been assigned from
> `available_jobs` to `assigned_jobs`:
> WITH jobs_to_assign AS (
>     DELETE FROM available_jobs
>     USING v_assignable_jobs
>     WHERE available_jobs.id = v_assignable_jobs.id
>     RETURNING available_jobs.*
> )
> INSERT INTO assigned_jobs(id, url)
> SELECT id, url
> FROM jobs_to_assign
> RETURNING *;
> And a similar one “unassigns” failed jobs, that is, pushes rows back
> from `assigned_jobs` to `available_jobs`. It sometimes fails with:
> ERROR: duplicate key value violates unique constraint "jobs_pkey"
> DETAIL: Key (id)=(1fd0626c-f953-3278-82a1-8e4320d28914) already
> exists.
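
(Guessing, since the unassign statement wasn't shown: presumably it's
the mirror image of the one above, something like the sketch below,
where the failed_at test is purely hypothetical.)

WITH jobs_to_unassign AS (
    DELETE FROM assigned_jobs
    WHERE failed_at IS NOT NULL   -- hypothetical failure test; not shown in the post
    RETURNING assigned_jobs.*
)
INSERT INTO available_jobs(id, url)
SELECT id, url
FROM jobs_to_unassign
RETURNING *;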
What is v_assignable_jobs?
If, as I suspect, it's a view involving these same tables, then
likely your issue is that you aren't reading that view with suitable
locking, so that sometimes it will return stale rows that describe
no-longer-assignable jobs.
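
For concreteness, suppose the view is something like this (only a
guess; the real definition wasn't shown):

CREATE VIEW v_assignable_jobs AS
SELECT a.id
FROM available_jobs a
WHERE NOT EXISTS (SELECT 1 FROM assigned_jobs j WHERE j.id = a.id);

A snapshot of such a view, read without locking, can describe a job
that by the time the INSERT runs already has a row in assigned_jobs
again, and the INSERT then violates the primary key.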
DELETE doesn't have an option to apply FOR UPDATE to USING tables,
AFAIR, but maybe you could fix it along these lines:
WITH jobs_to_assign AS (
    SELECT id FROM v_assignable_jobs
    FOR UPDATE
), deleted_jobs AS (
    DELETE FROM available_jobs
    USING jobs_to_assign
    WHERE available_jobs.id = jobs_to_assign.id
    RETURNING available_jobs.*
)
INSERT INTO assigned_jobs(id, url)   -- about as before, but use deleted_jobs
SELECT id, url
FROM deleted_jobs
RETURNING *;
> * Each transaction is in READ_COMMITTED.
Another line of thought is to use SERIALIZABLE mode and just retry
the inevitable serialization failures. However, if there are a lot
of concurrent processes doing this, you'd probably get so many
serialization failures that it'd be quite inefficient.
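
A minimal sketch of that approach, reusing the statement from the
original message; the retry loop itself has to live in the
application, keyed on SQLSTATE 40001 (serialization_failure):

BEGIN ISOLATION LEVEL SERIALIZABLE;

WITH jobs_to_assign AS (
    DELETE FROM available_jobs
    USING v_assignable_jobs
    WHERE available_jobs.id = v_assignable_jobs.id
    RETURNING available_jobs.*
)
INSERT INTO assigned_jobs(id, url)
SELECT id, url
FROM jobs_to_assign
RETURNING *;

COMMIT;

-- Any statement, including the COMMIT, can fail with SQLSTATE 40001;
-- the application must then retry the whole transaction from BEGIN.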
regards, tom lane