Thread: Need LIMIT and ORDER BY for UPDATE

Need LIMIT and ORDER BY for UPDATE

From
"D. Dante Lorenso"
Date:
All,

I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE
commands.  Is this possible?

   UPDATE invoice i
   SET reserve_ts = NOW() + '1 hour'::timestamp
   FROM account a
   WHERE a.acct_id = i.acct_id
   AND i.reserve_ts < NOW()
   AND a.status = 'A'
   AND i.is_paid IS FALSE
   ORDER BY i.create_ts ASC
   LIMIT 1
   RETURNING invoice_id;

This query would find JUST ONE invoice record which is not paid and
reserve the right to operate on the row using the 'reserve_ts' column
for all active accounts.  The one row would be the oldest invoice
matching the criteria.  Only that one row would be updated and the
invoice_id of the updated row (if any) would be returned.

Running a query like this over and over would pop just one record off
the queue and would guarantee an atomic reservation.

Similar syntax would be very useful for DELETE operations.  The idea is
that doing an UPDATE with RETURNING allows a get/set operation in a
single query.  Without the LIMIT and ORDER BY, I'm forced to reserve all
rows at once which my application doesn't want to handle like that.

Can something like what I want be added in a future version?  Ideas or
alternatives?  I don't see how I can rewrite this query as a single
statement any other way and get the same expectations.

-- Dante

Re: Need LIMIT and ORDER BY for UPDATE

From
Bill Moran
Date:
"D. Dante Lorenso" <dante@lorenso.com> wrote:
>
> All,
>
> I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE
> commands.  Is this possible?
>
>    UPDATE invoice i
>    SET reserve_ts = NOW() + '1 hour'::timestamp
>    FROM account a
>    WHERE a.acct_id = i.acct_id
>    AND i.reserve_ts < NOW()
>    AND a.status = 'A'
>    AND i.is_paid IS FALSE
>    ORDER BY i.create_ts ASC
>    LIMIT 1
>    RETURNING invoice_id;
>
> This query would find JUST ONE invoice record which is not paid and
> reserve the right to operate on the row using the 'reserve_ts' column
> for all active accounts.  The one row would be the oldest invoice
> matching the criteria.  Only that one row would be updated and the
> invoice_id of the updated row (if any) would be returned.
>
> Running a query like this over and over would pop just one record off
> the queue and would guarantee an atomic reservation.

While I'm not going to argue as to whether your suggestion would be
a good idea or not, I will suggest you look at SELECT FOR UPDATE, which
will allow you to do what you desire.

--
Bill Moran
http://www.potentialtech.com

Re: Need LIMIT and ORDER BY for UPDATE

From
"John D. Burger"
Date:
D. Dante Lorenso wrote:

> I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE
> commands.  Is this possible?
>
>   UPDATE invoice i
>   SET reserve_ts = NOW() + '1 hour'::timestamp
>   FROM account a
>   WHERE a.acct_id = i.acct_id
>   AND i.reserve_ts < NOW()
>   AND a.status = 'A'
>   AND i.is_paid IS FALSE
>   ORDER BY i.create_ts ASC
>   LIMIT 1
>   RETURNING invoice_id;
>
> This query would find JUST ONE invoice record which is not paid and
> reserve the right to operate on the row using the 'reserve_ts'
> column for all active accounts.  The one row would be the oldest
> invoice matching the criteria.  Only that one row would be updated
> and the invoice_id of the updated row (if any) would be returned.

> Can something like what I want be added in a future version?  Ideas
> or alternatives?  I don't see how I can rewrite this query as a
> single statement any other way and get the same expectations.

Doesn't this do it, assuming invoice_id is unique?

   UPDATE invoice
   SET reserve_ts = NOW() + '1 hour'::timestamp
   where invoice_id =
    (select invoice_id from invoice i,
        account a
    WHERE a.acct_id = i.acct_id
    AND i.reserve_ts < NOW()
    AND a.status = 'A'
    AND i.is_paid IS FALSE
    ORDER BY i.create_ts ASC
    LIMIT 1)
   RETURNING invoice_id;

- John Burger
   MITRE

Re: Need LIMIT and ORDER BY for UPDATE

From
"D. Dante Lorenso"
Date:
John D. Burger wrote:
> D. Dante Lorenso wrote:
>
>> I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE
>> commands.  Is this possible?
>>
>>   UPDATE invoice i
>>   SET reserve_ts = NOW() + '1 hour'::timestamp
>>   FROM account a
>>   WHERE a.acct_id = i.acct_id
>>   AND i.reserve_ts < NOW()
>>   AND a.status = 'A'
>>   AND i.is_paid IS FALSE
>>   ORDER BY i.create_ts ASC
>>   LIMIT 1
>>   RETURNING invoice_id;
>>
>> This query would find JUST ONE invoice record which is not paid and
>> reserve the right to operate on the row using the 'reserve_ts' column
>> for all active accounts.  The one row would be the oldest invoice
>> matching the criteria.  Only that one row would be updated and the
>> invoice_id of the updated row (if any) would be returned.
>
>> Can something like what I want be added in a future version?  Ideas or
>> alternatives?  I don't see how I can rewrite this query as a single
>> statement any other way and get the same expectations.
>
> Doesn't this do it, assuming invoice_id is unique?
>
>   UPDATE invoice
>   SET reserve_ts = NOW() + '1 hour'::timestamp
>   where invoice_id =
>     (select invoice_id from invoice i,
>         account a
>     WHERE a.acct_id = i.acct_id
>     AND i.reserve_ts < NOW()
>     AND a.status = 'A'
>     AND i.is_paid IS FALSE
>     ORDER BY i.create_ts ASC
>     LIMIT 1)
>   RETURNING invoice_id;

Doesn't this create race condition in the query where multiple processes
might find the same invoice_id while executing the inner select.  The
update would then update the same record more than once during the
update step and 2 processes might get the same invoice_id returned.  In
otherwords, moving the select criteria into a sub-query breaks the
atomic nature of the update.  Right?

I have been trying to doing something like this, though:

   UPDATE invoice
   SET reserve_ts = NOW() + '1 hour'::timestamp
   WHERE reserve_ts < NOW()
   AND invoice_id = (
     SELECT invoice_id
     FROM invoice i, account a
     WHERE a.acct_id = i.acct_id
     AND i.reserve_ts < NOW()
     AND a.status = 'A'
     AND i.is_paid IS FALSE
     ORDER BY i.create_ts ASC
     LIMIT 1
   )
   RETURNING invoice_id;

By checking the reserve_ts inside the SELECT and again inside the UPDATE
this should catch the race condition and only allow one process to
perform the update on a given match.  If the other process has updated
the reserve_ts already, the reserve_ts would not pass the second check.
  However, the new side-effect is that one process would receive a NULL
return result when the race condition occurs rather than just picking up
the next queue invoice_id.

Unless I can get what I really want, this will have to do, I suppose.

-- Dante

Re: Need LIMIT and ORDER BY for UPDATE

From
"D. Dante Lorenso"
Date:
Bill Moran wrote:
> "D. Dante Lorenso" <dante@lorenso.com> wrote:
>> All,
>>
>> I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE
>> commands.  Is this possible?
>>
>>    UPDATE invoice i
>>    SET reserve_ts = NOW() + '1 hour'::timestamp
>>    FROM account a
>>    WHERE a.acct_id = i.acct_id
>>    AND i.reserve_ts < NOW()
>>    AND a.status = 'A'
>>    AND i.is_paid IS FALSE
>>    ORDER BY i.create_ts ASC
>>    LIMIT 1
>>    RETURNING invoice_id;
>>
>> This query would find JUST ONE invoice record which is not paid and
>> reserve the right to operate on the row using the 'reserve_ts' column
>> for all active accounts.  The one row would be the oldest invoice
>> matching the criteria.  Only that one row would be updated and the
>> invoice_id of the updated row (if any) would be returned.
>>
>> Running a query like this over and over would pop just one record off
>> the queue and would guarantee an atomic reservation.
>
> While I'm not going to argue as to whether your suggestion would be
> a good idea or not, I will suggest you look at SELECT FOR UPDATE, which
> will allow you to do what you desire.

   UPDATE invoice
   SET reserve_ts = NOW() + '1 hour'::interval
   WHERE invoice_id = (
     SELECT invoice_id
     FROM invoice i, account a
     WHERE a.acct_id = i.acct_id
     AND i.reserve_ts < NOW()
     AND a.status = 'A'
     AND i.is_paid IS FALSE
     ORDER BY i.create_ts ASC
     LIMIT 1
     FOR UPDATE
   )
   RETURNING invoice_id;

Does this do the same thing while still remaining a single atomic query
that will guarantee no race conditions during the inner select/update?

   ERROR:  SELECT FOR UPDATE/SHARE is not allowed in subqueries

Guess not.

-- Dante


Re: Need LIMIT and ORDER BY for UPDATE

From
Bill Moran
Date:
In response to "D. Dante Lorenso" <dante@lorenso.com>:
> Bill Moran wrote:
> > "D. Dante Lorenso" <dante@lorenso.com> wrote:
> >> All,
> >>
> >> I'd really like to have ORDER BY and LIMIT for UPDATE and DELETE
> >> commands.  Is this possible?
> >>
> >>    UPDATE invoice i
> >>    SET reserve_ts = NOW() + '1 hour'::timestamp
> >>    FROM account a
> >>    WHERE a.acct_id = i.acct_id
> >>    AND i.reserve_ts < NOW()
> >>    AND a.status = 'A'
> >>    AND i.is_paid IS FALSE
> >>    ORDER BY i.create_ts ASC
> >>    LIMIT 1
> >>    RETURNING invoice_id;
> >>
> >> This query would find JUST ONE invoice record which is not paid and
> >> reserve the right to operate on the row using the 'reserve_ts' column
> >> for all active accounts.  The one row would be the oldest invoice
> >> matching the criteria.  Only that one row would be updated and the
> >> invoice_id of the updated row (if any) would be returned.
> >>
> >> Running a query like this over and over would pop just one record off
> >> the queue and would guarantee an atomic reservation.
> >
> > While I'm not going to argue as to whether your suggestion would be
> > a good idea or not, I will suggest you look at SELECT FOR UPDATE, which
> > will allow you to do what you desire.
>
>    UPDATE invoice
>    SET reserve_ts = NOW() + '1 hour'::interval
>    WHERE invoice_id = (
>      SELECT invoice_id
>      FROM invoice i, account a
>      WHERE a.acct_id = i.acct_id
>      AND i.reserve_ts < NOW()
>      AND a.status = 'A'
>      AND i.is_paid IS FALSE
>      ORDER BY i.create_ts ASC
>      LIMIT 1
>      FOR UPDATE
>    )
>    RETURNING invoice_id;
>
> Does this do the same thing while still remaining a single atomic query
> that will guarantee no race conditions during the inner select/update?
>
>    ERROR:  SELECT FOR UPDATE/SHARE is not allowed in subqueries
>
> Guess not.

BEGIN;
SELECT invoice_id
      FROM invoice i, account a
      WHERE a.acct_id = i.acct_id
      AND i.reserve_ts < NOW()
      AND a.status = 'A'
      AND i.is_paid IS FALSE
      ORDER BY i.create_ts ASC
      LIMIT 1
      FOR UPDATE;
UPDATE invoice
      SET reserve_ts = NOW() + '1 hour'::interval
      WHERE invoice_id = [previously selected value];
COMMIT;

And before you start asking a lot of "won't this x or y", please read
the docs:
http://www.postgresql.org/docs/8.1/static/sql-select.html#SQL-FOR-UPDATE-SHARE

Then feel free to ask more questions.

--
Bill Moran
http://www.potentialtech.com

Re: Need LIMIT and ORDER BY for UPDATE

From
"John D. Burger"
Date:
D. Dante Lorenso wrote:

> Doesn't this create race condition in the query where multiple
> processes might find the same invoice_id while executing the inner
> select.  The update would then update the same record more than
> once during the update step and 2 processes might get the same
> invoice_id returned.  In otherwords, moving the select criteria
> into a sub-query breaks the atomic nature of the update.  Right?

Hmm, dunno.  Sorry, my grasp of concurrency issues is still infantile.

> I have been trying to doing something like this, though:

...

> By checking the reserve_ts inside the SELECT and again inside the
> UPDATE this should catch the race condition and only allow one
> process to perform the update on a given match.  If the other
> process has updated the reserve_ts already, the reserve_ts would
> not pass the second check.  However, the new side-effect is that
> one process would receive a NULL return result when the race
> condition occurs rather than just picking up the next queue
> invoice_id.

But this could happen in any event, if there are no more invoices to
process, yes?  I'm picturing a set of queue consumers, each of which
is already looping around such issues, anyway.

- John D. Burger
   MITRE