Re: LIMIT for UPDATE and DELETE - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: LIMIT for UPDATE and DELETE
Date
Msg-id CAMkU=1w8B+y9ap9KywxBEXpLd9vwkN40ACqNjxHgC8=60aSyTg@mail.gmail.com
Whole thread Raw
In response to Re: LIMIT for UPDATE and DELETE  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
On Sun, Aug 24, 2014 at 11:48 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
Hi Rukh,


(2014/08/15 6:18), Rukh Meski wrote:
Based on the feedback on my previous patch, I've separated only the
LIMIT part into its own feature.  This version plays nicely with
inheritance.  The intended use is splitting up big UPDATEs and DELETEs
into batches more easily and efficiently.

Before looking into the patch, I'd like to know the use cases in more details.


There are two common use cases I can think of:

1) 

I've just added a column to an existing table, and it is all NULL.  I've changed the code to populate that column appropriately for new or updated rows, but I need to back fill the existing rows.  I have a (slow) method to compute the new value.  (I've not yet changed the code to depend on that column being populated)

The obvious solution is:

update the_table set new_col=populate_new_col(whatever) where new_col is null.

But this will bloat the table because vacuum cannot intervene, and will take a very long time.  The first row to be update will remain locked until the last row gets updated, which is not acceptable.  And if something goes wrong before the commit, you've lost all the work.

With the limit clause, you can just do this:

update the_table set new_col=populate_new_col(whatever) where new_col is null limit 50000;

In a loop with appropriate vacuuming and throttling.

2) 

I've introduced or re-designed partitioning, and need to migrate rows to the appropriate partitions without long lived row locks.

create table pgbench_accounts2 () inherits (pgbench_accounts);

and then in a loop:

with t as (delete from only pgbench_accounts where aid < 500000 limit 5000 returning *) 
  insert into pgbench_accounts2 select * from t;

Cheers,

Jeff

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: psql \watch versus \timing
Next
From: Alvaro Herrera
Date:
Subject: Re: Specifying the unit in storage parameter