Thread: Chunk Delete
Hi, I am wondering if I can do in PG Chunck Delete, like the Oracle example below. In Oracle we erase 50,000 records using the following: Delete <table name> where <condition> and ROWNUM < 50000; Do we have such a feature in PG? Thanks Danny
On 11/15/07, Abraham, Danny <danny_abraham@bmc.com> wrote: > In Oracle we erase 50,000 records using the following: > > Delete <table name> where <condition> and ROWNUM < 50000; > > Do we have such a feature in PG? Just delete with a subselect: delete from <table name> where <primary key> in ( select <primary key> from <table name> order by <some sort key> limit 50000); For example: delete from persons where id in ( select id from persons order by created_at limit 50000); Alexander.
"Abraham, Danny" <danny_abraham@bmc.com> writes: > Hi, > > I am wondering if I can do in PG Chunck Delete, like the Oracle example > below. > > In Oracle we erase 50,000 records using the following: > > Delete <table name> where <condition> and ROWNUM < 50000; > > Do we have such a feature in PG? You can still use a subquery even if you don't have a primary key. You use the "ctid" which is Postgres's notion of the physical location of the record. You do have to use another non-standard extension DELETE ... USING. DELETE FROM atable AS x USING (SELECT ctid FROM atable LIMIT 50000) AS y WHERE x.ctid = y.ctid; (The ctid can change when it's updated and even be reused by another record but within a single query like this it's safe though it's possible the query will delete fewer than 50000 records though if one of the records is updated while the delete is running.) You should note this will delete 50,000 arbitrary records. Not necessarily the 50,000 oldest ones or anything useful. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
On Thu, 2007-11-15 at 17:13 +0000, Gregory Stark wrote: > DELETE > FROM atable AS x > USING (SELECT ctid FROM atable LIMIT 50000) AS y > WHERE x.ctid = y.ctid; Have you tried to EXPLAIN this one ? Last time I tried to do something similar it was going for a sequential scan on atable with a filter on ctid. The other form using "where ctid = any (array(select ctid from ..." (see my previous post forwarding Tom's suggestion) was going for a ctid scan, which should be orders of magnitudes faster than the sequential scan for big tables and small chunks. Cheers, Csaba.