Thread: [HACKERS] DELETE and UPDATE with LIMIT and ORDER BY
the necessity of allowing limit and order by clause to be used with delete and
update statement is discussed in the past and added to the todo list
preveouse mailing list descissions
i attached a small patch for its implementation.
Notice : inorder to avoid unpredictable result the patch did not allow limit clause without order by and vise versal.
comment please?
Regareds
Surafel
Attachment
Hi Surafel,
IIUC, the requirement of the feature also had one of the consideration where
one needs to delete large data and that takes long time, and adding LIMIT
should reduce the overhead by allowing to delete the data in batches.
I did a quick performance test, and in following example you can see the
conventional delete taking "355.288 ms" VS "1137.248 ms" with new LIMIT
syntax.
postgres=# create table mytab(a int, b varchar(50));
CREATE TABLE
postgres=# insert into mytab(a, b)
select i,md5(random()::text) from generate_series(1, 1000000) s(i);
INSERT 0 1000000
postgres=# \timing
Timing is on.
postgres=# delete from mytab order by a limit 200000 offset 0;
DELETE 200000
Time: 1137.248 ms (00:01.137)
postgres=# truncate mytab;
TRUNCATE TABLE
Time: 21.717 ms
postgres=# insert into mytab(a, b)
select i,md5(random()::text) from generate_series(1, 1000000) s(i);
INSERT 0 1000000
Time: 3166.445 ms (00:03.166)
postgres=# delete from mytab where a < 200001;
DELETE 200000
Time: 355.288 ms
Am I missing something here?
Regards,
Jeevan Ladhe
On Mon, Apr 24, 2017 at 8:09 AM, Surafel Temesgen <surafel3000@gmail.com> wrote:
the necessity of allowing limit and order by clause to be used with delete andupdate statement is discussed in the past and added to the todo listpreveouse mailing list descissions
See this more recent one:
That patch was not adopted, as I recall, mostly due to the requirement that it support partitioned tables.
i attached a small patch for its implementation.Notice : inorder to avoid unpredictable result the patch did not allow limit clause without order by and vise versal.
I think both of those are ill-advised. To avoid deadlock, it is perfectly fine to want an order by without a limit.
And to facilitate the reorganization of partitions or the population of new columns in bite-size chunks, it is also fine to want limit without order by.
Cheers,
Jeff
On Tue, Apr 25, 2017 at 2:42 AM, Jeff Janes <jeff.janes@gmail.com> wrote: > On Mon, Apr 24, 2017 at 8:09 AM, Surafel Temesgen <surafel3000@gmail.com> > wrote: >> >> the necessity of allowing limit and order by clause to be used with delete >> and >> update statement is discussed in the past and added to the todo list >> >> preveouse mailing list descissions >> >> http://archives.postgresql.org/pgadmin-hackers/2010-04/msg00078.php >> http://archives.postgresql.org/pgsql-hackers/2010-11/msg01997.php > > > See this more recent one: > > https://www.postgresql.org/message-id/flat/54102581.2020207%40joh.to#54102581.2020207@joh.to > > That patch was not adopted, as I recall, mostly due to the requirement that > it support partitioned tables. +1. The discussion there applies to inheritance based as well as declarative partitioning, although Tom Lane thought that declarative partitioning would not require special handling. IIUC, the patch is simply pushing the LIMIT down into the underlying scan. If the scan returns LIMIT number of rows, but some of those rows were changed by a concurrent update, such that the new version doesn't fit the filters, it will end up updating less that LIMIT number of rows. I think that's not expected I guess. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company