Thread: [HACKERS] DELETE and UPDATE with LIMIT and ORDER BY

[HACKERS] DELETE and UPDATE with LIMIT and ORDER BY

From
Surafel Temesgen
Date:
the necessity of allowing limit and order by clause to be used with delete and 
update statement is discussed in the past and added to the todo list 

preveouse mailing list descissions
 

i attached a small patch for its implementation.

Notice : inorder to avoid unpredictable result the patch did not allow limit clause without order by and vise versal.

comment please?

Regareds 
 
Surafel
Attachment

Re: [HACKERS] DELETE and UPDATE with LIMIT and ORDER BY

From
Jeevan Ladhe
Date:
Hi Surafel,

IIUC, the requirement of the feature also had one of the consideration where
one needs to delete large data and that takes long time, and adding LIMIT
should reduce the overhead by allowing to delete the data in batches.

I did a quick performance test, and in following example you can see the
conventional delete taking "355.288 ms" VS "1137.248 ms" with new LIMIT
syntax.

postgres=# create table mytab(a int, b varchar(50));
CREATE TABLE
postgres=# insert into mytab(a, b)
select i,md5(random()::text) from generate_series(1, 1000000) s(i);
INSERT 0 1000000
postgres=# \timing 
Timing is on.
postgres=# delete from mytab order by a limit 200000 offset 0;
DELETE 200000
Time: 1137.248 ms (00:01.137)
postgres=# truncate mytab;
TRUNCATE TABLE
Time: 21.717 ms
postgres=# insert into mytab(a, b)
select i,md5(random()::text) from generate_series(1, 1000000) s(i);
INSERT 0 1000000
Time: 3166.445 ms (00:03.166)
postgres=# delete from mytab where a < 200001;
DELETE 200000
Time: 355.288 ms

Am I missing something here?

Regards,
Jeevan Ladhe

Re: [HACKERS] DELETE and UPDATE with LIMIT and ORDER BY

From
Jeff Janes
Date:
On Mon, Apr 24, 2017 at 8:09 AM, Surafel Temesgen <surafel3000@gmail.com> wrote:
the necessity of allowing limit and order by clause to be used with delete and 
update statement is discussed in the past and added to the todo list 

preveouse mailing list descissions
 

See this more recent one:


That patch was not adopted, as I recall, mostly due to the requirement that it support partitioned tables.
 
i attached a small patch for its implementation.

Notice : inorder to avoid unpredictable result the patch did not allow limit clause without order by and vise versal.

I think both of those are ill-advised.  To avoid deadlock, it is perfectly fine to want an order by without a limit.

And to facilitate the reorganization of partitions or the population of new columns in bite-size chunks, it is also fine to want limit without order by.

Cheers,

Jeff

Re: [HACKERS] DELETE and UPDATE with LIMIT and ORDER BY

From
Ashutosh Bapat
Date:
On Tue, Apr 25, 2017 at 2:42 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Mon, Apr 24, 2017 at 8:09 AM, Surafel Temesgen <surafel3000@gmail.com>
> wrote:
>>
>> the necessity of allowing limit and order by clause to be used with delete
>> and
>> update statement is discussed in the past and added to the todo list
>>
>> preveouse mailing list descissions
>>
>>  http://archives.postgresql.org/pgadmin-hackers/2010-04/msg00078.php
>>  http://archives.postgresql.org/pgsql-hackers/2010-11/msg01997.php
>
>
> See this more recent one:
>
> https://www.postgresql.org/message-id/flat/54102581.2020207%40joh.to#54102581.2020207@joh.to
>
> That patch was not adopted, as I recall, mostly due to the requirement that
> it support partitioned tables.

+1. The discussion there applies to inheritance based as well as
declarative partitioning, although Tom Lane thought that declarative
partitioning would not require special handling.

IIUC, the patch is simply pushing the LIMIT down into the underlying
scan. If the scan returns LIMIT number of rows, but some of those rows
were changed by a concurrent update, such that the new version doesn't
fit the filters, it will end up updating less that LIMIT number of
rows. I think that's not expected I guess.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company