Thread: DELETE using an outer join

DELETE using an outer join

From
Thomas Kellerer
Date:
Hi,

(this is not a real world problem, just something I'm playing around with).

Lately I had some queries of the form:
   select t.*   from some_table t   where t.id not in (select some_id from some_other_table);

I could improve the performance of them drastically by changing the NOT NULL into an outer join:
   select t.*   from some_table t      left join some_other_table ot on ot.id = t.id   where ot.id is null;


Now I was wondering if a DELETE statement could be rewritten with the same "strategy":

Something like:
   delete from some_table   where id not in (select min(id)                     from some_table
groupby col1, col2                    having count(*) > 1);
 

(It's the usual - at least for me - "get rid of duplicates" statement)


The DELETE .. USING seems to only allow inner joins because it requires the join to be done in the WHERE clause.
So I can't think of a way to turn that NOT IN from the DELETE into an outer join with a derived table.

Am I right that this kind of transformation is not possible or am I missing something?

Regards
Thomas



Re: DELETE using an outer join

From
Sergey Konoplev
Date:
On Thu, Jul 19, 2012 at 4:43 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:
>    delete from some_table
>    where id not in (select min(id)
>                      from some_table
>                     group by col1, col2
>                     having count(*) > 1);
>
> (It's the usual - at least for me - "get rid of duplicates" statement)

If you want to remove duplicates you can do it this way.

DELETE FROM some_table USING some_table AS s
WHERE   some_table.col1 = s.col1 AND   some_table.col2 = s.col2 AND   some_table.id < s.id;

The query plan should be better than one with the sub query and NOT IN.

ps. May be this example is worth to append to the documentation?

-- 
Sergey Konoplev

a database architect, software developer at PostgreSQL-Consulting.com
http://www.postgresql-consulting.com

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +79160686204


Re: DELETE using an outer join

From
Tom Lane
Date:
Thomas Kellerer <spam_eater@gmx.net> writes:
> Lately I had some queries of the form:

>     select t.*
>     from some_table t
>     where t.id not in (select some_id from some_other_table);

> I could improve the performance of them drastically by changing the NOT NULL into an outer join:

>     select t.*
>     from some_table t
>        left join some_other_table ot on ot.id = t.id
>     where ot.id is null;

If you're using a reasonably recent version of PG, replacing the NOT IN
by a NOT EXISTS test should also help.

> Now I was wondering if a DELETE statement could be rewritten with the same "strategy":

Not at the moment.  There have been discussions of allowing the same
table name to be respecified in USING, but there are complications.
        regards, tom lane


Re: DELETE using an outer join

From
Thomas Kellerer
Date:
Tom Lane, 19.07.2012 16:52:
> If you're using a reasonably recent version of PG, replacing the NOT IN
> by a NOT EXISTS test should also help.

Thanks. I wasn't aware of that (and the NOT EXISTS does indeed produce the same plan as the OUTER JOIN solution)


>> Now I was wondering if a DELETE statement could be rewritten with the same "strategy":
>
> Not at the moment.  There have been discussions of allowing the same
> table name to be respecified in USING, but there are complications.

Thanks as well. It's not a big deal for me. I was just curious if I missed something.

Regards
Thomas





Re: DELETE using an outer join

From
Sergey Konoplev
Date:
On Thu, Jul 19, 2012 at 6:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Now I was wondering if a DELETE statement could be rewritten with the same "strategy":
>
> Not at the moment.  There have been discussions of allowing the same
> table name to be respecified in USING, but there are complications.

However it works.

DELETE FROM some_table USING some_table AS s
WHERE   some_table.col1 = s.col1 AND   some_table.col2 = s.col2 AND   some_table.id < s.id;

-- 
Sergey Konoplev

a database and software architect
http://www.linkedin.com/in/grayhemp

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +79160686204


Re: DELETE using an outer join

From
Thomas Kellerer
Date:
Sergey Konoplev, 20.07.2012 10:21:
> On Thu, Jul 19, 2012 at 6:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Now I was wondering if a DELETE statement could be rewritten with the same "strategy":
>>
>> Not at the moment.  There have been discussions of allowing the same
>> table name to be respecified in USING, but there are complications.
>
> However it works.
>
> DELETE FROM some_table USING some_table AS s
> WHERE
>      some_table.col1 = s.col1 AND
>      some_table.col2 = s.col2 AND
>      some_table.id < s.id;
>
But that's not an outer join





Re: DELETE using an outer join

From
Sergey Konoplev
Date:
On Fri, Jul 20, 2012 at 2:27 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:
>>>> Now I was wondering if a DELETE statement could be rewritten with the
>>>> same "strategy":
>>>
>>> Not at the moment.  There have been discussions of allowing the same
>>> table name to be respecified in USING, but there are complications.
>>
>> However it works.
>>
>> DELETE FROM some_table USING some_table AS s
>> WHERE
>>      some_table.col1 = s.col1 AND
>>      some_table.col2 = s.col2 AND
>>      some_table.id < s.id;
>>
>
> But that's not an outer join

Oh, yes. I just lost the discussion line. Sorry.

-- 
Sergey Konoplev

a database architect, software developer at PostgreSQL-Consulting.com
http://www.postgresql-consulting.com

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +79160686204


Re: DELETE using an outer join

From
Tom Lane
Date:
Sergey Konoplev <gray.ru@gmail.com> writes:
> On Thu, Jul 19, 2012 at 6:52 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Now I was wondering if a DELETE statement could be rewritten with the same "strategy":

>> Not at the moment.  There have been discussions of allowing the same
>> table name to be respecified in USING, but there are complications.

> However it works.

> DELETE FROM some_table USING some_table AS s
> WHERE
>     some_table.col1 = s.col1 AND
>     some_table.col2 = s.col2 AND
>     some_table.id < s.id;

No, that's a self-join, which isn't what the OP wanted.  You can make it
work if you self-join on the primary key and then left join to the other
table, but that's pretty klugy and inefficient.

What was being discussed is allowing people to write directly

DELETE FROM some_table USING some_table LEFT JOIN other_table ...

where the respecification of the table in USING would be understood
to mean the target table.  Right now this is an error case because
of duplicate table aliases.
        regards, tom lane


Re: DELETE using an outer join

From
Sergey Konoplev
Date:
On Fri, Jul 20, 2012 at 5:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> DELETE FROM some_table USING some_table AS s
>> WHERE
>>     some_table.col1 = s.col1 AND
>>     some_table.col2 = s.col2 AND
>>     some_table.id < s.id;
>
> No, that's a self-join, which isn't what the OP wanted.  You can make it
> work if you self-join on the primary key and then left join to the other
> table, but that's pretty klugy and inefficient.
>
> What was being discussed is allowing people to write directly
>
> DELETE FROM some_table USING some_table LEFT JOIN other_table ...
>
> where the respecification of the table in USING would be understood
> to mean the target table.  Right now this is an error case because
> of duplicate table aliases.

Yes, the OP has already pointed me to it. Thank you for your explanation.

-- 
Sergey Konoplev

a database architect, software developer at PostgreSQL-Consulting.com
http://www.postgresql-consulting.com

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +79160686204