Thread: DELETE syntax on JOINS
Dear Friends, First, thank you very much for considering a fix on the GROUP BY issue. I am starting a new thread about another issue: It seems that DELETE cannot understand INNER JOINS and needs HAVING. Read: http://drupal.org/node/555562 (main message) http://drupal.org/node/555648 I don't see why PostgreSQL would not be able to run queries like: DELETE h FROM history AS h INNER JOIN term_node AS tn ON (h.nid = tn.nid) INNER JOIN term_data AS td ON (td.tid = tn.tid) WHERE h.uid = 2067 AND td.vid = 2 Ultimately, why not allow: DELETE h, tn FROM history AS h INNER JOIN term_node AS tn ON (h.nid = tn.nid) INNER JOIN term_data AS td ON (td.tid = tn.tid) WHERE h.uid = 2067 AND td.vid = 2 IMHO this would improve compliance towards other database systems. To me this seems to be in the reasonable scope of compatibility. Kind regards, Jean-Michel
Jean-Michel Pour� wrote: -- Start of PGP signed section. > Dear Friends, > > First, thank you very much for considering a fix on the GROUP BY issue. > I am starting a new thread about another issue: > > It seems that DELETE cannot understand INNER JOINS and needs HAVING. > > Read: > http://drupal.org/node/555562 (main message) > http://drupal.org/node/555648 > > I don't see why PostgreSQL would not be able to run queries like: > > DELETE h > FROM history AS h > INNER JOIN term_node AS tn ON (h.nid = tn.nid) > INNER JOIN term_data AS td ON (td.tid = tn.tid) > WHERE h.uid = 2067 AND td.vid = 2 > > Ultimately, why not allow: > > DELETE h, tn > FROM history AS h > INNER JOIN term_node AS tn ON (h.nid = tn.nid) > INNER JOIN term_data AS td ON (td.tid = tn.tid) > WHERE h.uid = 2067 AND td.vid = 2 > > IMHO this would improve compliance towards other database systems. To me > this seems to be in the reasonable scope of compatibility. Which "other database systems"? Only MySQL? If it is MySQL-only, we are unlikely to add it. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > > Ultimately, why not allow: > > > > DELETE h, tn > > FROM history AS h > > INNER JOIN term_node AS tn ON (h.nid = tn.nid) > > INNER JOIN term_data AS td ON (td.tid = tn.tid) > > WHERE h.uid = 2067 AND td.vid = 2 > > > > IMHO this would improve compliance towards other database systems. To me > > this seems to be in the reasonable scope of compatibility. > > Which "other database systems"? Only MySQL? If it is MySQL-only, we > are unlikely to add it. The SQL standard does not support this syntax. They would have you put the joins in a subselect (which is often not enough because then you can't use outer joins). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > Bruce Momjian wrote: > > > > Ultimately, why not allow: > > > > > > DELETE h, tn > > > FROM history AS h > > > INNER JOIN term_node AS tn ON (h.nid = tn.nid) > > > INNER JOIN term_data AS td ON (td.tid = tn.tid) > > > WHERE h.uid = 2067 AND td.vid = 2 > > > > > > IMHO this would improve compliance towards other database systems. To me > > > this seems to be in the reasonable scope of compatibility. > > > > Which "other database systems"? Only MySQL? If it is MySQL-only, we > > are unlikely to add it. > > The SQL standard does not support this syntax. They would have you put > the joins in a subselect (which is often not enough because then you > can't use outer joins). So the problem is that our DELETE ... USING does not allow ANSI join syntax? Can that be added? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Alvaro Herrera wrote: > > Bruce Momjian wrote: > > > > > > Ultimately, why not allow: > > > > > > > > DELETE h, tn > > > > FROM history AS h > > > > INNER JOIN term_node AS tn ON (h.nid = tn.nid) > > > > INNER JOIN term_data AS td ON (td.tid = tn.tid) > > > > WHERE h.uid = 2067 AND td.vid = 2 > > > > > > > > IMHO this would improve compliance towards other database systems. To me > > > > this seems to be in the reasonable scope of compatibility. > > > > > > Which "other database systems"? Only MySQL? If it is MySQL-only, we > > > are unlikely to add it. > > > > The SQL standard does not support this syntax. They would have you put > > the joins in a subselect (which is often not enough because then you > > can't use outer joins). > > So the problem is that our DELETE ... USING does not allow ANSI join > syntax? Can that be added? Not sure about that. USING is already an extension to the standard, so if we extend it a bit more, it can't be a problem, can it? But this doesn't solve Jean Michel's problem, because MySQL does not support DELETE USING (or does it?). -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Bruce Momjian wrote: > > Alvaro Herrera wrote: > > > Bruce Momjian wrote: > > > > > > > > Ultimately, why not allow: > > > > > > > > > > DELETE h, tn > > > > > FROM history AS h > > > > > INNER JOIN term_node AS tn ON (h.nid = tn.nid) > > > > > INNER JOIN term_data AS td ON (td.tid = tn.tid) > > > > > WHERE h.uid = 2067 AND td.vid = 2 > > > > > > > > > > IMHO this would improve compliance towards other database systems. To me > > > > > this seems to be in the reasonable scope of compatibility. > > > > > > > > Which "other database systems"? Only MySQL? If it is MySQL-only, we > > > > are unlikely to add it. > > > > > > The SQL standard does not support this syntax. They would have you put > > > the joins in a subselect (which is often not enough because then you > > > can't use outer joins). > > > > So the problem is that our DELETE ... USING does not allow ANSI join > > syntax? Can that be added? > > Not sure about that. USING is already an extension to the standard, so > if we extend it a bit more, it can't be a problem, can it? But this > doesn't solve Jean Michel's problem, because MySQL does not support > DELETE USING (or does it?). Right, but if we support ANSI joins in the USING clause, at least we would have a _functional_ equivalent, which we don't know because of missing outer join support. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
In response to Bruce Momjian <bruce@momjian.us>: > Alvaro Herrera wrote: > > Bruce Momjian wrote: > > > > > > Ultimately, why not allow: > > > > > > > > DELETE h, tn > > > > FROM history AS h > > > > INNER JOIN term_node AS tn ON (h.nid = tn.nid) > > > > INNER JOIN term_data AS td ON (td.tid = tn.tid) > > > > WHERE h.uid = 2067 AND td.vid = 2 > > > > > > > > IMHO this would improve compliance towards other database systems. To me > > > > this seems to be in the reasonable scope of compatibility. > > > > > > Which "other database systems"? Only MySQL? If it is MySQL-only, we > > > are unlikely to add it. > > > > The SQL standard does not support this syntax. They would have you put > > the joins in a subselect (which is often not enough because then you > > can't use outer joins). > > So the problem is that our DELETE ... USING does not allow ANSI join > syntax? Can that be added? I suspect that the reason MySQL has this syntax is because for a long time they didn't have proper foreign keys and referential integrity. With proper foreign keys and ON DELETE CASCADE, why would supporting such syntax even be necessary? -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/
Alvaro Herrera <alvherre@commandprompt.com> writes: > Bruce Momjian wrote: >> So the problem is that our DELETE ... USING does not allow ANSI join >> syntax? Can that be added? > Not sure about that. USING is already an extension to the standard, so > if we extend it a bit more, it can't be a problem, can it? I don't see any very good way to extend the USING syntax to allow the target table to be outer-joined to something else. Some other systems allow it by letting you re-specify the target in the other clause, equivalently to DELETE FROM target t USING t LEFT JOIN other_table ot ON ... but we have always considered that the target is *not* to be identified with any member of the FROM/USING clause, so it would be a serious compatibility break to change that now. regards, tom lane
2009/8/24 Tom Lane <tgl@sss.pgh.pa.us>: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> Bruce Momjian wrote: >>> So the problem is that our DELETE ... USING does not allow ANSI join >>> syntax? Can that be added? > >> Not sure about that. USING is already an extension to the standard, so >> if we extend it a bit more, it can't be a problem, can it? > > I don't see any very good way to extend the USING syntax to allow the > target table to be outer-joined to something else. Some other systems > allow it by letting you re-specify the target in the other clause, > equivalently to > > DELETE FROM target t USING t LEFT JOIN other_table ot ON ... > > but we have always considered that the target is *not* to be identified > with any member of the FROM/USING clause, so it would be a serious > compatibility break to change that now. I'm all in favor of compatibility, but if there is any way to make this work without massive collateral damage, I am also all in favor of that. I am forever writing queries that contain a needless self-join to work around the impossibility of directly outer-joining against the target. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > 2009/8/24 Tom Lane <tgl@sss.pgh.pa.us>: >> ... Some other systems >> allow it by letting you re-specify the target in the other clause, >> equivalently to >> >> DELETE FROM target t USING t LEFT JOIN other_table ot ON ... >> >> but we have always considered that the target is *not* to be identified >> with any member of the FROM/USING clause, so it would be a serious >> compatibility break to change that now. > I'm all in favor of compatibility, but if there is any way to make > this work without massive collateral damage, I am also all in favor of > that. I am forever writing queries that contain a needless self-join > to work around the impossibility of directly outer-joining against the > target. It'd be pretty easy to do if we were willing to introduce a new reserved word; for example DELETE FROM target t USING SELF LEFT JOIN other_table ot ON ... (or maybe TARGET instead of SELF, or some other word). Wouldn't do anything for exact compatibility with MySQL or anybody else using the respecify-the-target-table-name approach. But it would be unambiguous and backwards-compatible. The real problem with this is that all the good candidates for the reserved word are things people are probably already using as aliases, so we'd have a large risk of breaking existing queries. We could avoid that with a sufficiently ugly choice like DELETE FROM target t USING DELETE_TARGET LEFT JOIN other_table ot ON ... but yech ... regards, tom lane
On Mon, Aug 24, 2009 at 01:41:28PM -0400, Tom Lane wrote: > The real problem with this is that all the > good candidates for the reserved word are things people are probably > already using as aliases, so we'd have a large risk of breaking existing > queries. We could avoid that with a sufficiently ugly choice like > > DELETE FROM target t USING DELETE_TARGET LEFT JOIN other_table ot ON ... > > but yech ... PRIMARY or TABLE? Both are pretty grim, but I think they're reserved at the moment. -- Sam http://samason.me.uk/
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Bruce Momjian wrote: > >> So the problem is that our DELETE ... USING does not allow ANSI join > >> syntax? Can that be added? > > > Not sure about that. USING is already an extension to the standard, so > > if we extend it a bit more, it can't be a problem, can it? > > I don't see any very good way to extend the USING syntax to allow the > target table to be outer-joined to something else. Some other systems > allow it by letting you re-specify the target in the other clause, > equivalently to > > DELETE FROM target t USING t LEFT JOIN other_table ot ON ... > > but we have always considered that the target is *not* to be identified > with any member of the FROM/USING clause, so it would be a serious > compatibility break to change that now. Let's look at this a little closer. We can use an alias in the DELETE FROM clause: test=> DELETE FROM test t;test=> DELETE FROM test t USING test; What we cannot currently do is reference test twice: test=> DELETE FROM test USING test;ERROR: table name "test" specified more than oncetest=> DELETE FROM test t USING testt;ERROR: table name "t" specified more than once As far as I understand it, allowing ANSI joins in USING would simple mean removing that error message and linking the two table aliases. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > What we cannot currently do is reference test twice: > test=> DELETE FROM test USING test; > ERROR: table name "test" specified more than once > test=> DELETE FROM test t USING test t; > ERROR: table name "t" specified more than once Hmm, I had forgotten that we throw errors in these cases now. Maybe that *would* give us an escape-hatch for the other interpretation. > As far as I understand it, allowing ANSI joins in USING would simple > mean removing that error message and linking the two table aliases. Well, you'd still need to complain about DELETE FROM test USING test JOIN test ON ... Also, it's not nearly as easy as just removing the error check. There's stuff in the planner (and perhaps executor) that's dependent on the assumption that the target table isn't on the inside of an outer join, for example. Still, getting agreement on a syntax would in itself be a huge step forward. regards, tom lane
All, >> DELETE FROM target t USING t LEFT JOIN other_table ot ON ... >> >> but we have always considered that the target is *not* to be identified >> with any member of the FROM/USING clause, so it would be a serious >> compatibility break to change that now. What I don't get is why this is such a usability issue. Subqueries in DELETE FROM work perfectly well, and provide more flexibility than most users know what to do with. Personally, I'd be happy just to stop with the SQL extension we have. I think extending USING any further is going to cause more problems than it solves. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
> With proper foreign keys and ON DELETE CASCADE, why would supporting > such syntax even be necessary? Porting existing abstraction layers from ANSI JOINs to ON DELETE CASCADE is complicated. > What I don't get is why this is such a usability issue. Subqueries in > DELETE FROM work perfectly well, and provide more flexibility than > most > users know what to do with. The ANSI syntax allows deleting one or several tables at once. Subqueries are not supported by MySQL on DELETE. Again, this is a usability issue to gain market shares and happy users againts MySQL. Kind regards, Jean-Michel
Josh Berkus wrote: > All, > > >> DELETE FROM target t USING t LEFT JOIN other_table ot ON ... > >> > >> but we have always considered that the target is *not* to be identified > >> with any member of the FROM/USING clause, so it would be a serious > >> compatibility break to change that now. > > What I don't get is why this is such a usability issue. Subqueries in > DELETE FROM work perfectly well, and provide more flexibility than most > users know what to do with. OK, so you are saying that every OUTER join can be efficiently reprsented as a subquery? If that is true we don't need to add ANSI join support to USING. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Mon, Aug 24, 2009 at 9:31 PM, Josh Berkus<josh@agliodbs.com> wrote: > All, > >>> DELETE FROM target t USING t LEFT JOIN other_table ot ON ... >>> >>> but we have always considered that the target is *not* to be identified >>> with any member of the FROM/USING clause, so it would be a serious >>> compatibility break to change that now. > > What I don't get is why this is such a usability issue. Subqueries in > DELETE FROM work perfectly well, and provide more flexibility than most > users know what to do with. > > Personally, I'd be happy just to stop with the SQL extension we have. I > think extending USING any further is going to cause more problems than > it solves. It's both a usability issue and a performance issue. Suppose you want to select all the rows in foo whose id field does not appear in bar.foo_id. The most efficient way to do this in PostgreSQL is typically: SELECT foo.* FROM foo LEFT JOIN bar ON foo.id = bar.foo_id WHERE bar.foo_id IS NULL; Now, if you want to delete those rows, you can't do it without an extra join somewhere. You can do it like this: DELETE FROM foo AS foo1 USING foo AS foo2 LEFT JOIN bar ON foo2.id = bar.foo_id WHERE foo1.id = foo2.id AND foo2; Or like this: DELETE FROM foo WHERE id IN (SELECT foo.id FROM foo LEFT JOIN bar ON foo.id = bar.foo_id WHERE bar.foo_id IS NULL); ...but either way you now have foo in there twice when it really shouldn't need to be, and you're doing a useless self-join to work around a syntax limitation. [ thinks ] Actually, I guess in this case you can get around it like this: DELETE FROM foo WHERE NOT EXISTS (SELECT 1 FROM bar WHERE bar.foo_id = foo.id); ...but I'm not sure it can be rewritten that way in every case - in particular, that won't work if you have a RETURNING clause that includes a value taken from bar. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Aug 24, 2009 at 9:31 PM, Josh Berkus<josh@agliodbs.com> wrote: >> What I don't get is why this is such a usability issue. �Subqueries in >> DELETE FROM work perfectly well, and provide more flexibility than most >> users know what to do with. > It's both a usability issue and a performance issue. On the usability front: if we were to take the position Josh advocates, we should never have added FROM/USING to UPDATE/DELETE at all ... but since we did, I think we should try to make it as flexible as the corresponding feature in other DBMSes. On the performance front: yeah, you can recast most joins as subqueries, but you tend to end up with the equivalent of a nestloop plan. Works okay for small numbers of rows, scales horribly. regards, tom lane
Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > >> On Mon, Aug 24, 2009 at 9:31 PM, Josh Berkus<josh@agliodbs.com> wrote: >> >>> What I don't get is why this is such a usability issue. Subqueries in >>> DELETE FROM work perfectly well, and provide more flexibility than most >>> users know what to do with. >>> > > >> It's both a usability issue and a performance issue. >> > > On the usability front: if we were to take the position Josh advocates, > we should never have added FROM/USING to UPDATE/DELETE at all FWIW, I use update ... from a lot - it make many update queries easier and less error prone to write - but I rarely find I need delete ... using. > ... but > since we did, I think we should try to make it as flexible as the > corresponding feature in other DBMSes. > > > +1 cheers andrew
On Tue, Aug 25, 2009 at 2:50 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > On the performance front: yeah, you can recast most joins as subqueries, > but you tend to end up with the equivalent of a nestloop plan. Works > okay for small numbers of rows, scales horribly. Well that's our problem isn't it? I thought we were capable of genearting semijoins for subqueries these days though? The problem I thought was if you wanted to pull multiple values out of the subquery. So something like UPDATE foo set a=aa, b=bb FROM bar WHERE ... If you wanted to do an outer join from foo to bar then how would you write it as an subquery even if our optimizer could notice the semijoin and optimize it properly? You would have to write something like UPDATE foo set a = (select aa from bar where...) b = (select bb from bar where...) and then the optimizer would have to notice the duplicates and consolidate them? That seems inconvenient (and fragile). -- greg http://mit.edu/~gsstark/resume.pdf
Greg Stark <gsstark@mit.edu> writes: > You would have to write something like > UPDATE foo set a = (select aa from bar where...) > b = (select bb from bar where...) > and then the optimizer would have to notice the duplicates and > consolidate them? That seems inconvenient (and fragile). Well, that's why the spec nowadays allows you to write UPDATE foo SET (a,b) = (select aa,bb from bar where ...) But we haven't got that, and if we did it would generate a nestloop plan. Getting to the point of absolute performance equivalence between subqueries and joins would take a *lot* of work; I'm not even sure it's possible at all. And once we'd done all that work there would still remain the fact that people are accustomed to using join syntax instead. There's a lot of existing code out there that would be a lot easier to port to PG if we supported that style (which was exactly the point made by the OP). regards, tom lane