Thread: Bug or stupidity
Hello, I think, I found a bug, but maybe it's just my stupidity. Granted: What I did was an error on my part, but I still think, PostgreSQL should not do what it does. I've already created a simple testcase: popscan_light=> create table a (id serial, name varchar(10), primary key(id)) without oids; NOTICE: CREATE TABLE will create implicit sequence "a_id_seq" for "serial" column "a.id" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "a_pkey" for table "a" CREATE TABLE popscan_light=> create table b (id int4 references a (id) on delete cascade, name2 varchar(15), primary key (id)) without oids; NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "b_pkey" for table "b" CREATE TABLE popscan_light=> insert into a (name) values ('gnegg'); INSERT 0 1 popscan_light=> insert into a (name) values ('blepp'); INSERT 0 1 popscan_light=> insert into b values (1, 'gnegglink'); INSERT 0 1 popscan_light=> insert into b values (2, 'blepplink'); INSERT 0 1 popscan_light=> select a.name, b.name2 from a left join b using (id) order by b.name2; name | name2 -------+----------- blepp | blepplink gnegg | gnegglink (2 rows) popscan_light=> select aliasa.name, aliasb.name2 from a aliasa left join b aliasb using (id) order by b.name2; NOTICE: adding missing FROM-clause entry for table "b" name | name2 -------+----------- gnegg | gnegglink blepp | blepplink gnegg | gnegglink blepp | blepplink (4 rows) popscan_light=> select aliasa.name, aliasb.name2 from a aliasa left join b aliasb using (id) order by aliasb.name2; name | name2 -------+----------- blepp | blepplink gnegg | gnegglink (2 rows) popscan_light=> \q fangorn ~ $ psql --version psql (PostgreSQL) 7.4.3 contains support for command-line editing In the second "SELECT"-Query I've ordered the result set by the name-column of the second table, but I have not used the alias "aliasb" I created, but I used the full table name. I know this is not really correct, but I'd still like to know why Postgres throws 4 results at me. If I use the correct column in the order by clause, I get the correctly joined result. Looking at my second query, I think the false "order by" seems to pull in another copy of table b joining it without a proper condition. I don't think, this is the right thing to do. Or ist it? Anyone? Philip
On Sat, Oct 23, 2004 at 02:17:16PM +0000, Philip Hofstetter wrote: > Hello, > > I think, I found a bug, but maybe it's just my stupidity. Granted: What > I did was an error on my part, but I still think, PostgreSQL should not > do what it does. ... snip ... > popscan_light=> select aliasa.name, aliasb.name2 from a aliasa left join > b aliasb using (id) order by b.name2; > NOTICE: adding missing FROM-clause entry for table "b" > name | name2 > -------+----------- > gnegg | gnegglink > blepp | blepplink > gnegg | gnegglink > blepp | blepplink > (4 rows) See that NOTIVCE? It's telling you that it's converted your query to: select aliasa.name, aliasb.name2 from b, a aliasa left join b aliasb using (id) order by b.name2; Since you now have an unconstrained join on the B table, you get twice as many rows. It basically comes down to, if you make an alias, you have to use the alias. You can't use the original table name *and* the alias. The reference to the original table is becomes another copy of the same table. As for what's SQL standard, I think by a strict definition your query shouldn't be allowed at all, referencing an undefined table. Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
Hi, Martijn van Oosterhout wrote: >>popscan_light=> select aliasa.name, aliasb.name2 from a aliasa left join >>b aliasb using (id) order by b.name2; >>NOTICE: adding missing FROM-clause entry for table "b" >> name | name2 >>-------+----------- >> gnegg | gnegglink >> blepp | blepplink >> gnegg | gnegglink >> blepp | blepplink >>(4 rows) > > > See that NOTIVCE? It's telling you that it's converted your query to: actually, I've overseen it. But then, my assumption in my mail was correct anyway. > select aliasa.name, aliasb.name2 from b, a aliasa left join > b aliasb using (id) order by b.name2; > Since you now have an unconstrained join on the B table, you get twice > as many rows. This is what I thought. > As for what's SQL standard, I think by a strict definition your query > shouldn't be allowed at all, referencing an undefined table. This is exactly what I think too. I mean: I know I made an error in my query. It would just have been easier to find if PostgreSQL actually had told me so (I'm not getting those NOTICEs from PHP...). If it's wrong, it should be disallowed, not made worse by assuming a completely wrong thing. Thanks for your fast response anyway. Philip
On Sat, Oct 23, 2004 at 02:35:20PM +0000, Philip Hofstetter wrote: > >As for what's SQL standard, I think by a strict definition your query > >shouldn't be allowed at all, referencing an undefined table. > > This is exactly what I think too. I mean: I know I made an error in my > query. It would just have been easier to find if PostgreSQL actually had > told me so (I'm not getting those NOTICEs from PHP...). > > If it's wrong, it should be disallowed, not made worse by assuming a > completely wrong thing. Maybe, ofcourse, this exact same construct is used heavily in DELETEs. Look at the syntax of the delete command: DELETE FROM [ ONLY ] table [ WHERE condition ] You can't declare extra tables or define aliases. Every other table used in the query is by strict definitions "undefined". Should they all be declared illegal too? Perhaps you could argue that using undeclared tables is allowed for DELETEs but not for SELECTs. But why make a distiction if you don't need to. Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
On Sat, 23 Oct 2004, Philip Hofstetter wrote: > > As for what's SQL standard, I think by a strict definition your query > > shouldn't be allowed at all, referencing an undefined table. > > This is exactly what I think too. I mean: I know I made an error in my > query. It would just have been easier to find if PostgreSQL actually had > told me so (I'm not getting those NOTICEs from PHP...). > > If it's wrong, it should be disallowed, not made worse by assuming a > completely wrong thing. It's enabled in large part for backwards compatibility. There's a runtime option that controls the behavior (add_missing_from).
On Sat, 2004-10-23 at 07:35, Philip Hofstetter wrote: > <snip> It would just have been easier to find if PostgreSQL actually had > told me so (I'm not getting those NOTICEs from PHP...). As far as I can tell, Apache or PHP snarfs up all the messages that postgres generates before they can get to the postgres log. In order to see them, these are my entries from php.ini: 1. error_reporting = E_ALL 2. display_errors = Off 3. log_errors = On 4. log_errors_max_len = 2048 In english: 1. Every freakin message you see 2. don't put em on the web page 3. just my log file 4. and show me all my long queries. On my system, everything ends up in apache's error_log. HTH, \<.
Stephan Szabo wrote: > It's enabled in large part for backwards compatibility. There's a runtime > option that controls the behavior (add_missing_from). > IMHO, it would be a more natural choice to have the add_missing_from disabled by default. Why would anyone *ever* want faulty SQL being magically "patched up" by the dbms? Ok, so some older installations might break when this is changed but the option is still there. Let applications that depend on this somewhat magical behavior enable it rather than have everyone else potentially run into the same problem as Philip. Regards, Thomas Hallgren
* Thomas Hallgren <thhal@mailblocks.com> [2004-10-25 15:52:20 +0200]: > IMHO, it would be a more natural choice to have the add_missing_from > disabled by default. Why would anyone *ever* want faulty SQL being > magically "patched up" by the dbms? That assumes that developers will implement queries in their code without testing them. Unfortunately, that's probably not too far from reality. I've thought of it as a nice "debugging" feature while I'm trying to hammer out a complicated query for the first time. -- Steven Klassen - Lead Programmer Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Replication & Support Services, (503) 667-4564
On Mon, 25 Oct 2004, Thomas Hallgren wrote: > Stephan Szabo wrote: > > > It's enabled in large part for backwards compatibility. There's a > runtime > > option that controls the behavior (add_missing_from). > > > IMHO, it would be a more natural choice to have the add_missing_from > disabled by default. Why would anyone *ever* want faulty SQL being In general, when we add a backwards compatibility option, we give a couple of versions before the default is changed. In addition, until we have a form of delete which allows a "from" list, there are some queries which are really more naturally written in a form similar to add_missing_from (although "from" lists would be better). > magically "patched up" by the dbms? I think that many people do, even if they don't realize it. Pretty much almost any extension to the spec is faulty SQL, from != and use of column aliases in some places they technically aren't allowed to DISTINCT ON and UPDATE FROM.
Steven, > That assumes that developers will implement queries in their code > without testing them. Unfortunately, that's probably not too far from > reality. I've thought of it as a nice "debugging" feature while I'm > trying to hammer out a complicated query for the first time. I don't see how that makes a difference really. As a developer, I'd rather prefer if I get an explanatory error result rather than a notice (often invisible) and an incorrect result when testing. If I don't test at all (God forbid) I want the same thing to happen the first time the code is deployed. Anything else is really scary. I don't see how it can be the dbms responsibility to correct erroneous SQL ever. It's comparable to having a compiler that magically adds undeclared (or misspelled) variables in your code. Shrug... Is the variable settable in a session? If so, that would be good for the purpose you mention. Regards, Thomas Hallgren
* Thomas Hallgren <thhal@mailblocks.com> [2004-10-25 19:06:40 +0200]: > I don't see how that makes a difference really. /me notes the timestamp on his post and vows never to post before 8am again. -- Steven Klassen - Lead Programmer Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Replication & Support Services, (503) 667-4564
Stephan, > In general, when we add a backwards compatibility option, we give > a couple of versions before the default is changed. > Perhaps the 8.0 would be a perfect time since it's a change of the major number. > In addition, until we have a form of delete which allows a "from" > list, there are some queries which are really more naturally written > in a form similar to add_missing_from > (although "from" lists would be better). > Still, if the query is incorrect, I want to know about it. I don't ever want an incorrect behavior as a result of some behind the scenes magic. For me, there's no exception to that rule and my guess is that very few people would disagree if they think about it more in depth. This option helps no one. It only adds to the confusion. > I think that many people do, even if they don't realize it. > If people write incorrect SQL because "this looks like the natural way of doing it", don't you think it's fair if they find out about the error ASAP? Catching errors early in the development process is generally considered a good thing. When this option is enabled, errors might be hidden (you get the notification that not everyone will pay attention to, or even see). I consider that a very *bad* thing. It's perhaps OK that the option exists so that old legacy system can keep on running, but to have it enabled by default is not good at all. Regards, Thomas Hallgren
Thomas Hallgren wrote: > Steven, > > > That assumes that developers will implement queries in their code > > without testing them. Unfortunately, that's probably not too far from > > reality. I've thought of it as a nice "debugging" feature while I'm > > trying to hammer out a complicated query for the first time. > > I don't see how that makes a difference really. As a developer, I'd > rather prefer if I get an explanatory error result rather than a notice > (often invisible) and an incorrect result when testing. If I don't test > at all (God forbid) I want the same thing to happen the first time the > code is deployed. This particular error may be less than obvious even during crude testing because the number of tuples in the relation in question may be zero or one, so the cross-product wouldn't produce anything unusual. Mike Mascari
On Mon, 25 Oct 2004, Thomas Hallgren wrote: > Stephan, > > > In general, when we add a backwards compatibility option, we give > > a couple of versions before the default is changed. > > > Perhaps the 8.0 would be a perfect time since it's a change of the major > number. Maybe, but I think it'll be a hard sell without a replacement for the delete form that works when it's off. > > In addition, until we have a form of delete which allows a "from" > > list, there are some queries which are really more naturally written > > in a form similar to add_missing_from > > (although "from" lists would be better). > > > Still, if the query is incorrect, I want to know about it. I don't ever But, is the query incorrect? It does what PostgreSQL says it will. That's not what the spec says it'll do, but the same is true of most of the extensions, and I don't think people generally consider queries using those as incorrect.
Stephan, >>Perhaps the 8.0 would be a perfect time since it's a change of the major >>number. >> >> > >Maybe, but I think it'll be a hard sell without a replacement for the >delete form that works when it's off. > > I'm not sure I understand this. Apparently you want tables to be added to the FROM clause of a DELETE statement. Why? If you have more than one table listed in the FROM clause, how do you know which one that is subject to DELETE? Surely you're not suggesting that the DELETE should affect more than one table? If the WHERE clause that defines the criteria for deletion involves more than one table, then you'd use a sub select and that has a FROM clause of its own. Can you give me an example on what you mean? >> > In addition, until we have a form of delete which allows a "from" >> > list, there are some queries which are really more naturally written >> > in a form similar to add_missing_from >> > (although "from" lists would be better). >> > >>Still, if the query is incorrect, I want to know about it. I don't ever >> >> > >But, is the query incorrect? It does what PostgreSQL says it will. >That's not what the spec says it'll do, but the same is true of most of >the extensions, and I don't think people generally consider queries using >those as incorrect. > > I haven't seen any other extension that, when enabled, attempts to "improve" badly written SQL in a way that potentially gives incorrect query results. As I said in another post, this is like having a compiler that instead of complaining about a misspelled variable, adds a new one. I can give you an example why I think this option is bad: Assume that you work with some client software. You write your queries and you test your system. You see some notifications passing by from the server (perhaps) but you don't pay much attention to them. You get notifications all the time for other reasons. Nothing to worry about. In fact, your test system is written to trigger on errors and warnings and ignore notifications. So all your tests run fine. You ship to your customers. The customers starts adding data to tables and finds some strange behavior. It turns out that everything is caused by tables being added to the FROM clause. You didn't see the problem in your test because there, the added table had less than 2 tuples in it. As I said before, I don't object to the presence of this "option" so that people that really knows _why_ they enable it can do so, but I strongly object to having this option enabled by default. I suggest that: 1. Have this option disabled by default. 2. Print WARNING's rather than notifications when tables are added. Regards, Thomas Hallgren
On Tue, Oct 26, 2004 at 05:25:57PM +0200, Thomas Hallgren wrote: > If the WHERE clause that defines the criteria for deletion involves more > than one table, then you'd use a sub select and that has a FROM clause > of its own. Sure, that's what you could do, but it makes the query rather more complex than it needs to be. For example, consider this statement: DELETE FROM x WHERE x.a = table.a and x.b > table.b and table.c = 4; Look, "table" is not declared anywhere, but I can't think of a way to rewrite this in strict SQL. Maybe: DELETE FROM x WHERE x.id IN (SELECT x.id FROM x, table where x.a = table.a and x.b > table.b and table.c = 4); Seems like a lot of extra work for no gain. Hope id is never NULL. Maybe EXISTS would work better? > I haven't seen any other extension that, when enabled, attempts to > "improve" badly written SQL in a way that potentially gives incorrect > query results. As I said in another post, this is like having a compiler > that instead of complaining about a misspelled variable, adds a new one. transform_equals_null comes to mind. It's a hack to make 'x = NULL' work the way people coming from Oracle expect. It "fixes" it to be 'x IS NULL'. That is arguably something that could cause unexpected results. > So all your tests run fine. You ship to your customers. The customers > starts adding data to tables and finds some strange behavior. It turns > out that everything is caused by tables being added to the FROM clause. > You didn't see the problem in your test because there, the added table > had less than 2 tuples in it. It has to be exactly one tuple. If there are zero tuples you get zero output. Cross-joining with an empty table produces no output. You're shipping a product where people expect to be able to add more rows to a table, but you never test that? > As I said before, I don't object to the presence of this "option" so > that people that really knows _why_ they enable it can do so, but I > strongly object to having this option enabled by default. I suggest that: > > 1. Have this option disabled by default. > 2. Print WARNING's rather than notifications when tables are added. If you're not seeing NOTICEs now, what makes you think you'll see WARNINGs? Every DB interface I've used so far displays the notices where I can see them. This notice is one of the less useful, there are other more useful warnings which are much more handy to see... -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
On Tue, Oct 26, 2004 at 06:21:23PM +0200, Thomas Hallgren wrote: > Do you consider this overly complex? Compare: > > DELETE FROM x WHERE EXISTS (SELECT * FROM table WHERE x.a = table.a and > x.b > table.b and table.c = 4) > > to: > > DELETE FROM x, table WHERE x.a = table.a and x.b > table.b and table.c = 4 > > In the latter, what is it you are deleting? Is it x or table? I'm not at > all in favor of listing several tables in the FROM clause of a DELETE > statement (that includes implicitly adding them). The problem is that in DELETE, there is no FROM clause in the sense there is with the other commands, the FROM keyword is used for a different purpose. The FROM clause the tables are automatically added to does not have an equivalent in the original SQL statement. I'm in favour of the status quo, exactly the current default behaviour. That second example you give is confusing and should be disallowed. But no-one has come up with anything better. Do you have a better suggestion, other than forbidding the currently allowed syntax? > >Every DB interface I've used so far displays the notices > >where I can see them. This notice is one of the less useful, there > >are other more useful warnings which are much more handy to see... > > > > > Right. Useful "warnings"! Seems you agree that this should be a warning, > not a notice. Hmm, I consider a notice to be a warning anyway, something you should always read. The default log level is notice anyway, so if you're seeing warnings, you'll see the notices too... Anyway, I think the reasoning so far is, the default stays as it is until someone comes up with a non-confusing way of adding a real FROM clause to DELETEs. Requiring people upgrading to add missing tables in the FROM for SELECT and UPDATE is one thing. Asking them to rewrite every DELETE query as a subselect is a bit too far. It would be nice also because then you could then also use aliases. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
Martijn, > Do you have a better > suggestion, other than forbidding the currently allowed syntax? > Yes I do. We agree that my second example should be disallowed since the semantics of the FROM clause is different for a DELETE so the "add_missing_from" is actually not adding to a FROM clause, it is guessing the scope for the predicate. I assume the same is true for an UPDATE where there is no FROM at all. My suggestion is that we rename the add_missing_from to: update_delete_autoscope and that this option has no effect on SELECT clauses. It would be more or less harmless to have it enabled by default. Perhaps the add_missing_from should remain but then only affect the SELECT and as disabled by default. > Anyway, I think the reasoning so far is, the default stays as it is > until someone comes up with a non-confusing way of adding a real FROM > clause to DELETEs. > SQL already defines a stright forward way to do that. Consider the following PostgreSQL syntax: DELETE FROM first_table WHERE first_table.id = second_table.xid AND second_table.foo > 4 in standard SQL this would be: DELETE FROM first_table x WHERE x.id IN (SELECT y.xid FROM second_table y WHERE y.foo > 4) The number of characters is almost the same in both statements even for a very simple WHERE clause thanks to aliasing. The benefits of aliasing increases as the WHERE clause gets more complicated. For composite keys or other non key based relationships, the EXISTS clause can be used. Why confuse people with yet another syntax? Regards, Thomas Hallgren
On Wed, Oct 27, 2004 at 12:15:10AM +0200, Thomas Hallgren wrote: > Martijn, > > Do you have a better > >suggestion, other than forbidding the currently allowed syntax? > > > Yes I do. > > We agree that my second example should be disallowed since the semantics > of the FROM clause is different for a DELETE so the "add_missing_from" > is actually not adding to a FROM clause, it is guessing the scope for > the predicate. I assume the same is true for an UPDATE where there is no > FROM at all. Not true, UPDATE in PostgreSQL does allow a from clause. Observe: # \h update Command: UPDATE Description: update rows of a table Syntax: UPDATE [ ONLY ] table SET col = expression [, ...] [ FROM fromlist ] [ WHERE condition ] Perfectly reasonable addition, but not strictly SQL standard. Also, the scope is not guessed, it's totally unambiguous. I avoid the issue entirly by either never using aliases, or always using aliases, hence the issue doesn't come up, but that's me. Anyway, I think there's a confusion in the phrase "from clause". From the server's point of view, it's the list of tables the query is working with and this applies to all kinds of queries, DELETE, SELECT and UPDATE alike. Internally all those queries are processed the same, it's just what happens to the selected rows that changes. SELECT and UPDATE allow you to explicitly list tables, DELETE doesn't. The bit after FROM in a DELETE query is *not* the from clause by this definition. But I guess it comes down to to how strictly you want to follow the SQL standard. > My suggestion is that we rename the add_missing_from to: > > update_delete_autoscope > > and that this option has no effect on SELECT clauses. It would be more > or less harmless to have it enabled by default. As pointed out above, it's not needed to update. And add_missing_from currently has no effect on delete, so your suggested option appears to be merely the inverse of what is already there. > DELETE FROM first_table x > WHERE x.id IN (SELECT y.xid FROM second_table y WHERE y.foo > 4) > > The number of characters is almost the same in both statements even for > a very simple WHERE clause thanks to aliasing. The benefits of aliasing > increases as the WHERE clause gets more complicated. The SQL standard (what I can find on the web anyway) doesn't allow an alias there, and neither does PostgreSQL. Incidently, MS SQL server allows the following syntax: DELETE FROM Table1 FROM Table1 INNER JOIN Table2 ON .... The UPDATE syntax extension I mentioned above is also in MS SQL as far as I can tell (I've never personally used it). Would adding support for a from clause there make a difference to you? Ref: http://www.mvps.org/access/queries/qry0022.htm > Why confuse people with yet another syntax? Why confuse people by changing a perfectly usable syntax, that's been present for years (since the beginning I beleive) and generates NOTICEs already. The difference between NOTICEs and WARNINGs is that NOTICEs are expected, a direct consequence of the query, whereas warnings are unexpected, change each time you run the query. By that definition it clearly should be a NOTICE. Anyway, this isn't going anywhere. Neither of us is going to make any changes to the server. And the core has decided to leave it as is for the time being. Maybe after 8.0 is released it can be revisited. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
I didn't see that join syntax in the documentation for delete, thanks for pointing it out. MS SQL Server syntax for a delete is a little less confusing, IMHO. instead of DELETE FROM x WHERE x.a = table.a and x.b > table.b and table.c = 4; they have DELETE x FROM x join table on x.a = table.a and x.b > table.b and table.c = 4 the table being deleted from is listed separately, but you can still have full join syntax (including outer joins) to help with the deletion. This is similar to the current PostGreSQL update syntax, except that the table being updated is not part of the from and therefore can only be connected through an inner join, not an outer join. Thank You Sim Zacks IT Manager CompuLab 04-829-0145 - Office 04-832-5251 - Fax ________________________________________________________________________________ On Tue, Oct 26, 2004 at 06:21:23PM +0200, Thomas Hallgren wrote: > Do you consider this overly complex? Compare: > > DELETE FROM x WHERE EXISTS (SELECT * FROM table WHERE x.a = table.a and > x.b > table.b and table.c = 4) > > to: > > DELETE FROM x, table WHERE x.a = table.a and x.b > table.b and table.c = 4 > > In the latter, what is it you are deleting? Is it x or table? I'm not at > all in favor of listing several tables in the FROM clause of a DELETE > statement (that includes implicitly adding them). The problem is that in DELETE, there is no FROM clause in the sense there is with the other commands, the FROM keyword is used for a different purpose. The FROM clause the tables are automatically added to does not have an equivalent in the original SQL statement. I'm in favour of the status quo, exactly the current default behaviour. That second example you give is confusing and should be disallowed. But no-one has come up with anything better. Do you have a better suggestion, other than forbidding the currently allowed syntax? > >Every DB interface I've used so far displays the notices > >where I can see them. This notice is one of the less useful, there > >are other more useful warnings which are much more handy to see... > > > > > Right. Useful "warnings"! Seems you agree that this should be a warning, > not a notice. Hmm, I consider a notice to be a warning anyway, something you should always read. The default log level is notice anyway, so if you're seeing warnings, you'll see the notices too... Anyway, I think the reasoning so far is, the default stays as it is until someone comes up with a non-confusing way of adding a real FROM clause to DELETEs. Requiring people upgrading to add missing tables in the FROM for SELECT and UPDATE is one thing. Asking them to rewrite every DELETE query as a subselect is a bit too far. It would be nice also because then you could then also use aliases. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Martijn, I realize that the change I'm proposing might be too complex to be added in the upcoming 8.0 release. I do find this discussion interesting though, so please bear with me while I try to tie up some loose ends. >UPDATE [ ONLY ] table SET col = expression [, ...] > [ FROM fromlist ] > [ WHERE condition ] > >Perfectly reasonable addition, but not strictly SQL standard. Also, the >scope is not guessed, it's totally unambiguous. > Ok, bad choice of words. It's not guessed, and I agree, this is perfectly reasonable. >Anyway, I think there's a confusion in the phrase "from clause". > There's no confusion. I fully understand the differences. That's why think that the term 'add_missing_from' is misleading. From a strict syntax point of view it implies expansion to the statement we both agreed should be disallowed. The fact that it doesn't actually add a missing from but rather expands the scope for the predicate is somewhat confusing. Hence my suggestion that the variable is renamed. >But I guess it comes down to to how strictly you want to follow the SQL >standard. > > I think it's OK to deviate from the standard and add features. My whole argument in this thread is based on the fact that PostgreSQL adds tables to the FROM clause of a SELECT which may produce incorrect results and that this "magic" is performed by default. >>My suggestion is that we rename the add_missing_from to: >> >>update_delete_autoscope >> >>and that this option has no effect on SELECT clauses. It would be more >>or less harmless to have it enabled by default. >> >> >As pointed out above, it's not needed to update. And add_missing_from >currently has no effect on delete, so your suggested option appears to >be merely the inverse of what is already there. > > What I was trying to say is that: a) since the 'add_missing_from' affects the predicate scope for DELETE's, UPDATE's, and SELECT's, and since those statements have different ways of expressing this scope, the current choice of name is a bit confusing and b) it would be nice if the variable affected DELETE and UPDATE scopes only. Now you point out that an UPDATE can have a FROM clause, so let me revise my suggestion and instead say: 1. Let's add a variable named "autoscope_for_delete" that is enabled by default and only affects the scope of a DELETE predicate. We do this to maintain backward compatibility. 2. Let's change so that "add_missing_from" is disabled by default and doesn't affect the DELETE statement at all. 3. The "autoscope_for_delete" will use generate notices and "add_missing_from" will generate warnings. >>DELETE FROM first_table x >> WHERE x.id IN (SELECT y.xid FROM second_table y WHERE y.foo > 4) >> >> >The SQL standard (what I can find on the web anyway) doesn't allow an >alias there, and neither does PostgreSQL. > The SQL 2003 draft I have states: <delete statement: searched> ::= DELETE FROM <target table> [ [ AS ] <correlation name> ] [ WHERE <search condition> ] whereas SQL 3 is more elaborated: <table reference> ::= <table name> [ [ AS ] <correlation name> [ <left paren> <derived column list> <right paren> ] ] | <derived table> [ AS ] <correlation name> [ <left paren> <derived column list> <right paren> ] | <joined table> <delete statement: searched> ::= DELETE FROM <table reference> [ WHERE <search condition> ] Perhaps PostgreSQL should adopt this? >Incidently, MS SQL server allows the following syntax: > >DELETE FROM Table1 FROM Table1 INNER JOIN Table2 ON .... > >The UPDATE syntax extension I mentioned above is also in MS SQL as far >as I can tell (I've never personally used it). Would adding support for >a from clause there make a difference to you? > > I'm happy as long as the 'add_missing_from' is disabled or changed so that it doesn't affect SELECT. And yes, this extension looks good. Perhaps consider changing the second FROM to USING (mimicking MySQL instead of MS SQL server). I think it would lessen the risk of introducing ambiguities in the parser (and it looks better than repeated FROM's). Regards, Thomas Hallgren
On Wed, Oct 27, 2004 at 22:10:05 +0200, > 2. Let's change so that "add_missing_from" is disabled by default and > doesn't affect the DELETE statement at all. That is supposed to happen. My memory was that 8.0 was the release that the default was going to change, but if not then it should be 8.1. I don't see any great reason to change the name at this point. That is going to just cause more problems.
Bruno Wolff III <bruno@wolff.to> writes: > On Wed, Oct 27, 2004 at 22:10:05 +0200, >> 2. Let's change so that "add_missing_from" is disabled by default and >> doesn't affect the DELETE statement at all. > That is supposed to happen. My memory was that 8.0 was the release that > the default was going to change, but if not then it should be 8.1. add_missing_from was only added in 7.4; the default behavior goes all the way back because we inherited it from PostQUEL. It's probably premature to flip the factory default after only one release cycle, especially given the DELETE deficiency. A reasonable position is to flip the default one release cycle after we fix the DELETE syntax. It is interesting that SQL2003 allows an alias on the UPDATE or DELETE target table; that was definitely not there in SQL99 or earlier. We'll want to add that, for sure, but it is just a notational convenience. There are several threads in the archives about how to fix the DELETE syntax, but I don't think we ever really got consensus on what keyword to use to introduce the auxiliary FROM clause. regards, tom lane
Martijn van Oosterhout wrote: >Sure, that's what you could do, but it makes the query rather more >complex than it needs to be. > > Do you consider this overly complex? Compare: DELETE FROM x WHERE EXISTS (SELECT * FROM table WHERE x.a = table.a and x.b > table.b and table.c = 4) to: DELETE FROM x, table WHERE x.a = table.a and x.b > table.b and table.c = 4 In the latter, what is it you are deleting? Is it x or table? I'm not at all in favor of listing several tables in the FROM clause of a DELETE statement (that includes implicitly adding them). >transform_equals_null comes to mind. It's a hack to make 'x = NULL' >work the way people coming from Oracle expect. It "fixes" it to be 'x >IS NULL'. > >That is arguably something that could cause unexpected results. > > I assume you mean transform_null_equals. If so, you just made my point. It's disabled by default. Probably for the reason you mention. >It has to be exactly one tuple. If there are zero tuples you get zero >output. Cross-joining with an empty table produces no output. You're >shipping a product where people expect to be able to add more rows to a >table, but you never test that? > > So how is this relevant to the argument? This is not about the capabilities of an imaginary test framework. It was just an example! >>As I said before, I don't object to the presence of this "option" so >>that people that really knows _why_ they enable it can do so, but I >>strongly object to having this option enabled by default. I suggest that: >> >>1. Have this option disabled by default. >>2. Print WARNING's rather than notifications when tables are added. >> >> > >If you're not seeing NOTICEs now, what makes you think you'll see >WARNINGs? > It's not totally uncommon for a test framework to trigger on warnings and errors (for obvious reasons). My imaginary test actually did just that (as stated). >Every DB interface I've used so far displays the notices >where I can see them. This notice is one of the less useful, there >are other more useful warnings which are much more handy to see... > > Right. Useful "warnings"! Seems you agree that this should be a warning, not a notice. Regards, Thomas Hallgren
Martijn, I realize that the change I'm proposing might be too complex to be added in the upcoming 8.0 release. I do find this discussion interesting though, so please bear with me while I try to tie up some loose ends. >UPDATE [ ONLY ] table SET col = expression [, ...] > [ FROM fromlist ] > [ WHERE condition ] > >Perfectly reasonable addition, but not strictly SQL standard. Also, the >scope is not guessed, it's totally unambiguous. > Ok, bad choice of words. It's not guessed, and I agree, this is perfectly reasonable. >Anyway, I think there's a confusion in the phrase "from clause". > There's no confusion. I fully understand the differences. That's why think that the term 'add_missing_from' is misleading. From a strict syntax point of view it implies expansion to the statement we both agreed should be disallowed. The fact that it doesn't actually add a missing from but rather expands the scope for the predicate is somewhat confusing. Hence my suggestion that the variable is renamed. >But I guess it comes down to to how strictly you want to follow the SQL >standard. > > I think it's OK to deviate from the standard and add features. My whole argument in this thread is based on the fact that PostgreSQL adds tables to the FROM clause of a SELECT which may produce incorrect results and that this "magic" is performed by default. >>My suggestion is that we rename the add_missing_from to: >> >>update_delete_autoscope >> >>and that this option has no effect on SELECT clauses. It would be more >>or less harmless to have it enabled by default. >> >> >As pointed out above, it's not needed to update. And add_missing_from >currently has no effect on delete, so your suggested option appears to >be merely the inverse of what is already there. > > What I was trying to say is that: a) since the 'add_missing_from' affects the predicate scope for DELETE's, UPDATE's, and SELECT's, and since those statements have different ways of expressing this scope, the current choice of name is a bit confusing and b) it would be nice if the variable affected DELETE and UPDATE scopes only. Now you point out that an UPDATE can have a FROM clause, so let me revise my suggestion and instead say: 1. Let's add a variable named "autoscope_for_delete" that is enabled by default and only affects the scope of a DELETE predicate. We do this to maintain backward compatibility. 2. Let's change so that "add_missing_from" is disabled by default and doesn't affect the DELETE statement at all. 3. The "autoscope_for_delete" will use generate notices and "add_missing_from" will generate warnings. >>DELETE FROM first_table x >> WHERE x.id IN (SELECT y.xid FROM second_table y WHERE y.foo > 4) >> >> >The SQL standard (what I can find on the web anyway) doesn't allow an >alias there, and neither does PostgreSQL. > The SQL 2003 draft I have states: <delete statement: searched> ::= DELETE FROM <target table> [ [ AS ] <correlation name> ] [ WHERE <search condition> ] whereas SQL 3 is more elaborated: <table reference> ::= <table name> [ [ AS ] <correlation name> [ <left paren> <derived column list> <right paren> ] ] | <derived table> [ AS ] <correlation name> [ <left paren> <derived column list> <right paren> ] | <joined table> <delete statement: searched> ::= DELETE FROM <table reference> [ WHERE <search condition> ] Perhaps PostgreSQL should adopt this? >Incidently, MS SQL server allows the following syntax: > >DELETE FROM Table1 FROM Table1 INNER JOIN Table2 ON .... > >The UPDATE syntax extension I mentioned above is also in MS SQL as far >as I can tell (I've never personally used it). Would adding support for >a from clause there make a difference to you? > > I'm happy as long as the 'add_missing_from' is disabled or changed so that it doesn't affect SELECT. And yes, this extension looks good. Perhaps consider changing the second FROM to USING (mimicking MySQL instead of MS SQL server). I think it would lessen the risk of introducing ambiguities in the parser (and it looks better than repeated FROM's). Regards, Thomas Hallgren