Thread: Bug or stupidity

Bug or stupidity

From
Philip Hofstetter
Date:
Hello,

I think, I found a bug, but maybe it's just my stupidity. Granted: What
I did was an error on my part, but I still think, PostgreSQL should not
do what it does.

I've already created a simple testcase:


popscan_light=> create table a (id serial, name varchar(10), primary
key(id)) without oids;
NOTICE:  CREATE TABLE will create implicit sequence "a_id_seq" for
"serial" column "a.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "a_pkey"
for table "a"
CREATE TABLE
popscan_light=> create table b (id int4 references a (id) on delete
cascade, name2 varchar(15), primary key (id)) without oids;
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "b_pkey"
for table "b"
CREATE TABLE
popscan_light=> insert into a (name) values ('gnegg');
INSERT 0 1
popscan_light=> insert into a (name) values ('blepp');
INSERT 0 1
popscan_light=> insert into b values (1, 'gnegglink');
INSERT 0 1
popscan_light=> insert into b values (2, 'blepplink');
INSERT 0 1
popscan_light=> select a.name, b.name2 from a left join b using (id)
order by b.name2;
  name  |   name2
-------+-----------
  blepp | blepplink
  gnegg | gnegglink
(2 rows)

popscan_light=> select aliasa.name, aliasb.name2 from a aliasa left join
b aliasb using (id) order by b.name2;
NOTICE:  adding missing FROM-clause entry for table "b"
  name  |   name2
-------+-----------
  gnegg | gnegglink
  blepp | blepplink
  gnegg | gnegglink
  blepp | blepplink
(4 rows)

popscan_light=> select aliasa.name, aliasb.name2 from a aliasa left join
b aliasb using (id) order by aliasb.name2;
  name  |   name2
-------+-----------
  blepp | blepplink
  gnegg | gnegglink
(2 rows)

popscan_light=> \q
fangorn ~ $ psql --version
psql (PostgreSQL) 7.4.3
contains support for command-line editing

In the second "SELECT"-Query I've ordered the result set by the
name-column of the second table, but I have not used the alias "aliasb"
I created, but I used the full table name. I know this is not really
correct, but I'd still like to know why Postgres throws 4 results at me.

If I use the correct column in the order by clause, I get the correctly
joined result.

Looking at my second query, I think the false "order by" seems to pull
in another copy of table b joining it without a proper condition. I
don't think, this is the right thing to do.

Or ist it?

Anyone?

Philip

Re: Bug or stupidity

From
Martijn van Oosterhout
Date:
On Sat, Oct 23, 2004 at 02:17:16PM +0000, Philip Hofstetter wrote:
> Hello,
>
> I think, I found a bug, but maybe it's just my stupidity. Granted: What
> I did was an error on my part, but I still think, PostgreSQL should not
> do what it does.

... snip ...

> popscan_light=> select aliasa.name, aliasb.name2 from a aliasa left join
> b aliasb using (id) order by b.name2;
> NOTICE:  adding missing FROM-clause entry for table "b"
>  name  |   name2
> -------+-----------
>  gnegg | gnegglink
>  blepp | blepplink
>  gnegg | gnegglink
>  blepp | blepplink
> (4 rows)

See that NOTIVCE? It's telling you that it's converted your query to:

select aliasa.name, aliasb.name2 from b, a aliasa left join
b aliasb using (id) order by b.name2;

Since you now have an unconstrained join on the B table, you get twice
as many rows.

It basically comes down to, if you make an alias, you have to use the
alias. You can't use the original table name *and* the alias. The
reference to the original table is becomes another copy of the same
table.

As for what's SQL standard, I think by a strict definition your query
shouldn't be allowed at all, referencing an undefined table.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: Bug or stupidity

From
Philip Hofstetter
Date:
Hi,

Martijn van Oosterhout wrote:
>>popscan_light=> select aliasa.name, aliasb.name2 from a aliasa left join
>>b aliasb using (id) order by b.name2;
>>NOTICE:  adding missing FROM-clause entry for table "b"
>> name  |   name2
>>-------+-----------
>> gnegg | gnegglink
>> blepp | blepplink
>> gnegg | gnegglink
>> blepp | blepplink
>>(4 rows)
>
>
> See that NOTIVCE? It's telling you that it's converted your query to:

actually, I've overseen it. But then, my assumption in my mail was
correct anyway.

> select aliasa.name, aliasb.name2 from b, a aliasa left join
> b aliasb using (id) order by b.name2;

> Since you now have an unconstrained join on the B table, you get twice
> as many rows.

This is what I thought.

> As for what's SQL standard, I think by a strict definition your query
> shouldn't be allowed at all, referencing an undefined table.

This is exactly what I think too. I mean: I know I made an error in my
query. It would just have been easier to find if PostgreSQL actually had
told me so (I'm not getting those NOTICEs from PHP...).

If it's wrong, it should be disallowed, not made worse by assuming a
completely wrong thing.

Thanks for your fast response anyway.

Philip

Re: Bug or stupidity

From
Martijn van Oosterhout
Date:
On Sat, Oct 23, 2004 at 02:35:20PM +0000, Philip Hofstetter wrote:
> >As for what's SQL standard, I think by a strict definition your query
> >shouldn't be allowed at all, referencing an undefined table.
>
> This is exactly what I think too. I mean: I know I made an error in my
> query. It would just have been easier to find if PostgreSQL actually had
> told me so (I'm not getting those NOTICEs from PHP...).
>
> If it's wrong, it should be disallowed, not made worse by assuming a
> completely wrong thing.

Maybe, ofcourse, this exact same construct is used heavily in DELETEs.
Look at the syntax of the delete command:

DELETE FROM [ ONLY ] table [ WHERE condition ]

You can't declare extra tables or define aliases. Every other table
used in the query is by strict definitions "undefined". Should they all
be declared illegal too?

Perhaps you could argue that using undeclared tables is allowed for
DELETEs but not for SELECTs. But why make a distiction if you don't
need to.

Hope this helps,

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: Bug or stupidity

From
Stephan Szabo
Date:
On Sat, 23 Oct 2004, Philip Hofstetter wrote:

> > As for what's SQL standard, I think by a strict definition your query
> > shouldn't be allowed at all, referencing an undefined table.
>
> This is exactly what I think too. I mean: I know I made an error in my
> query. It would just have been easier to find if PostgreSQL actually had
> told me so (I'm not getting those NOTICEs from PHP...).
>
> If it's wrong, it should be disallowed, not made worse by assuming a
> completely wrong thing.

It's enabled in large part for backwards compatibility.  There's a runtime
option that controls the behavior (add_missing_from).


Re: Bug or stupidity

From
Karim Nassar
Date:
On Sat, 2004-10-23 at 07:35, Philip Hofstetter wrote:
> <snip> It would just have been easier to find if PostgreSQL actually had
> told me so (I'm not getting those NOTICEs from PHP...).

As far as I can tell, Apache or PHP snarfs up all the messages that
postgres generates before they can get to the postgres log.

In order to see them, these are my entries from php.ini:

1. error_reporting  =  E_ALL
2. display_errors = Off
3. log_errors = On
4. log_errors_max_len = 2048

In english:

1. Every freakin message you see
2. don't put em on the web page
3. just my log file
4. and show me all my long queries.

On my system, everything ends up in apache's error_log.

HTH,
\<.


Re: Bug or stupidity

From
Thomas Hallgren
Date:
Stephan Szabo wrote:

 > It's enabled in large part for backwards compatibility.  There's a
runtime
 > option that controls the behavior (add_missing_from).
 >
IMHO, it would be a more natural choice to have the add_missing_from
disabled by default. Why would anyone *ever* want faulty SQL being
magically "patched up" by the dbms?

Ok, so some older installations might break when this is changed but the
option is still there. Let applications that depend on this somewhat
magical behavior enable it rather than have everyone else potentially
run into the same problem as Philip.

Regards,

Thomas Hallgren

Re: Bug or stupidity

From
Steven Klassen
Date:
* Thomas Hallgren <thhal@mailblocks.com> [2004-10-25 15:52:20 +0200]:

> IMHO, it would be a more natural choice to have the add_missing_from
> disabled by default. Why would anyone *ever* want faulty SQL being
> magically "patched up" by the dbms?

That assumes that developers will implement queries in their code
without testing them. Unfortunately, that's probably not too far from
reality. I've thought of it as a nice "debugging" feature while I'm
trying to hammer out a complicated query for the first time.

--
Steven Klassen - Lead Programmer
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Replication & Support Services, (503) 667-4564

Re: Bug or stupidity

From
Stephan Szabo
Date:
On Mon, 25 Oct 2004, Thomas Hallgren wrote:

> Stephan Szabo wrote:
>
>  > It's enabled in large part for backwards compatibility.  There's a
> runtime
>  > option that controls the behavior (add_missing_from).
>  >
> IMHO, it would be a more natural choice to have the add_missing_from
> disabled by default. Why would anyone *ever* want faulty SQL being

In general, when we add a backwards compatibility option, we give a couple
of versions before the default is changed. In addition, until we have a
form of delete which allows a "from" list, there are some queries which
are really more naturally written in a form similar to add_missing_from
(although "from" lists would be better).

> magically "patched up" by the dbms?

I think that many people do, even if they don't realize it.  Pretty much
almost any extension to the spec is faulty SQL, from != and use of column
aliases in some places they technically aren't allowed to DISTINCT ON and
UPDATE FROM.

Re: Bug or stupidity

From
Thomas Hallgren
Date:
Steven,

 > That assumes that developers will implement queries in their code
 > without testing them. Unfortunately, that's probably not too far from
 > reality. I've thought of it as a nice "debugging" feature while I'm
 > trying to hammer out a complicated query for the first time.


I don't see how that makes a difference really. As a developer, I'd
rather prefer if I get an explanatory error result rather than a notice
(often invisible) and an incorrect result when testing. If I don't test
at all (God forbid) I want the same thing to happen the first time the
code is deployed. Anything else is really scary. I don't see how it can
be the dbms responsibility to correct erroneous SQL ever. It's
comparable to having a compiler that magically adds undeclared (or
misspelled) variables in your code. Shrug...

Is the variable settable in a session? If so, that would be good for the
purpose you mention.

Regards,
Thomas Hallgren

Re: Bug or stupidity

From
Steven Klassen
Date:
* Thomas Hallgren <thhal@mailblocks.com> [2004-10-25 19:06:40 +0200]:

> I don't see how that makes a difference really.

/me notes the timestamp on his post and vows never to post before 8am
again.

--
Steven Klassen - Lead Programmer
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Replication & Support Services, (503) 667-4564

Re: Bug or stupidity

From
Thomas Hallgren
Date:
Stephan,

 > In general, when we add a backwards compatibility option, we give
 > a couple of versions before the default is changed.
 >
Perhaps the 8.0 would be a perfect time since it's a change of the major
number.

 > In addition, until we have a form of delete which allows a "from"
 > list, there are some queries which are really more naturally written
 > in a form similar to add_missing_from
 > (although "from" lists would be better).
 >
Still, if the query is incorrect, I want to know about it. I don't ever
want an incorrect behavior as a result of some behind the scenes magic.
For me, there's no exception to that rule and my guess is that very few
people would disagree if they think about it more in depth. This option
helps no one. It only adds to the confusion.

 > I think that many people do, even if they don't realize it.
 >
If people write incorrect SQL because "this looks like the natural way
of doing it", don't you think it's fair if they find out about the error
ASAP? Catching errors early in the development process is generally
considered a good thing. When this option is enabled, errors might be
hidden (you get the notification that not everyone will pay attention
to, or even see). I consider that a very *bad* thing.

It's perhaps OK that the option exists so that old legacy system can
keep on running, but to have it enabled by default is not good at all.

Regards,
Thomas Hallgren

Re: Bug or stupidity

From
Mike Mascari
Date:
Thomas Hallgren wrote:
> Steven,
>
>  > That assumes that developers will implement queries in their code
>  > without testing them. Unfortunately, that's probably not too far from
>  > reality. I've thought of it as a nice "debugging" feature while I'm
>  > trying to hammer out a complicated query for the first time.
>
> I don't see how that makes a difference really. As a developer, I'd
> rather prefer if I get an explanatory error result rather than a notice
> (often invisible) and an incorrect result when testing. If I don't test
> at all (God forbid) I want the same thing to happen the first time the
> code is deployed.

This particular error may be less than obvious even during crude testing
because the number of tuples in the relation in question may be zero or
one, so the cross-product wouldn't produce anything unusual.

Mike Mascari

Re: Bug or stupidity

From
Stephan Szabo
Date:
On Mon, 25 Oct 2004, Thomas Hallgren wrote:

> Stephan,
>
>  > In general, when we add a backwards compatibility option, we give
>  > a couple of versions before the default is changed.
>  >
> Perhaps the 8.0 would be a perfect time since it's a change of the major
> number.

Maybe, but I think it'll be a hard sell without a replacement for the
delete form that works when it's off.

>  > In addition, until we have a form of delete which allows a "from"
>  > list, there are some queries which are really more naturally written
>  > in a form similar to add_missing_from
>  > (although "from" lists would be better).
>  >
> Still, if the query is incorrect, I want to know about it. I don't ever

But, is the query incorrect?  It does what PostgreSQL says it will.
That's not what the spec says it'll do, but the same is true of most of
the extensions, and I don't think people generally consider queries using
those as incorrect.

Re: Bug or stupidity

From
Thomas Hallgren
Date:
Stephan,

>>Perhaps the 8.0 would be a perfect time since it's a change of the major
>>number.
>>
>>
>
>Maybe, but I think it'll be a hard sell without a replacement for the
>delete form that works when it's off.
>
>
I'm not sure I understand this. Apparently you want tables to be added
to the FROM clause of a DELETE statement. Why? If you have more than one
table listed in the FROM clause, how do you know which one that is
subject to DELETE? Surely you're not suggesting that the DELETE should
affect more than one table?

If the WHERE clause that defines the criteria for deletion involves more
than one table, then you'd use a sub select and that has a FROM clause
of its own.

Can you give me an example on what you mean?

>> > In addition, until we have a form of delete which allows a "from"
>> > list, there are some queries which are really more naturally written
>> > in a form similar to add_missing_from
>> > (although "from" lists would be better).
>> >
>>Still, if the query is incorrect, I want to know about it. I don't ever
>>
>>
>
>But, is the query incorrect?  It does what PostgreSQL says it will.
>That's not what the spec says it'll do, but the same is true of most of
>the extensions, and I don't think people generally consider queries using
>those as incorrect.
>
>
I haven't seen any other extension that, when enabled, attempts to
"improve" badly written SQL in a way that potentially gives incorrect
query results. As I said in another post, this is like having a compiler
that instead of complaining about a misspelled variable, adds a new one.

I can give you an example why I think this option is bad:
Assume that you work with some client software. You write your queries
and you test your system. You see some notifications passing by from the
server (perhaps) but you don't pay much attention to them. You get
notifications all the time for other reasons. Nothing to worry about. In
fact, your test system is written to trigger on errors and warnings and
ignore notifications.

So all your tests run fine. You ship to your customers. The customers
starts adding data to tables and finds some strange behavior. It turns
out that everything is caused by tables being added to the FROM clause.
You didn't see the problem in your test because there, the added table
had less than 2 tuples in it.

As I said before, I don't object to the presence of this "option" so
that people that really knows _why_ they enable it can do so, but I
strongly object to having this option enabled by default. I suggest that:

1. Have this option disabled by default.
2. Print WARNING's rather than notifications when tables are added.

Regards,
Thomas Hallgren



Re: Bug or stupidity

From
Martijn van Oosterhout
Date:
On Tue, Oct 26, 2004 at 05:25:57PM +0200, Thomas Hallgren wrote:
> If the WHERE clause that defines the criteria for deletion involves more
> than one table, then you'd use a sub select and that has a FROM clause
> of its own.

Sure, that's what you could do, but it makes the query rather more
complex than it needs to be. For example, consider this statement:

DELETE FROM x WHERE x.a = table.a and x.b > table.b and table.c = 4;

Look, "table" is not declared anywhere, but I can't think of a way to
rewrite this in strict SQL. Maybe:

DELETE FROM x WHERE x.id IN
 (SELECT x.id FROM x, table where x.a = table.a and x.b > table.b and table.c = 4);

Seems like a lot of extra work for no gain. Hope id is never NULL.
Maybe EXISTS would work better?

> I haven't seen any other extension that, when enabled, attempts to
> "improve" badly written SQL in a way that potentially gives incorrect
> query results. As I said in another post, this is like having a compiler
> that instead of complaining about a misspelled variable, adds a new one.

transform_equals_null comes to mind. It's a hack to make 'x = NULL'
work the way people coming from Oracle expect. It "fixes" it to be 'x
IS NULL'.

That is arguably something that could cause unexpected results.

> So all your tests run fine. You ship to your customers. The customers
> starts adding data to tables and finds some strange behavior. It turns
> out that everything is caused by tables being added to the FROM clause.
> You didn't see the problem in your test because there, the added table
> had less than 2 tuples in it.

It has to be exactly one tuple. If there are zero tuples you get zero
output. Cross-joining with an empty table produces no output. You're
shipping a product where people expect to be able to add more rows to a
table, but you never test that?

> As I said before, I don't object to the presence of this "option" so
> that people that really knows _why_ they enable it can do so, but I
> strongly object to having this option enabled by default. I suggest that:
>
> 1. Have this option disabled by default.
> 2. Print WARNING's rather than notifications when tables are added.

If you're not seeing NOTICEs now, what makes you think you'll see
WARNINGs? Every DB interface I've used so far displays the notices
where I can see them. This notice is one of the less useful, there
are other more useful warnings which are much more handy to see...
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: Bug or stupidity

From
Martijn van Oosterhout
Date:
On Tue, Oct 26, 2004 at 06:21:23PM +0200, Thomas Hallgren wrote:
> Do you consider this overly complex? Compare:
>
> DELETE FROM x WHERE EXISTS (SELECT * FROM table WHERE x.a = table.a and
> x.b > table.b and table.c = 4)
>
> to:
>
> DELETE FROM x, table WHERE x.a = table.a and x.b > table.b and table.c = 4
>
> In the latter, what is it you are deleting? Is it x or table? I'm not at
> all in favor of listing several tables in the FROM clause of a DELETE
> statement (that includes implicitly adding them).

The problem is that in DELETE, there is no FROM clause in the sense
there is with the other commands, the FROM keyword is used for a
different purpose. The FROM clause the tables are automatically added
to does not have an equivalent in the original SQL statement.

I'm in favour of the status quo, exactly the current default behaviour.
That second example you give is confusing and should be disallowed. But
no-one has come up with anything better. Do you have a better
suggestion, other than forbidding the currently allowed syntax?

> >Every DB interface I've used so far displays the notices
> >where I can see them. This notice is one of the less useful, there
> >are other more useful warnings which are much more handy to see...
> >
> >
> Right. Useful "warnings"! Seems you agree that this should be a warning,
> not a notice.

Hmm, I consider a notice to be a warning anyway, something you should
always read. The default log level is notice anyway, so if you're
seeing warnings, you'll see the notices too...

Anyway, I think the reasoning so far is, the default stays as it is
until someone comes up with a non-confusing way of adding a real FROM
clause to DELETEs. Requiring people upgrading to add missing tables in
the FROM for SELECT and UPDATE is one thing. Asking them to rewrite
every DELETE query as a subselect is a bit too far. It would be nice
also because then you could then also use aliases.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: Bug or stupidity

From
Thomas Hallgren
Date:
Martijn,
 > Do you have a better
> suggestion, other than forbidding the currently allowed syntax?
>
Yes I do.

We agree that my second example should be disallowed since the semantics
of the FROM clause is different for a DELETE so the "add_missing_from"
is actually not adding to a FROM clause, it is guessing the scope for
the predicate. I assume the same is true for an UPDATE where there is no
FROM at all.

My suggestion is that we rename the add_missing_from to:

update_delete_autoscope

and that this option has no effect on SELECT clauses. It would be more
or less harmless to have it enabled by default.

Perhaps the add_missing_from should remain but then only affect the
SELECT and as disabled by default.

> Anyway, I think the reasoning so far is, the default stays as it is
> until someone comes up with a non-confusing way of adding a real FROM
> clause to DELETEs.
 >
SQL already defines a stright forward way to do that. Consider the
following PostgreSQL syntax:

DELETE FROM first_table
   WHERE first_table.id = second_table.xid AND second_table.foo > 4

in standard SQL this would be:

DELETE FROM first_table x
   WHERE x.id IN (SELECT y.xid FROM second_table y WHERE y.foo > 4)

The number of characters is almost the same in both statements even for
a very simple WHERE clause thanks to aliasing. The benefits of aliasing
increases as the WHERE clause gets more complicated.

For composite keys or other non key based relationships, the EXISTS
clause can be used.

Why confuse people with yet another syntax?

Regards,
Thomas Hallgren

Re: Bug or stupidity

From
Martijn van Oosterhout
Date:
On Wed, Oct 27, 2004 at 12:15:10AM +0200, Thomas Hallgren wrote:
> Martijn,
> > Do you have a better
> >suggestion, other than forbidding the currently allowed syntax?
> >
> Yes I do.
>
> We agree that my second example should be disallowed since the semantics
> of the FROM clause is different for a DELETE so the "add_missing_from"
> is actually not adding to a FROM clause, it is guessing the scope for
> the predicate. I assume the same is true for an UPDATE where there is no
> FROM at all.

Not true, UPDATE in PostgreSQL does allow a from clause. Observe:

# \h update
Command:     UPDATE
Description: update rows of a table
Syntax:
UPDATE [ ONLY ] table SET col = expression [, ...]
    [ FROM fromlist ]
    [ WHERE condition ]

Perfectly reasonable addition, but not strictly SQL standard. Also, the
scope is not guessed, it's totally unambiguous. I avoid the issue
entirly by either never using aliases, or always using aliases, hence
the issue doesn't come up, but that's me.

Anyway, I think there's a confusion in the phrase "from clause". From
the server's point of view, it's the list of tables the query is
working with and this applies to all kinds of queries, DELETE, SELECT
and UPDATE alike. Internally all those queries are processed the same,
it's just what happens to the selected rows that changes. SELECT and
UPDATE allow you to explicitly list tables, DELETE doesn't. The bit
after FROM in a DELETE query is *not* the from clause by this
definition.

But I guess it comes down to to how strictly you want to follow the SQL
standard.

> My suggestion is that we rename the add_missing_from to:
>
> update_delete_autoscope
>
> and that this option has no effect on SELECT clauses. It would be more
> or less harmless to have it enabled by default.

As pointed out above, it's not needed to update. And add_missing_from
currently has no effect on delete, so your suggested option appears to
be merely the inverse of what is already there.

> DELETE FROM first_table x
>   WHERE x.id IN (SELECT y.xid FROM second_table y WHERE y.foo > 4)
>
> The number of characters is almost the same in both statements even for
> a very simple WHERE clause thanks to aliasing. The benefits of aliasing
> increases as the WHERE clause gets more complicated.

The SQL standard (what I can find on the web anyway) doesn't allow an
alias there, and neither does PostgreSQL. Incidently, MS SQL server
allows the following syntax:

DELETE FROM Table1 FROM Table1 INNER JOIN Table2 ON ....

The UPDATE syntax extension I mentioned above is also in MS SQL as far
as I can tell (I've never personally used it). Would adding support for
a from clause there make a difference to you?

Ref: http://www.mvps.org/access/queries/qry0022.htm

> Why confuse people with yet another syntax?

Why confuse people by changing a perfectly usable syntax, that's been
present for years (since the beginning I beleive) and generates NOTICEs
already. The difference between NOTICEs and WARNINGs is that NOTICEs
are expected, a direct consequence of the query, whereas warnings are
unexpected, change each time you run the query. By that definition it
clearly should be a NOTICE.

Anyway, this isn't going anywhere. Neither of us is going to make any
changes to the server. And the core has decided to leave it as is for
the time being. Maybe after 8.0 is released it can be revisited.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: Bug or stupidity

From
Sim Zacks
Date:
I didn't see that join syntax in the documentation for delete, thanks
for pointing it out.

MS SQL Server syntax for a delete is a little less confusing, IMHO.

instead of DELETE FROM x WHERE x.a = table.a and x.b > table.b and table.c = 4;
they have DELETE x FROM x join table on x.a = table.a and x.b > table.b and table.c = 4

the table being deleted from is listed  separately, but you can still
have full join syntax (including outer joins) to help with the
deletion.

This is similar to the current PostGreSQL update syntax, except that
the table being updated is not part of the from and therefore can only
be connected through an inner join, not an outer join.

Thank You
Sim Zacks
IT Manager
CompuLab
04-829-0145 - Office
04-832-5251 - Fax

________________________________________________________________________________

On Tue, Oct 26, 2004 at 06:21:23PM +0200, Thomas Hallgren wrote:
> Do you consider this overly complex? Compare:
>
> DELETE FROM x WHERE EXISTS (SELECT * FROM table WHERE x.a = table.a and
> x.b > table.b and table.c = 4)
>
> to:
>
> DELETE FROM x, table WHERE x.a = table.a and x.b > table.b and table.c = 4
>
> In the latter, what is it you are deleting? Is it x or table? I'm not at
> all in favor of listing several tables in the FROM clause of a DELETE
> statement (that includes implicitly adding them).

The problem is that in DELETE, there is no FROM clause in the sense
there is with the other commands, the FROM keyword is used for a
different purpose. The FROM clause the tables are automatically added
to does not have an equivalent in the original SQL statement.

I'm in favour of the status quo, exactly the current default behaviour.
That second example you give is confusing and should be disallowed. But
no-one has come up with anything better. Do you have a better
suggestion, other than forbidding the currently allowed syntax?

> >Every DB interface I've used so far displays the notices
> >where I can see them. This notice is one of the less useful, there
> >are other more useful warnings which are much more handy to see...
> >
> >
> Right. Useful "warnings"! Seems you agree that this should be a warning,
> not a notice.

Hmm, I consider a notice to be a warning anyway, something you should
always read. The default log level is notice anyway, so if you're
seeing warnings, you'll see the notices too...

Anyway, I think the reasoning so far is, the default stays as it is
until someone comes up with a non-confusing way of adding a real FROM
clause to DELETEs. Requiring people upgrading to add missing tables in
the FROM for SELECT and UPDATE is one thing. Asking them to rewrite
every DELETE query as a subselect is a bit too far. It would be nice
also because then you could then also use aliases.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


Re: Bug or stupidity

From
Thomas Hallgren
Date:
Martijn,
I realize that the change I'm proposing might be too complex to be added
in the upcoming 8.0 release. I do find this discussion interesting
though, so please bear with me while I try to tie up some loose ends.

>UPDATE [ ONLY ] table SET col = expression [, ...]
>    [ FROM fromlist ]
>    [ WHERE condition ]
>
>Perfectly reasonable addition, but not strictly SQL standard. Also, the
>scope is not guessed, it's totally unambiguous.
>
Ok, bad choice of words. It's not guessed, and I agree, this is
perfectly reasonable.

>Anyway, I think there's a confusion in the phrase "from clause".
>
There's no confusion. I fully understand the differences. That's why
think that the term 'add_missing_from' is misleading. From a strict
syntax point of view it implies expansion to the statement we both
agreed should be disallowed. The fact that it doesn't actually add a
missing from but rather expands the scope for the predicate is somewhat
confusing. Hence my suggestion that the variable is renamed.

>But I guess it comes down to to how strictly you want to follow the SQL
>standard.
>
>
I think it's OK to deviate from the standard and add features. My whole
argument in this thread is based on the fact that PostgreSQL adds tables
to the FROM clause of a SELECT which may produce incorrect results and
that this "magic" is performed by default.

>>My suggestion is that we rename the add_missing_from to:
>>
>>update_delete_autoscope
>>
>>and that this option has no effect on SELECT clauses. It would be more
>>or less harmless to have it enabled by default.
>>
>>
>As pointed out above, it's not needed to update. And add_missing_from
>currently has no effect on delete, so your suggested option appears to
>be merely the inverse of what is already there.
>
>
What I was trying to say is that: a) since the 'add_missing_from'
affects the predicate scope for DELETE's, UPDATE's, and SELECT's, and
since those statements have different ways of expressing this scope, the
current choice of name is a bit confusing and b) it would be nice if the
variable affected DELETE and UPDATE scopes only. Now you point out that
an UPDATE can have a FROM clause, so let me revise my suggestion and
instead say:

1. Let's add a variable named "autoscope_for_delete" that is enabled by
default and only affects the scope of a DELETE predicate. We do this to
maintain backward compatibility.
2. Let's change so that "add_missing_from" is disabled by default and
doesn't affect the DELETE statement at all.
3. The "autoscope_for_delete" will use generate notices and
"add_missing_from" will generate warnings.

>>DELETE FROM first_table x
>>  WHERE x.id IN (SELECT y.xid FROM second_table y WHERE y.foo > 4)
>>
>>
>The SQL standard (what I can find on the web anyway) doesn't allow an
>alias there, and neither does PostgreSQL.
>
The SQL 2003 draft I have states:

<delete statement: searched> ::=
    DELETE FROM <target table> [ [ AS ] <correlation name> ]
        [ WHERE <search condition> ]

whereas SQL 3 is more elaborated:

<table reference> ::=
      <table name> [ [ AS ] <correlation name>
          [ <left paren> <derived column list> <right paren> ] ]
    | <derived table> [ AS ] <correlation name>
          [ <left paren> <derived column list> <right paren> ]
    | <joined table>

<delete statement: searched> ::=
    DELETE FROM <table reference>
      [ WHERE <search condition> ]

Perhaps PostgreSQL should adopt this?

>Incidently, MS SQL server allows the following syntax:
>
>DELETE FROM Table1 FROM Table1 INNER JOIN Table2 ON ....
>
>The UPDATE syntax extension I mentioned above is also in MS SQL as far
>as I can tell (I've never personally used it). Would adding support for
>a from clause there make a difference to you?
>
>
I'm happy as long as the 'add_missing_from' is disabled or changed so
that it doesn't affect SELECT. And yes, this extension looks good.
Perhaps consider changing the second FROM to USING (mimicking MySQL
instead of MS SQL server). I think it would lessen the risk of
introducing ambiguities in the parser (and it looks better than repeated
FROM's).

Regards,
Thomas Hallgren


Re: Bug or stupidity

From
Bruno Wolff III
Date:
On Wed, Oct 27, 2004 at 22:10:05 +0200,
> 2. Let's change so that "add_missing_from" is disabled by default and
> doesn't affect the DELETE statement at all.

That is supposed to happen. My memory was that 8.0 was the release that
the default was going to change, but if not then it should be 8.1.

I don't see any great reason to change the name at this point. That is
going to just cause more problems.

Re: Bug or stupidity

From
Tom Lane
Date:
Bruno Wolff III <bruno@wolff.to> writes:
> On Wed, Oct 27, 2004 at 22:10:05 +0200,
>> 2. Let's change so that "add_missing_from" is disabled by default and
>> doesn't affect the DELETE statement at all.

> That is supposed to happen. My memory was that 8.0 was the release that
> the default was going to change, but if not then it should be 8.1.

add_missing_from was only added in 7.4; the default behavior goes all
the way back because we inherited it from PostQUEL.  It's probably
premature to flip the factory default after only one release cycle,
especially given the DELETE deficiency.  A reasonable position is to
flip the default one release cycle after we fix the DELETE syntax.

It is interesting that SQL2003 allows an alias on the UPDATE or DELETE
target table; that was definitely not there in SQL99 or earlier.  We'll
want to add that, for sure, but it is just a notational convenience.

There are several threads in the archives about how to fix the DELETE
syntax, but I don't think we ever really got consensus on what keyword
to use to introduce the auxiliary FROM clause.

            regards, tom lane

Re: Bug or stupidity

From
Thomas Hallgren
Date:
Martijn van Oosterhout wrote:

>Sure, that's what you could do, but it makes the query rather more
>complex than it needs to be.
>
>
Do you consider this overly complex? Compare:

DELETE FROM x WHERE EXISTS (SELECT * FROM table WHERE x.a = table.a and
x.b > table.b and table.c = 4)

to:

DELETE FROM x, table WHERE x.a = table.a and x.b > table.b and table.c = 4

In the latter, what is it you are deleting? Is it x or table? I'm not at
all in favor of listing several tables in the FROM clause of a DELETE
statement (that includes implicitly adding them).

>transform_equals_null comes to mind. It's a hack to make 'x = NULL'
>work the way people coming from Oracle expect. It "fixes" it to be 'x
>IS NULL'.
>
>That is arguably something that could cause unexpected results.
>
>
I assume you mean transform_null_equals. If so, you just made my point.
It's disabled by default. Probably for the reason you mention.

>It has to be exactly one tuple. If there are zero tuples you get zero
>output. Cross-joining with an empty table produces no output. You're
>shipping a product where people expect to be able to add more rows to a
>table, but you never test that?
>
>
So how is this relevant to the argument? This is not about the
capabilities of an imaginary test framework. It was just an example!

>>As I said before, I don't object to the presence of this "option" so
>>that people that really knows _why_ they enable it can do so, but I
>>strongly object to having this option enabled by default. I suggest that:
>>
>>1. Have this option disabled by default.
>>2. Print WARNING's rather than notifications when tables are added.
>>
>>
>
>If you're not seeing NOTICEs now, what makes you think you'll see
>WARNINGs?
>
It's not totally uncommon for a test framework to trigger on warnings
and errors (for obvious reasons). My imaginary test actually did just
that (as stated).

>Every DB interface I've used so far displays the notices
>where I can see them. This notice is one of the less useful, there
>are other more useful warnings which are much more handy to see...
>
>
Right. Useful "warnings"! Seems you agree that this should be a warning,
not a notice.

Regards,
Thomas Hallgren





Re: Bug or stupidity

From
Thomas Hallgren
Date:
Martijn,
I realize that the change I'm proposing might be too complex to be added
in the upcoming 8.0 release. I do find this discussion interesting
though, so please bear with me while I try to tie up some loose ends.

>UPDATE [ ONLY ] table SET col = expression [, ...]
>    [ FROM fromlist ]
>    [ WHERE condition ]
>
>Perfectly reasonable addition, but not strictly SQL standard. Also, the
>scope is not guessed, it's totally unambiguous.
>
Ok, bad choice of words. It's not guessed, and I agree, this is
perfectly reasonable.

>Anyway, I think there's a confusion in the phrase "from clause".
>
There's no confusion. I fully understand the differences. That's why
think that the term 'add_missing_from' is misleading. From a strict
syntax point of view it implies expansion to the statement we both
agreed should be disallowed. The fact that it doesn't actually add a
missing from but rather expands the scope for the predicate is somewhat
confusing. Hence my suggestion that the variable is renamed.

>But I guess it comes down to to how strictly you want to follow the SQL
>standard.
>
>
I think it's OK to deviate from the standard and add features. My whole
argument in this thread is based on the fact that PostgreSQL adds tables
to the FROM clause of a SELECT which may produce incorrect results and
that this "magic" is performed by default.

>>My suggestion is that we rename the add_missing_from to:
>>
>>update_delete_autoscope
>>
>>and that this option has no effect on SELECT clauses. It would be more
>>or less harmless to have it enabled by default.
>>
>>
>As pointed out above, it's not needed to update. And add_missing_from
>currently has no effect on delete, so your suggested option appears to
>be merely the inverse of what is already there.
>
>
What I was trying to say is that: a) since the 'add_missing_from'
affects the predicate scope for DELETE's, UPDATE's, and SELECT's, and
since those statements have different ways of expressing this scope, the
current choice of name is a bit confusing and b) it would be nice if the
variable affected DELETE and UPDATE scopes only. Now you point out that
an UPDATE can have a FROM clause, so let me revise my suggestion and
instead say:

1. Let's add a variable named "autoscope_for_delete" that is enabled by
default and only affects the scope of a DELETE predicate. We do this to
maintain backward compatibility.
2. Let's change so that "add_missing_from" is disabled by default and
doesn't affect the DELETE statement at all.
3. The "autoscope_for_delete" will use generate notices and
"add_missing_from" will generate warnings.

>>DELETE FROM first_table x
>>  WHERE x.id IN (SELECT y.xid FROM second_table y WHERE y.foo > 4)
>>
>>
>The SQL standard (what I can find on the web anyway) doesn't allow an
>alias there, and neither does PostgreSQL.
>
The SQL 2003 draft I have states:

<delete statement: searched> ::=
    DELETE FROM <target table> [ [ AS ] <correlation name> ]
        [ WHERE <search condition> ]

whereas SQL 3 is more elaborated:

<table reference> ::=
      <table name> [ [ AS ] <correlation name>
          [ <left paren> <derived column list> <right paren> ] ]
    | <derived table> [ AS ] <correlation name>
          [ <left paren> <derived column list> <right paren> ]
    | <joined table>

<delete statement: searched> ::=
    DELETE FROM <table reference>
      [ WHERE <search condition> ]

Perhaps PostgreSQL should adopt this?

>Incidently, MS SQL server allows the following syntax:
>
>DELETE FROM Table1 FROM Table1 INNER JOIN Table2 ON ....
>
>The UPDATE syntax extension I mentioned above is also in MS SQL as far
>as I can tell (I've never personally used it). Would adding support for
>a from clause there make a difference to you?
>
>
I'm happy as long as the 'add_missing_from' is disabled or changed so
that it doesn't affect SELECT. And yes, this extension looks good.
Perhaps consider changing the second FROM to USING (mimicking MySQL
instead of MS SQL server). I think it would lessen the risk of
introducing ambiguities in the parser (and it looks better than repeated
FROM's).

Regards,
Thomas Hallgren