Thread: No error when FROM is missing in subquery

No error when FROM is missing in subquery

From

"Nikolay Samokhvalov"

Date:

18 December 2006, 19:01:51

Following query is considered as correct, no "missing from" error has
been reported (so, entire table will be updated and "on update"
triggers will be fired for every row):

update item set obj_id = obj_id
where obj_id in (select obj_id where item_point is null order by
obj_modified limit 10)

Is it a bug? If no, maybe to produce warning in such cases?

--
Best regards,
Nikolay

Re: No error when FROM is missing in subquery

From

"Nikolay Samokhvalov"

Date:

18 December 2006, 19:15:39

ok, sorry, I've realized that it's yet another example of "outer
reference", Tom will say "read any SQL book" again :-)

http://archives.postgresql.org/pgsql-bugs/2006-12/msg00115.php

On 12/19/06, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:
> Following query is considered as correct, no "missing from" error has
> been reported (so, entire table will be updated and "on update"
> triggers will be fired for every row):
>
> update item set obj_id = obj_id
> where obj_id in (select obj_id where item_point is null order by
> obj_modified limit 10)
>
> Is it a bug? If no, maybe to produce warning in such cases?
>
> --
> Best regards,
> Nikolay
>


--
Best regards,
Nikolay

Re: No error when FROM is missing in subquery

From

"Jaime Casanova"

Date:

18 December 2006, 21:19:20

> On 12/19/06, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:
> > Following query is considered as correct, no "missing from" error has
> > been reported (so, entire table will be updated and "on update"
> > triggers will be fired for every row):
> >
> > update item set obj_id = obj_id
> > where obj_id in (select obj_id where item_point is null order by
> > obj_modified limit 10)
> >
> > Is it a bug? If no, maybe to produce warning in such cases?
>
On 12/18/06, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:
> ok, sorry, I've realized that it's yet another example of "outer
> reference", Tom will say "read any SQL book" again :-)
>
> http://archives.postgresql.org/pgsql-bugs/2006-12/msg00115.php
>

not really... AFAIK, the FROM clause is mandatory per SQL... older
releases of postgres fill the missing from clause if it was easy to
determine, in recent releases it's mandatory unless you specify the
opposite in postgresql.conf with the add_missing_from parameter

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
                                       Richard Cook

Re: No error when FROM is missing in subquery

From

"Thomas H."

Date:

18 December 2006, 23:26:56

>> > Is it a bug? If no, maybe to produce warning in such cases?

oups. just thumbled over this as well when i forgot a FROM in a WHERE ... IN
(....) and damaged quite some data. the bad query went like this:

SELECT * FROM movies.names WHERE mov_id IN (SELECT DISTINCT mov_id WHERE
mov_name like '%, %' LIMIT 2)

the subselect is missing a FROM <table>. in that case, pgsql seemed to also
ignore the LIMIT 2 and returned 3706 records out of ~130000... no clue which
ones :-/

- thomas

Re: No error when FROM is missing in subquery

From

"Jaime Casanova"

Date:

18 December 2006, 23:43:39

On 12/18/06, Thomas H. <me@alternize.com> wrote:
> >> > Is it a bug? If no, maybe to produce warning in such cases?
>
> oups. just thumbled over this as well when i forgot a FROM in a WHERE ... IN
> (....) and damaged quite some data. the bad query went like this:
>
> SELECT * FROM movies.names WHERE mov_id IN (SELECT DISTINCT mov_id WHERE
> mov_name like '%, %' LIMIT 2)
>
> the subselect is missing a FROM <table>. in that case, pgsql seemed to also
> ignore the LIMIT 2 and returned 3706 records out of ~130000...

and the UPDATE was?

also the limit applies only to the subselect, it has nothing to do
with the upper query so the upper query can return more than number of
rows specified in the subselect...

> no clue which ones :-/
>

LIMIT is often meaningfull only in conjuction with ORDER BY

--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
                                       Richard Cook

Re: No error when FROM is missing in subquery

From

"Thomas H."

Date:

18 December 2006, 23:57:51

>> oups. just thumbled over this as well when i forgot a FROM in a WHERE ...
>> IN
>> (....) and damaged quite some data. the bad query went like this:
>>
>> SELECT * FROM movies.names WHERE mov_id IN (SELECT DISTINCT mov_id WHERE
>> mov_name like '%, %' LIMIT 2)
>>
>> the subselect is missing a FROM <table>. in that case, pgsql seemed to
>> also
>> ignore the LIMIT 2 and returned 3706 records out of ~130000...
>
> and the UPDATE was?

that was done by the application with the returned recordset.

> also the limit applies only to the subselect, it has nothing to do
> with the upper query so the upper query can return more than number of
> rows specified in the subselect...

IF the subquery would only have returned 2 ids, then there would be at most
like +/-10 records affected. each mov_id can hold one or more (usuals up to
5) names. but here, the subquery seemed to return ~3700 distinct mov_ids,
thus around 37000 names where damaged by the following programmatical
updates instead of only a hands full...

> LIMIT is often meaningfull only in conjuction with ORDER BY

yep but not here. all i wanted to do is to get names from 2 movies and run
an *observed* edit on them.

what did pgsql actually do with that subquery? did it return all records for
which mov_name match '%, %'?

- thomas

Re: No error when FROM is missing in subquery

From

"Jaime Casanova"

Date:

19 December 2006, 00:09:58

On 12/18/06, Thomas H. <me@alternize.com> wrote:
> >> oups. just thumbled over this as well when i forgot a FROM in a WHERE ...
> >> IN
> >> (....) and damaged quite some data. the bad query went like this:
> >>
> >> SELECT * FROM movies.names WHERE mov_id IN (SELECT DISTINCT mov_id WHERE
> >> mov_name like '%, %' LIMIT 2)
> >>
> >> the subselect is missing a FROM <table>. in that case, pgsql seemed to
> >> also
> >> ignore the LIMIT 2 and returned 3706 records out of ~130000...
> >
> > and the UPDATE was?
>
> that was done by the application with the returned recordset.
>
> > also the limit applies only to the subselect, it has nothing to do
> > with the upper query so the upper query can return more than number of
> > rows specified in the subselect...
>
> IF the subquery would only have returned 2 ids, then there would be at most
> like +/-10 records affected. each mov_id can hold one or more (usuals up to
> 5) names. but here, the subquery seemed to return ~3700 distinct mov_ids,
> thus around 37000 names where damaged by the following programmatical
> updates instead of only a hands full...
>

have you tested the query in psql?
what results do you get?


--
regards,
Jaime Casanova

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs and the universe trying
to produce bigger and better idiots.
So far, the universe is winning."
                                       Richard Cook

Re: No error when FROM is missing in subquery

From

"Thomas H."

Date:

19 December 2006, 01:05:53

>> >> SELECT * FROM movies.names WHERE mov_id IN (SELECT DISTINCT mov_id
>> >> WHERE
>> >> mov_name like '%, %' LIMIT 2)
>>
>> IF the subquery would only have returned 2 ids, then there would be at
>> most
>> like +/-10 records affected. each mov_id can hold one or more (usuals up
>> to
>> 5) names. but here, the subquery seemed to return ~3700 distinct mov_ids,
>> thus around 37000 names where damaged by the following programmatical
>> updates instead of only a hands full...
>>
>
> have you tested the query in psql?
> what results do you get?

the data is damaged so the result isn't the same... regenearting it now from
a backup.

from first tests i would say it returned records with names that match the
WHERE in the subselect. i guess what happened is: it took each record in
movies.names, then run the subquery for that record which resulted in "WHERE
mov_id IN (mov_id)" = true for records with a ', ' in the name and "WHERE
mov_id IN ()" = false for all others.

- thomas

Re: No error when FROM is missing in subquery

From

Tom Lane

Date:

19 December 2006, 01:38:48

"Thomas H." <me@alternize.com> writes:
> SELECT * FROM movies.names WHERE mov_id IN (SELECT DISTINCT mov_id WHERE
> mov_name like '%, %' LIMIT 2)

> the subselect is missing a FROM <table>. in that case, pgsql seemed to also
> ignore the LIMIT 2

It didn't "ignore" anything.  Each execution of the sub-select returned
1 row, containing the current mov_id from the outer query.  So basically
this would've selected everything passing the LIKE condition.

            regards, tom lane

Re: No error when FROM is missing in subquery

From

mike

Date:

19 December 2006, 01:40:57

Also check that the mov_id column exists in the table/view that you are
running the SELECT DISTINCT against.

Pgsql does not throw an error (at least prior to 8.2) if the column
referenced by the select statement for the IN clause does not exist.  It
will run only SELECT * FROM movies.names in this case.

Mike

On Tue, 2006-12-19 at 06:01 +0100, Thomas H. wrote:
> >> >> SELECT * FROM movies.names WHERE mov_id IN (SELECT DISTINCT mov_id
> >> >> WHERE
> >> >> mov_name like '%, %' LIMIT 2)
> >>
> >> IF the subquery would only have returned 2 ids, then there would be at
> >> most
> >> like +/-10 records affected. each mov_id can hold one or more (usuals up
> >> to
> >> 5) names. but here, the subquery seemed to return ~3700 distinct mov_ids,
> >> thus around 37000 names where damaged by the following programmatical
> >> updates instead of only a hands full...
> >>
> >
> > have you tested the query in psql?
> > what results do you get?
>
> the data is damaged so the result isn't the same... regenearting it now from
> a backup.
>
> from first tests i would say it returned records with names that match the
> WHERE in the subselect. i guess what happened is: it took each record in
> movies.names, then run the subquery for that record which resulted in "WHERE
> mov_id IN (mov_id)" = true for records with a ', ' in the name and "WHERE
> mov_id IN ()" = false for all others.
>
> - thomas
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend

Re: No error when FROM is missing in subquery

From

Tom Lane

Date:

19 December 2006, 01:48:29

mike <mike@thegodshalls.com> writes:
> Pgsql does not throw an error (at least prior to 8.2) if the column
> referenced by the select statement for the IN clause does not exist.

My, there's a lot of misinformation in this thread.

The reason there's no error thrown is that the reference to mov_id in
the sub-SELECT is a perfectly legal outer reference to the mov_id column
available from the upper SELECT.  If the column truly did not exist
anywhere in the tables used in the query, it would have thrown an error.

            regards, tom lane

Re: No error when FROM is missing in subquery

From

mike

Date:

19 December 2006, 03:13:04

You are right as usual.... My apologies yet again. I have wrongly
assumed that the lower statement would run first since it is enclosed in
parenthesis.

On Tue, 2006-12-19 at 00:48 -0500, Tom Lane wrote:
> mike <mike@thegodshalls.com> writes:
> > Pgsql does not throw an error (at least prior to 8.2) if the column
> > referenced by the select statement for the IN clause does not exist.
>
> My, there's a lot of misinformation in this thread.
>
> The reason there's no error thrown is that the reference to mov_id in
> the sub-SELECT is a perfectly legal outer reference to the mov_id column
> available from the upper SELECT.  If the column truly did not exist
> anywhere in the tables used in the query, it would have thrown an error.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

Re: No error when FROM is missing in subquery

From

"Kevin Grittner"

Date:

19 December 2006, 11:20:12

>>> On Mon, Dec 18, 2006 at 11:48 PM, in message
<15735.1166507302@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> The reason there's no error thrown is that the reference to mov_id
in
> the sub- SELECT is a perfectly legal outer reference to the mov_id
column
> available from the upper SELECT.  If the column truly did not exist
> anywhere in the tables used in the query, it would have thrown an
error.

It's easy to see how it resolves the column references; but the syntax
is still not compliant with the SQL standard.  The FROM clause is
required.  We could call it a PostgreSQL extension, but I'm curious if
anyone actually finds it useful.  So far it just seems to provide an
opportunity for error.

-Kevin

*scalar_subquery
    :   subquery
    ;

*subquery
    :   LEFT_PAREN query_expression RIGHT_PAREN
    ;

*query_expression
    :   non_join_query_expression
    |   joined_table
    ;

*non_join_query_expression
    :   non_join_query_term
    |   query_expression TK_union [ TK_all ]
        [ corresponding_spec ] query_term
    |   query_expression TK_except [ TK_all ]
        [ corresponding_spec ] query_term
    ;

*non_join_query_term
    :   non_join_query_primary
    |   query_term TK_intersect [ TK_all ]
        [ corresponding_spec ] query_primary
    ;

*non_join_query_primary
    :   simple_table
    |   LEFT_PAREN non_join_query_expression RIGHT_PAREN
    ;

*simple_table
    :   query_specification
    |   table_value_constructor
    |   explicit_table
    ;

*query_specification
    :   TK_select [ set_quantifier ] select_list table_expression
    ;

*select_list
    :   ASTERISK
    |   select_sublist [ { COMMA select_sublist }... ]
    ;

*select_sublist
    :   derived_column
    |   qualifier PERIOD ASTERISK
    ;

*derived_column
    :   value_expression [ as_clause ]
    ;

*as_clause
    :   [ TK_as ] column_name
    ;

*table_expression
    :   from_clause
        [ where_clause ]
        [ group_by_clause ]
        [ having_clause ]
    ;

*from_clause
    :   TK_from table_reference
        [ { COMMA table_reference }... ]
    ;

Re: No error when FROM is missing in subquery

From

Tom Lane

Date:

19 December 2006, 11:29:38

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> It's easy to see how it resolves the column references; but the syntax
> is still not compliant with the SQL standard.  The FROM clause is
> required.  We could call it a PostgreSQL extension, but I'm curious if
> anyone actually finds it useful.

You've never done "SELECT some_expression" ?

            regards, tom lane

Re: No error when FROM is missing in subquery

From

"Kevin Grittner"

Date:

19 December 2006, 11:38:18

>>> On Tue, Dec 19, 2006 at  9:23 AM, in message
<24047.1166541794@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> It's easy to see how it resolves the column references; but the
syntax
>> is still not compliant with the SQL standard.  The FROM clause is
>> required.  We could call it a PostgreSQL extension, but I'm curious
if
>> anyone actually finds it useful.
>
> You've never done "SELECT some_expression" ?

Not, as far as I can recall, in the ANSI SQL spec.  Did I miss
something?

I'm having trouble seeing how it is a useful construct in the context
of a scalar subquery.  A non-standard extension should be useful in some
way.

-Kevin

Re: No error when FROM is missing in subquery

From

Tom Lane

Date:

19 December 2006, 12:00:56

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> I'm having trouble seeing how it is a useful construct in the context
> of a scalar subquery.  A non-standard extension should be useful in some
> way.

There is 0 chance that we'd disallow it at the top level after allowing
it all these years.  And probably not even just top-level; consider
    select 1 union all select 2 union all select 3;
which has been the recommended workaround up to 8.2 for our lack of
multi-row VALUES lists.  We will certainly break a lot of code if we
disallow that.  So now you have to make a case why we should make a
non-orthogonal distinction between certain subqueries and other
subqueries.

As for potential usefulness, consider a set-returning function invoked
in the targetlist: it makes perfect sense to do
    WHERE foo IN (SELECT mysrf(...))
and maybe even add an ORDER BY/LIMIT to that.  Yeah, no doubt this is
easy to change to putting the SRF in FROM, but you can bet there are
Postgres applications out there today using it; we have never officially
deprecated this way of using an SRF.

I also recall having advised people in years past to use dummy
(SELECT expr) sub-selects to work around planning issues.  I don't
recall at the moment whether any of those issues are still live, but
again you can bet the construct is still in some applications' SQL.

So basically we get to introduce some arbitrary non-orthogonality, plus
break an unknown number of applications, in order to block off a type
of user error that AFAIR has never been complained of before in all the
years I've been working with Postgres.  Sorry, no sale ...

            regards, tom lane

Re: No error when FROM is missing in subquery

From

"Kevin Grittner"

Date:

19 December 2006, 12:29:26

>>> On Tue, Dec 19, 2006 at  9:58 AM, in message
<26199.1166543928@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> I'm having trouble seeing how it is a useful construct in the
context
>> of a scalar subquery.  A non- standard extension should be useful in
some
>> way.
>
> There is 0 chance that we'd disallow it at the top level after
allowing
> it all these years.

I wouldn't want to eliminate it there -- it is clearly a useful
extension to the standard at the top level.

> And probably not even just top- level; consider
>     select 1 union all select 2 union all select 3;
> which has been the recommended workaround up to 8.2 for our lack of
> multi- row VALUES lists.  We will certainly break a lot of code if
we
> disallow that.

Point taken.

> So now you have to make a case why we should make a
> non- orthogonal distinction between certain subqueries and other
> subqueries.

Well, I don't think of the terms for set operations as subqueries, and
there are other differences already in what is allowed for a query term
and a subquery.  Arguably there is more risk of error of the type
recently reported where you are in a scalar subquery context.

> As for potential usefulness, consider a set- returning function
invoked
> in the targetlist: it makes perfect sense to do
>     WHERE foo IN (SELECT mysrf(...))
> and maybe even add an ORDER BY/LIMIT to that.

That is sufficient to answer my concerns.  I tend to operate from the
context of the standard, because we have our own ANSI based parser which
generates portable Java query classes.  ORDER BY and LIMIT are not
allowed in the subqueries in the standard but are obviously useful
extensions.  The missing FROM then adds value to the other extensions.
Case closed.  Thanks.

By the way, when I read my previous message it struck me that it could
be taken with a tone I didn't intend.  That was the result of whipping
it out quickly without taking sufficient time to review it.  Sorry; no
offense was intended.  I'll try to avoid doing that again.

-Kevin