Thread: BUG #5898: Nested "in" clauses hide bad column names

BUG #5898: Nested "in" clauses hide bad column names

From

"Scott Dunbar"

Date:

22 February 2011, 13:26:28

The following bug has been logged online:

Bug reference:      5898
Logged by:          Scott Dunbar
Email address:      scott@xigole.com
PostgreSQL version: 9.0.3
Operating system:   Ubuntu 10.10
Description:        Nested "in" clauses hide bad column names
Details:

I have a nested in clause like:

select respondent_id from respondent where respondent_id in (select
respondent_id from chat_session where project_id in (select project_id from
project where company_id = 4));

However, in this example, there is no column named respondent_id in the
chat_session table.  But the query runs and, indeed, returns all of the rows
in respondent.  Since this was then part of another nested in it deleted far
more that it was supposed to.

Re: BUG #5898: Nested "in" clauses hide bad column names

From

Tom Lane

Date:

22 February 2011, 13:45:12

"Scott Dunbar" <scott@xigole.com> writes:
> I have a nested in clause like:

> select respondent_id from respondent where respondent_id in (select
> respondent_id from chat_session where project_id in (select project_id from
> project where company_id = 4));

> However, in this example, there is no column named respondent_id in the
> chat_session table.

Probably there is one in respondent, though?  This behavior is not a bug
--- what you have there is an outer reference, and it is working exactly
as specified by the SQL standard.  Sub-selects would be a whole lot less
useful if they couldn't refer to variables of the outer query.

            regards, tom lane

Re: BUG #5898: Nested "in" clauses hide bad column names

From

Scott Dunbar

Date:

22 February 2011, 14:15:55

Yes, you're correct.  I guess this makes sense but it does seem strange
that I can enter garbage in a query but it still runs.  And in my case
the output from this (the entire table) was then used in a delete
statement that toasted the entire table.  Allowing bogus SQL just seems
"wrong" but I do understand what's going on.

Thanks for your help.


On 02/22/2011 10:45 AM, Tom Lane wrote:
> "Scott Dunbar"<scott@xigole.com>  writes:
>> I have a nested in clause like:
>> select respondent_id from respondent where respondent_id in (select
>> respondent_id from chat_session where project_id in (select project_id from
>> project where company_id = 4));
>> However, in this example, there is no column named respondent_id in the
>> chat_session table.
> Probably there is one in respondent, though?  This behavior is not a bug
> --- what you have there is an outer reference, and it is working exactly
> as specified by the SQL standard.  Sub-selects would be a whole lot less
> useful if they couldn't refer to variables of the outer query.
>
>             regards, tom lane


--
Scott Dunbar
Xigole Systems, Inc.
Enterprise software consulting, development, and hosting
303·667·6343

Re: BUG #5898: Nested "in" clauses hide bad column names

From

"Kevin Grittner"

Date:

22 February 2011, 14:33:41

Scott Dunbar <scott@xigole.com> wrote:

> I guess this makes sense but it does seem strange that I can enter
> garbage in a query but it still runs.

It wasn't garbage.

> And in my case the output from this (the entire table) was then
> used in a delete statement that toasted the entire table.

I can suggest a few ways to protect yourself in such situations.

(1)  I generally run any DELETE statement against data I care about
as a SELECT first.  It's generally pretty easy to write it as a
SELECT tbl.* FROM which can be converted to DELETE FROM tbl after
reviewing what matches.

(2)  When in doubt, use BEGIN; before running the statement.  You
can review the count, run SELECTs to look at the results, etc.,
before running COMMIT; to make it "stick".

(3)  Less convenient, but sometimes useful, is to EXPLAIN your query
before actually running it.  If you do that with your delete, you'll
see the criterion applied to a table other than what you were
expecting, which might alert you to the problem.  The estimated row
count at the top level of the plan might be another red flag.

> Allowing bogus SQL just seems "wrong" but I do understand what's
> going on.

If you did you wouldn't call a well formed, unambiguous, standard-
conforming statement bogus.  It did exactly what you said; just not
what you meant.  It pays to be a bit paranoid when running ad hoc
DML in case you accidentally don't say what you mean.

-Kevin

Re: BUG #5898: Nested "in" clauses hide bad column names

From

Eric Schwarzenbach

Date:

22 February 2011, 17:50:31

Using table aliases prevents such problems by disambiguating the column
names. For example, if you had table aliases in just one part of this
query as in the below, you would have gotten an error instead of
deleting all those rows:

select respondent_id from respondent where respondent_id in (select
cs.respondent_id from chat_session cs where cs.project_id in (select project_id from
project where company_id = 4));


Cheers,

Eric



On 2/22/2011 1:07 PM, Scott Dunbar wrote:
> Yes, you're correct.  I guess this makes sense but it does seem
> strange that I can enter garbage in a query but it still runs.  And in
> my case the output from this (the entire table) was then used in a
> delete statement that toasted the entire table.  Allowing bogus SQL
> just seems "wrong" but I do understand what's going on.
>
> Thanks for your help.
>
>
> On 02/22/2011 10:45 AM, Tom Lane wrote:
>> "Scott Dunbar"<scott@xigole.com>  writes:
>>> I have a nested in clause like:
>>> select respondent_id from respondent where respondent_id in (select
>>> respondent_id from chat_session where project_id in (select project_id from
>>> project where company_id = 4));
>>> However, in this example, there is no column named respondent_id in the
>>> chat_session table.
>> Probably there is one in respondent, though?  This behavior is not a bug
>> --- what you have there is an outer reference, and it is working exactly
>> as specified by the SQL standard.  Sub-selects would be a whole lot less
>> useful if they couldn't refer to variables of the outer query.
>>
>>             regards, tom lane
>
>
> --
> Scott Dunbar
> Xigole Systems, Inc.
> Enterprise software consulting, development, and hosting
> 303Â·667Â·6343