Thread: Weird (?) problem with order of conditions in SELECT

Weird (?) problem with order of conditions in SELECT

From
"Mark, Terry"
Date:
Hello all,

I have encountered a weird problem I  can't seem to understand.  It involves
a correlated subquery, where the rows returned seem to depend upon the order
I specify my conditions.  I can't see why the order should be important
(except maybe for performance)

To demonstrate I created a toy data set and query (note, both are completely
contrived so neither really 'makes sense' beyond this context).  I present
first two queries which return different results, but differ *only in the
order I specify conditions*, along with the results of the queries.  (The
toy data set is at the very end of the email)

SELECT c.score FROM c
WHERE c.score >= (SELECT MAX(score) AS score FROM c
            WHERE a.name='nugget'
            AND a.job='programmer'
            AND a.a_id=b.a_id AND c.b_id = b.b_id)
AND a.name='nugget'
AND a.job='programmer'
AND a.a_id=b.a_id
AND c.b_id = b.b_id;

 score
-------
  2500
(1 row)

* NOTE, this query returns only one row.


SELECT c.score FROM c
WHERE a.name='nugget'
AND a.job='programmer'
AND a.a_id=b.a_id
AND c.b_id = b.b_id
AND c.score >= (SELECT MAX(score) AS score FROM c
            WHERE a.name='nugget'
            AND a.job='programmer'
            AND a.a_id=b.a_id
            AND c.b_id = b.b_id);

 score
-------
   100
  2500
(2 rows)


* NOTE, now TWO rows are returned, even though all that has happened is I've
changed the order of the conditions.

What's going on here ? Why should the order of conditions be important ??
Any insight would be greatly appreciated.

Thanks,
terry

CREATE TABLE people (name text not null PRIMARY KEY, age int);
insert into people values ('nugget', 33);
insert into people values ('lisa', 32);
insert into people values ('larry', 28);

CREATE TABLE a (a_id INT NOT NULL PRIMARY KEY, job TEXT NOT NULL, name TEXT
NOT NULL, FOREIGN KEY (name) REFERENCES people(name));
INSERT INTO a VALUES (1, 'programmer', 'nugget');
INSERT INTO a VALUES (2, 'programmer', 'lisa');
INSERT INTO a VALUES (2, 'secretary', 'lisa');
INSERT INTO a VALUES (3, 'secretary', 'lisa');
INSERT INTO a VALUES (4, 'student', 'larry');
INSERT INTO a VALUES (5, 'programmer', 'nugget');

CREATE TABLE b (b_id INT NOT NULL PRIMARY KEY, a_id INT NOT NULL, FOREIGN
KEY (a_id) REFERENCES a(a_id));
INSERT INTO b (a_id, b_id) VALUES (1,1);
INSERT INTO b (a_id, b_id) VALUES (2,2);
INSERT INTO b (a_id, b_id) VALUES (3,3);
INSERT INTO b (a_id, b_id) VALUES (4,4);
INSERT INTO b (a_id, b_id) VALUES (5,5);

CREATE TABLE c (c_id INT NOT NULL PRIMARY KEY, b_id INT NOT NULL, score INT
NOT NULL, FOREIGN KEY (b_id) REFERENCES b(b_id));
INSERT INTO c (b_id, c_id, score) VALUES (1,1,100);
INSERT INTO c (b_id, c_id, score) VALUES (2,2,400);
INSERT INTO c (b_id, c_id, score) VALUES (3,3,900);
INSERT INTO c (b_id, c_id, score) VALUES (4,4,1600);
INSERT INTO c (b_id, c_id,score) VALUES (5,5,2500);


Re: Weird (?) problem with order of conditions in SELECT

From
Tom Lane
Date:
"Mark, Terry" <tmark@amgen.com> writes:
> I have encountered a weird problem I  can't seem to understand.  It involves
> a correlated subquery, where the rows returned seem to depend upon the order
> I specify my conditions.  I can't see why the order should be important
> (except maybe for performance)

> SELECT c.score FROM c
> WHERE c.score >= (SELECT MAX(score) AS score FROM c
>             WHERE a.name='nugget'
>             AND a.job='programmer'
>             AND a.a_id=b.a_id AND c.b_id = b.b_id)
> AND a.name='nugget'
> AND a.job='programmer'
> AND a.a_id=b.a_id
> AND c.b_id = b.b_id;

> SELECT c.score FROM c
> WHERE a.name='nugget'
> AND a.job='programmer'
> AND a.a_id=b.a_id
> AND c.b_id = b.b_id
> AND c.score >= (SELECT MAX(score) AS score FROM c
>             WHERE a.name='nugget'
>             AND a.job='programmer'
>             AND a.a_id=b.a_id
>             AND c.b_id = b.b_id);

This is a little less mysterious if you run it under 7.1, because 7.1
emits some warning notices:

NOTICE:  Adding missing FROM-clause entry in subquery for table "a"
NOTICE:  Adding missing FROM-clause entry in subquery for table "b"
NOTICE:  Adding missing FROM-clause entry for table "a"
NOTICE:  Adding missing FROM-clause entry for table "b"
 score
-------
  2500
(1 row)

NOTICE:  Adding missing FROM-clause entry for table "a"
NOTICE:  Adding missing FROM-clause entry for table "b"
 score
-------
   100
  2500
(2 rows)

From this we can infer that Postgres is actually interpreting the first
query as

SELECT c.score FROM a,b,c
WHERE c.score >= (SELECT MAX(score) AS score FROM a,b,c
            WHERE a.name='nugget'
            AND a.job='programmer'
            AND a.a_id=b.a_id AND c.b_id = b.b_id)
AND a.name='nugget'
AND a.job='programmer'
AND a.a_id=b.a_id
AND c.b_id = b.b_id;

whereas the second one is being interpreted as

SELECT c.score FROM a,b,c
WHERE a.name='nugget'
AND a.job='programmer'
AND a.a_id=b.a_id
AND c.b_id = b.b_id
AND c.score >= (SELECT MAX(score) AS score FROM c
            WHERE a.name='nugget'
            AND a.job='programmer'
            AND a.a_id=b.a_id
            AND c.b_id = b.b_id);

That is, in the second case the sub-select's references to A and B are
being taken as outer references to the current A and B rows of the outer
query, whereas in the first case the sub-select is interpreted as a
completely independent query.

I am not sure which interpretation you were actually intending.

This example shows one reason why the "implicit FROM item" feature of
Postgres is confusing and has come to be deprecated: it's not always
clear which FROM list an implicit item should be added to.  We've
started to emit a warning about use of this feature in 7.1, and perhaps
someday it will be removed entirely.

            regards, tom lane

RE: Weird (?) problem with order of conditions in SELECT

From
"Mark, Terry"
Date:
Many thanks to Tom for his reply.  My intention was to have the subquery
treated as a completely independent query.

Should I understand, then, that by explicitly naming all the involved table
portions in the subequery, that the subquery is guaranteed to be treated
independently ?

terry
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, March 20, 2001 11:42 AM
To: Mark, Terry
Cc: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Weird (?) problem with order of conditions in
SELECT


"Mark, Terry" <tmark@amgen.com> writes:
> I have encountered a weird problem I  can't seem to understand.  It
involves
> a correlated subquery, where the rows returned seem to depend upon the
order
> I specify my conditions.  I can't see why the order should be important
> (except maybe for performance)

> SELECT c.score FROM c
> WHERE c.score >= (SELECT MAX(score) AS score FROM c
>             WHERE a.name='nugget'
>             AND a.job='programmer'
>             AND a.a_id=b.a_id AND c.b_id = b.b_id)
> AND a.name='nugget'
> AND a.job='programmer'
> AND a.a_id=b.a_id
> AND c.b_id = b.b_id;

> SELECT c.score FROM c
> WHERE a.name='nugget'
> AND a.job='programmer'
> AND a.a_id=b.a_id
> AND c.b_id = b.b_id
> AND c.score >= (SELECT MAX(score) AS score FROM c
>             WHERE a.name='nugget'
>             AND a.job='programmer'
>             AND a.a_id=b.a_id
>             AND c.b_id = b.b_id);

This is a little less mysterious if you run it under 7.1, because 7.1
emits some warning notices:

NOTICE:  Adding missing FROM-clause entry in subquery for table "a"
NOTICE:  Adding missing FROM-clause entry in subquery for table "b"
NOTICE:  Adding missing FROM-clause entry for table "a"
NOTICE:  Adding missing FROM-clause entry for table "b"
 score
-------
  2500
(1 row)

NOTICE:  Adding missing FROM-clause entry for table "a"
NOTICE:  Adding missing FROM-clause entry for table "b"
 score
-------
   100
  2500
(2 rows)

From this we can infer that Postgres is actually interpreting the first
query as

SELECT c.score FROM a,b,c
WHERE c.score >= (SELECT MAX(score) AS score FROM a,b,c
            WHERE a.name='nugget'
            AND a.job='programmer'
            AND a.a_id=b.a_id AND c.b_id = b.b_id)
AND a.name='nugget'
AND a.job='programmer'
AND a.a_id=b.a_id
AND c.b_id = b.b_id;

whereas the second one is being interpreted as

SELECT c.score FROM a,b,c
WHERE a.name='nugget'
AND a.job='programmer'
AND a.a_id=b.a_id
AND c.b_id = b.b_id
AND c.score >= (SELECT MAX(score) AS score FROM c
            WHERE a.name='nugget'
            AND a.job='programmer'
            AND a.a_id=b.a_id
            AND c.b_id = b.b_id);

That is, in the second case the sub-select's references to A and B are
being taken as outer references to the current A and B rows of the outer
query, whereas in the first case the sub-select is interpreted as a
completely independent query.

I am not sure which interpretation you were actually intending.

This example shows one reason why the "implicit FROM item" feature of
Postgres is confusing and has come to be deprecated: it's not always
clear which FROM list an implicit item should be added to.  We've
started to emit a warning about use of this feature in 7.1, and perhaps
someday it will be removed entirely.

            regards, tom lane