Re: [GENERAL] why isn't this subquery wrong? - Mailing list pgsql-general

From Steve Crawford
Subject Re: [GENERAL] why isn't this subquery wrong?
Date
Msg-id CAEfWYywDgAGoq1ndCAa=x43-+O74=y9fnQQkdX2CUVf1MUexwg@mail.gmail.com
Whole thread Raw
In response to [GENERAL] why isn't this subquery wrong?  (jonathan vanasco <postgres@2xlp.com>)
List pgsql-general
On Thu, Apr 20, 2017 at 3:17 PM, jonathan vanasco <postgres@2xlp.com> wrote:

I ran into an issue while changing a database schema around.  Some queries still worked, even though I didn't expect them to.

Can anyone explain to me why the following is valid (running 9.6) ?

schema
CREATE TEMPORARY TABLE example_a__data (
foo_id INT,
bar_id INT
);
CREATE TEMPORARY TABLE example_a__rollup_source (
id int primary key,
name varchar(64),
foo_id INT,
check_bool BOOLEAN
);
CREATE TEMPORARY TABLE example_a__rollup AS
SELECT id, name, foo_id
FROM example_a__rollup_source
WHERE check_bool IS TRUE
;

query:
SELECT foo_id
FROM example_a__data
WHERE foo_id IN (SELECT bar_id FROM example_a__rollup)
;

a raw select of `SELECT bar_id FROM example_a__rollup;` will cause an error because bar_id doesn't exist

postgres doesn't raise an error because example_a__data does have a bar_id -- but example_a__rollup doesn't and there's no explicit correlation in the query.

can someone explain why this happens?  i'm guessing there is a good reason -- but I'm unfamiliar with the type of implicit join/queries this behavior is enabling.



There is no requirement in this query that bar_id be in the example_a__rollup table and since it is only in one table it is unambiguous so the server doesn't complain.

It may be explanatory to add a couple records to your example_a_rollup table:

insert into example_a__data values (3,4),(5,6);

Then run a simple select showing what the where clause would see:

SELECT,
   foo_id, 
   (SELECT bar_id FROM example_a__rollup) 
FROM
    example_a__data 
;

 foo_id | bar_id                                                                                           --------+-------- 
     3 | 
     5 |      

bar_id is null because there are no rows in example_a_rollup.

Now add a single record to example_a_rollup:

insert into example_a__rollup (id) values (10);

Rerun the query and you will get:

 foo_id | bar_id  
--------+--------
     3 |      4
     5 |      6


If you add another record to example_a__rollup and run it and you will get:

ERROR:  more than one row returned by a subquery used as an expression


Although the subquery won't work as an expression it would still work in a the where clause but I doubt it will return what you desire. Unfortunately there are lots of ways to write syntactically correct but logically flawed statements.


Cheers,
Steve

pgsql-general by date:

Previous
From: jonathan vanasco
Date:
Subject: [GENERAL] why isn't this subquery wrong?
Next
From: "David G. Johnston"
Date:
Subject: Re: [GENERAL] why isn't this subquery wrong?