Is this a bug or a feature? Column visibility in subquery from outer query - Mailing list pgsql-general

From Paolo Saul
Subject Is this a bug or a feature? Column visibility in subquery from outer query
Date
Msg-id 265789020905172214s3933d333qb9bec616b7082b4b@mail.gmail.com
Whole thread Raw
Responses Re: Is this a bug or a feature? Column visibility in subquery from outer query  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
postgres=# create table public.ps_test_x (x1 oid, x2 oid, x3 oid);
CREATE TABLE
postgres=# create table public.ps_test_y (y1 oid, y2 oid, y3 oid);
CREATE TABLE
postgres=# explain select * from public.ps_test_x where x1 in (select x1 from public.ps_test_y);
                             QUERY PLAN
---------------------------------------------------------------------
 Seq Scan on ps_test_x  (cost=0.00..28462.75 rows=885 width=12)
   Filter: (subplan)
   SubPlan
     ->  Seq Scan on ps_test_y  (cost=0.00..27.70 rows=1770 width=0)
(4 rows)

postgres=# explain select * from public.ps_test_x where x1 in (select x2 from public.ps_test_y);
                             QUERY PLAN
---------------------------------------------------------------------
 Seq Scan on ps_test_x  (cost=0.00..28462.75 rows=885 width=12)
   Filter: (subplan)
   SubPlan
     ->  Seq Scan on ps_test_y  (cost=0.00..27.70 rows=1770 width=0)
(4 rows)

postgres=# explain select * from public.ps_test_x where x1 in (select x3 from public.ps_test_y);
                             QUERY PLAN
---------------------------------------------------------------------
 Seq Scan on ps_test_x  (cost=0.00..28462.75 rows=885 width=12)
   Filter: (subplan)
   SubPlan
     ->  Seq Scan on ps_test_y  (cost=0.00..27.70 rows=1770 width=0)
(4 rows)

postgres=# explain select * from public.ps_test_x where x1 in (select x4 from public.ps_test_y);
ERROR:  column "x4" does not exist
LINE 1: ...elect * from public.ps_test_x where x1 in (select x4 from pu...
                                                             ^
postgres=# explain select * from public.ps_test_x where x1 in (select y1 from public.ps_test_y);
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Hash Join  (cost=36.62..88.66 rows=1770 width=12)
   Hash Cond: (ps_test_x.x1 = ps_test_y.y1)
   ->  Seq Scan on ps_test_x  (cost=0.00..27.70 rows=1770 width=12)
   ->  Hash  (cost=34.12..34.12 rows=200 width=4)
         ->  HashAggregate  (cost=32.12..34.12 rows=200 width=4)
               ->  Seq Scan on ps_test_y  (cost=0.00..27.70 rows=1770 width=4)
(6 rows)

I just want to point out that the sub-query is using a column from the outer query (eg. x1)  without an alias from the table in the outer query. This can lead to a confusion when, for example:

delete from table1 where foreign_id in (select foreign_id from table2)

-- ! table2 does not have the foreign_id column !

This would do a table scan on table1 and delete all its rows.
Why isn't it like:

delete from table1 where foreign_id in (select table1.foreign_id from table2)

where you must specify the outer query's table reference inside the subquery. I suspect the original intent was to use the outer query columns in some processing inside the subquery, which is valid. I'm just wondering why an explicit reference isn't required to distinguish the column. For convenience, or part of the SQL spec?

Cheers!


--Paolo Saul

pgsql-general by date:

Previous
From: John R Pierce
Date:
Subject: Re: Daylight saving time question
Next
From: Stuart Bishop
Date:
Subject: Re: [fulltext]Gin index full scan