Hi!
OS: Debian, Rock Linux
Postgres versions: 13.6, 15.6, 17.0
Setup:
create table partA(id bigint not null, payload bigint);
insert into partA select s, s from generate_series(1,10000) s;
analyze partA;
create index on partA(id);
create index on partA(payload);
create table partB(id bigint not null);
insert into partB select s from generate_series(1,10000) s;
analyze partB;
create index on partB(id);
create view vw as
select id, payload from partA
union all
select id, NULL as payload from partB;
As you can see, we have a view that UNION ALLs two tables with different number of columns. Missing column from partB is stubbed out with a constant NULL.
Now we want to join this view with a small table that has some numbers that we want to find in the `payload` column:
create table some_ids(id bigint not null);
insert into some_ids select s from generate_series(1,2) s;
analyze some_ids;
explain select * from some_ids i join vw on (vw.payload = i.id);
Surprisingly, this does SeqScan on partB in NestedLoops over some_ids with a filter `
some_ids.id = NULL::bigint`:
Nested Loop (cost=0.29..359.16 rows=200 width=24) |
-> Seq Scan on some_ids i (cost=0.00..1.02 rows=2 width=8) |
-> Append (cost=0.29..178.56 rows=51 width=16) |
-> Index Scan using parta_payload_idx on parta (cost=0.29..8.30 rows=1 width=16) |
Index Cond: (payload = i.id) |
-> Seq Scan on partb (cost=0.00..170.00 rows=50 width=16) |
Filter: (i.id = NULL::bigint) |
At the same time `explain select * from vw where payload = 1` correctly skips over partB entirely (the node is eliminated from execution plan), and so does:
explain select * from vw where payload in (1,2);
However, any query that does not use explicit literal values still leads to SeqScan access on partB, such as:
explain select * from vw where payload in (select id from some_ids);
explain select * from vw where payload = ANY(ARRAY(select id from some_ids));
or various forms of joins
Do you know if this is expected/documented, or is this a bug?