I'm getting the idea that looking for unique indexes on the sub query is not worth the hassle for now. Don't get me wrong, they'd be nice to have, but I just think that it's a less common use case and these are more likely to have been pulled up anyway.
Unless there's a better way, I think I'm going to spend the time looking into inner joins instead.
I've been working on adding join removal for join types other than left outer joins.
The attached patch allows join removals for both sub queries with left joins and also semi joins where a foreign key can prove the existence of the record.
My longer term plan is to include inner joins too, but now that I have something to show for semi joins, I thought this would be a good time to post the patch just in case anyone can see any show stopper's with using foreign keys this way.
So with the attached you can do:
CREATE TABLE b (id INT NOT NULL PRIMARY KEY);
CREATE TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT NOT NULL REFERENCES b(id));
EXPLAIN (COSTS OFF)
SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
QUERY PLAN
---------------
Seq Scan on a
(1 row)
I think anti joins could use the same infrastructure but I'm not quite sure yet how to go about replacing the join with something like WHERE false.
I do think semi and anti joins are a far less useful case for join removals as inner joins are, but if we're already loading the foreign key constraints at plan time, then it seems like something that might be worth while checking.
Oh, quite likely the code that loads the foreign key constraints needs more work and probably included in the rel cache, but I don't want to go and to that until I get some feedback on the work so far.