I'm currently in the early stages of looking into expanding join removals.
Currently left outer joins can be removed if none of the columns of the table are required for anything and the table being joined is a base table that contains a unique index on all columns in the join clause.
The case I would like to work on is to allow sub queries where the query is grouped by or distinct on all of the join columns.
Take the following as an example:
CREATE TABLE products (productid integer NOT NULL, code character varying(32) NOT NULL);
CREATE TABLE sales (saleid integer NOT NULL, productid integer NOT NULL, qty integer NOT NULL);
CREATE VIEW product_sales AS
SELECT p.productid,
p.code,
s.qty
FROM (products p
LEFT JOIN ( SELECT sales.productid,
sum(sales.qty) AS qty
FROM sales
GROUP BY sales.productid) s ON ((p.productid = s.productid)));
If a user does:
SELECT productid,code FROM product_sales;
Then, if I'm correct, the join on sales can be removed.
Attached is a patch which implements this. It's still a bit rough around the edges and some names could likely do with being improved, but it at least seems to work with all of the test cases that I've thrown at it so far.
Comments are welcome, but the main purpose of the email is so I can register the patch for the June commitfest.