I'm currently in the early stages of looking into expanding join removals.
Currently left outer joins can be removed if none of the columns of the table are required for anything and the table being joined is a base table that contains a unique index on all columns in the join clause.
The case I would like to work on is to allow sub queries where the query is grouped by or distinct on all of the join columns.
Take the following as an example:
CREATE TABLE products (productid integer NOT NULL, code character varying(32) NOT NULL);
CREATE TABLE sales (saleid integer NOT NULL, productid integer NOT NULL, qty integer NOT NULL);
CREATE VIEW product_sales AS
SELECT p.productid,
p.code,
s.qty
FROM (products p
LEFT JOIN ( SELECT sales.productid,
sum(sales.qty) AS qty
FROM sales
GROUP BY sales.productid) s ON ((p.productid = s.productid)));
If a user does:
SELECT productid,code FROM product_sales;
Then, if I'm correct, the join on sales can be removed.
As I said above, I'm in the early stages of looking at this and I'm currently a bit confused. Basically I've put a breakpoint at the top of the join_is_removable function and I can see that innerrel->rtekind is RTE_SUBQUERY for my query, so the function is going to return false. So what I need to so is get access to innerrel->subroot->parse so that I can look at groupClause and distinctClause. The thing is innerrel->subroot is NULL in this case, but I see a comment for subroot saying "subroot - PlannerInfo for subquery (NULL if it's not a subquery)" so I guess this does not also mean "subroot - PlannerInfo for subquery (NOT NULL if it's a subquery)"?
Has anyone got any pointers to where I might be able to get the Query details for the subquery? These structures are quite new to me.
Regards
David Rowley