Re: multi-layered view join performance oddities - Mailing list pgsql-performance

From Svenne Krap
Subject Re: multi-layered view join performance oddities
Date
Msg-id 436515A6.8040909@krap.dk
Whole thread Raw
In response to Re: multi-layered view join performance oddities  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Tom Lane wrote:
Svenne Krap <svenne@krap.dk> writes: 
create view ord_institutes_sum asSELECT ord_property_type_all.dataset_id, ord_property_type_all.nb_property_type_id, 0 AS institut, sum(ord_property_type_all.amount) AS amount  FROM ord_property_type_allGROUP BY ord_property_type_all.dataset_id, ord_property_type_all.nb_property_type_id;   
 
create view ord_result_pct asSELECT t1.dataset_id, t1.nb_property_type_id, t1.institut, t1.amount / t2.amount * 100::numeric AS pct  FROM ord_property_type_all t1, ord_institutes_sum t2 WHERE t1.dataset_id = t2.dataset_id AND t1.nb_property_type_id = t2.nb_property_type_id;   
This is really pretty horrid code: you're requesting double evaluation
of the ord_property_type_all view, and then joining the two calculations
to each other.  No, the planner will not detect how silly this is :-(,
nor will it realize that there's guaranteed to be a match for every row
--- I believe the latter is the reason for the serious misestimation
that Steinar noted.  The misestimation doesn't hurt particularly when
evaluating ord_result_pct by itself, because there are no higher-level
decisions to make ... but it hurts a lot when you join ord_result_pct to
some other stuff. 
I don't really see, how this query is horrid from a user perspective, this is exactly the way, the percentage has to be calculated from a "philosophical" standpoint (performance considerations left out).
This is very bad news for me, as most of the other (much larger) queries have the same issue, that the views will be used multiple times got get slightly different data, that has to be joined (also more than 2 times as in this case)

I think, it has to run multiple times as it returns two different types of data.

It seems like there must be a way to get the percentage amounts with
only one evaluation of ord_property_type_all, but I'm not seeing it
right offhand. 

I will think about how to remove the second evaluation of the view in question, if anyone knows how, a hint is very appriciated :)

I could of course go the "materialized view" way, but would really prefer not to.

Svenne

pgsql-performance by date:

Previous
From: Svenne Krap
Date:
Subject: Re: multi-layered view join performance oddities
Next
From: Martin Lesser
Date:
Subject: Re: Effects of cascading references in foreign keys