I changed a query that used a subselect to user a more efficient subselect. Here are the
explains for the old way and the new way:
OLD WAY:
Unique (cost=47980.80..49226.66 rows=1780 width=789)
-> Sort (cost=47980.80..47980.80 rows=17798 width=789)
-> Hash Join (cost=18737.75..32994.32 rows=17798 width=789)
-> Seq Scan on u (cost=0.00..7211.12 rows=17798 width=627)
-> Hash (cost=7432.70..7432.70 rows=162170 width=162)
-> Seq Scan on d (cost=0.00..7432.70 rows=162170 width=162)
SubPlan
-> Nested Loop (cost=0.00..11.19 rows=1 width=82)
-> Index Scan using a_pkey on a (cost=0.00..5.86 rows=1 width=4)
-> Index Scan using pu_pkey on pu (cost=0.00..5.31 rows=1 width=78)
-> Limit (cost=17.51..17.51 rows=1 width=10)
-> Sort (cost=17.51..17.51 rows=4 width=10)
-> Index Scan using ml_u_and_p_key on ml (cost=0.00..17.46
rows=4 width=10)
NEW WAY:
Unique (cost=48031.59..49277.45 rows=1780 width=793)
-> Sort (cost=48031.59..48031.59 rows=17798 width=793)
-> Hash Join (cost=18680.53..32955.10 rows=17798 width=793)
-> Seq Scan on u (cost=0.00..7211.12 rows=17798 width=631)
-> Hash (cost=7432.70..7432.70 rows=162170 width=162)
-> Seq Scan on d (cost=0.00..7432.70 rows=162170 width=162)
SubPlan
-> Nested Loop (cost=0.00..11.19 rows=1 width=82)
-> Index Scan using a_pkey on a (cost=0.00..5.86 rows=1 width=4)
-> Index Scan using pu_pkey on pu (cost=0.00..5.31 rows=1 width=78)
-> Index Scan using ml_u_and_p_key on ml (cost=0.00..3.02 rows=1 width=2)
Note that the cost of the scan using ml_u_and_p_key went from 17.46 to 3.02, but the total
cost of the query went up. What's going on?
Also does explain add the cost of the subselect for each time it is executed? It seems to
add the cost as if it is executed only once. In the first I did (SELECT field FROM ml
where u.u = ml.u order by field2 desc limit 1). In the second I added a field to u that
refrences the record in ml, and the query looks like (SELECT field FROM ml where u.ref =
ml.id)