
match_unsorted_outer() vs. cost_nestloop() - Mailing list pgsql-hackers

From	Robert Haas
Subject	match_unsorted_outer() vs. cost_nestloop()
Date	September 5, 2009 01:02:21
Msg-id	603c8f070909041802p18ed2fb1v91245ccfb5c2a24a@mail.gmail.com
Responses	Re: match_unsorted_outer() vs. cost_nestloop()
List	pgsql-hackers


In joinpath.c, match_unsorted_outer() considers materializing the
inner side of each nested loop if the inner path is not an index scan,
bitmap heap scan, tid scan, material path, function scan, CTE scan, or
worktable scan.  In costsize.c, cost_nestloop() charges the startup
cost only once if the inner path is a hash path or material path;
otherwise, it charges it for every anticipated rescan.
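
As a rough illustration, the two rules as described above can be written as a pair of predicates. This is only a sketch: the PathKind enum and all three function names are made up for this example and are not the node tags or code used in joinpath.c or costsize.c.

```c
#include <stdbool.h>

/* Hypothetical path kinds, named after the cases listed in the text.
 * These are NOT PostgreSQL's actual node tags. */
typedef enum {
    PK_SEQ_SCAN,
    PK_INDEX_SCAN,
    PK_BITMAP_HEAP_SCAN,
    PK_TID_SCAN,
    PK_MATERIAL,
    PK_FUNCTION_SCAN,
    PK_CTE_SCAN,
    PK_WORKTABLE_SCAN,
    PK_HASH_JOIN
} PathKind;

/* Rule attributed to match_unsorted_outer(): consider a materialize
 * node over the inner path unless it is one of the listed kinds. */
bool considers_materialize(PathKind kind)
{
    switch (kind) {
    case PK_INDEX_SCAN:
    case PK_BITMAP_HEAP_SCAN:
    case PK_TID_SCAN:
    case PK_MATERIAL:
    case PK_FUNCTION_SCAN:
    case PK_CTE_SCAN:
    case PK_WORKTABLE_SCAN:
        return false;
    default:
        return true;
    }
}

/* Rule attributed to cost_nestloop(): charge the inner startup cost
 * once only for hash paths and material paths. */
bool startup_charged_once(PathKind kind)
{
    return kind == PK_HASH_JOIN || kind == PK_MATERIAL;
}

/* True when materialization is skipped but the startup cost is
 * still charged on every rescan. */
bool skips_materialize_but_recharges_startup(PathKind kind)
{
    return !considers_materialize(kind) && !startup_charged_once(kind);
}
```

With these definitions a function scan lands in the third category, while a hash path is both a candidate for materialization and charged its startup cost only once.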

It seems to me, perhaps naively, that the criteria used in these two
places differ more than they should.  For example,
function scan nodes insert their results into a tuplestore so that
rescans get the same set of tuples, which is why we don't consider
inserting a materialize node over them in match_unsorted_outer() - but
I think that also means that we oughtn't to be counting the startup
cost for every rescan.
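
To put a number on the difference, here is a deliberately simplified sketch of the inner-side cost under each charging rule. It assumes the inner path is scanned once per outer row and ignores per-tuple CPU charges and everything else the real cost_nestloop() accounts for; inner_side_cost() and its parameters are hypothetical names, not PostgreSQL functions.

```c
#include <stdbool.h>

/* Simplified inner-side cost of a nested loop that scans the inner
 * path once per outer row.  Hypothetical helper for illustration only.
 *
 *   startup      cost before the inner path returns its first tuple
 *   total        startup plus the cost of running the scan to completion
 *   outer_rows   number of times the inner path is scanned
 *   startup_once true if the startup work is paid only on the first scan
 */
double inner_side_cost(double startup, double total,
                       double outer_rows, bool startup_once)
{
    double run = total - startup;

    if (startup_once)
        return startup + outer_rows * run;   /* startup paid once */

    return outer_rows * total;               /* startup paid on every rescan */
}
```

For an inner path with startup cost 100 and total cost 110, joined against 1000 outer rows, this gives 10100 when the startup cost is charged once and 110000 when it is charged on every rescan, so the choice of rule can easily dominate the estimate.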

I'm not exactly sure which ones should match or not match.  Hash
paths, maybe, shouldn't.  I believe we don't count the startup cost of
the hash path over again because we're assuming that it's attributable
to the cost of building the hash table, which only needs to be done
once.  I don't think that's 100% accurate
because the hash path could have inherited some of that cost from its
underlying paths.  At any rate, it's conceivable that materializing
could be enough cheaper than repeating the join that a materialize
node makes sense.

Thoughts?

...Robert
