Re: match_unsorted_outer() vs. cost_nestloop() - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: match_unsorted_outer() vs. cost_nestloop()
Msg-id: 603c8f070909051929ubdebb27nfde1ea02453c3b79@mail.gmail.com
In response to: Re: match_unsorted_outer() vs. cost_nestloop() (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: match_unsorted_outer() vs. cost_nestloop()
List: pgsql-hackers
On Sat, Sep 5, 2009 at 8:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> It might be sufficient to have cost_nestloop just hardwire the knowledge
>> that certain inner path types have a different behavior here --- that
>> is, for a rescan there is zero start cost and some very low per-tuple
>> cost, independent of the path's nominal cost values (which would now
>> be defined as always the costs for the first scan). And maybe the same
>> in cost_mergejoin. Offhand I don't think anyplace else really needs to
>> think about rescan costs.
>
> After thinking about that a bit more, I think the best way might be
> to create a "cost_rescan" function that is given a Path and returns
> the startup cost and total cost to be assumed for a rescan of this Path.
> It would know about the special behavior of MaterialPath and the other
> tuplestore-using plan types, and for everything else would just return
> the path's regular costs.
>
> Alternatively we could create a cost_foo_rescan() function paralleling
> each cost_foo() function, but given the small number of distinct
> behaviors I think that would be fairly redundant and hard to maintain.

That sounds very reasonable.

There's another interesting point here, too, relating to dealing with
imperfect statistics. When costing a nested loop join, the number of
times we expect to rescan the inner side is equal to the estimated
number of outer rows. If that estimate is > 1, materialization will
often look like a winner, provided that it's not estimated to overflow
work_mem, because we'll say: oh, look, all these rescans get much
cheaper, and all it costs us is a little bookkeeping overhead. But if
the estimate is <= 1, materialization will certainly look like a loser,
because there are no rescans to make cheaper and we still have the
bookkeeping overhead. But let's say the estimate is off.
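The cost_rescan() idea quoted above could be sketched roughly as follows. This is a minimal illustrative model only, not the actual PostgreSQL implementation: the Path structure, the per-tuple cost fraction, and the set of tuplestore-backed path types are all assumptions for the sake of the example.

```python
from dataclasses import dataclass

# Assumed constants (illustrative, not PostgreSQL's real values): tuplestore
# rescans are charged a small fraction of the normal per-tuple CPU cost.
RESCAN_TUPLE_FRACTION = 0.1
CPU_TUPLE_COST = 0.01

@dataclass
class Path:
    pathtype: str        # e.g. "SeqScan", "Material", "Sort"
    startup_cost: float  # startup cost of the first scan
    total_cost: float    # total cost of the first scan
    rows: float          # estimated row count

def cost_rescan(path):
    """Return (startup_cost, total_cost) to assume for a rescan of `path`."""
    if path.pathtype in ("Material", "Sort", "FunctionScan", "CteScan"):
        # Tuplestore-backed nodes: a rescan just replays stored tuples, so
        # startup is free and the per-tuple cost is very low.
        return (0.0, RESCAN_TUPLE_FRACTION * CPU_TUPLE_COST * path.rows)
    # Everything else: assume a rescan costs the same as the first scan,
    # i.e. the path's nominal cost values.
    return (path.startup_cost, path.total_cost)
```

cost_nestloop and cost_mergejoin would then charge the first inner scan at the path's nominal cost and each of the (estimated outer rows - 1) subsequent scans at the cost_rescan() values, which keeps the single dispatch point Tom describes rather than a parallel cost_foo_rescan() for every node type.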
If the real number of outer rows turns out to be 0, then materialization
costs nothing, because we'll never evaluate the inner side at all. And
if it turns out to be 2 or more, then materialization probably wins, as
long as it doesn't spill to disk. Only if the number of outer rows turns
out to be exactly 1 does materialization lose - and even then it's
pretty cheap, unless, of course, it spills to disk. So maybe we should
consider FORCING materialization to be used in certain cases, something
like this:

1. If the inner path is something that's relatively cheap to rescan
(already materialized or an index scan), then try only the
non-materialized path. This is what we already do now.

2. Otherwise, estimate the memory cost of materializing the inner side.
If it seems that it will fit within work_mem, then try the materialized
path ONLY, on the assumption that there's far more upside than downside.

3. Otherwise, try both plans (?). Likely both of them are going to be
pretty bad...

It's also worth noting that the comment here doesn't really match the
code: "If the cheapest inner path is a join or seqscan, we should
consider materializing it." But that's not what the code actually tests -
instead, it checks that the cheapest inner path is NOT an index scan,
bitmap heap scan, tidscan, material path, function scan, CTE scan, or
worktable scan. Maybe that works out to the same thing, but it's not
obvious.

And, by the way, is the algorithm proposed in the comment sensible
anyway? Under what circumstances would it make sense to materialize a
sequential scan? It seems to me that if the relation is small enough to
fit in memory, then materializing it won't save much - and if it's not,
then materialization is going to be slow, too. I guess if you have
work_mem very large relative to shared_buffers, then you might get the
planner to think that it can cache it with materialization but not
without materialization, but I don't think the real world can ever work
that way.
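The three-case heuristic above could be sketched like this. Again this is only an illustrative model: the function name, the string path-type tags, and passing size estimates as byte counts are assumptions, not planner internals.

```python
# Path types the current code treats as cheap to rescan (per the list in
# the message above); the string tags here are illustrative.
CHEAP_RESCAN_TYPES = {"IndexScan", "BitmapHeapScan", "TidScan",
                      "Material", "FunctionScan", "CteScan",
                      "WorkTableScan"}

def inner_paths_to_try(inner_pathtype, est_inner_bytes, work_mem_bytes):
    """Return which nestloop inner-path variants to consider."""
    # Case 1: rescans are already cheap -- try only the plain path
    # (this is the status quo behavior).
    if inner_pathtype in CHEAP_RESCAN_TYPES:
        return ["plain"]
    # Case 2: the materialized inner side is estimated to fit in work_mem --
    # force materialization, since the upside far outweighs the downside.
    if est_inner_bytes <= work_mem_bytes:
        return ["materialized"]
    # Case 3: materialization would spill to disk -- try both plans,
    # though both are likely to be pretty bad.
    return ["plain", "materialized"]
```

The asymmetry driving case 2 is that a wrong estimate rarely hurts: with 0 outer rows the materialization is never built, with 2+ it wins, and only the exactly-1 case pays a small bookkeeping penalty.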
I don't think I've ever seen the planner choose to materialize a
seqscan, but it might become much more common with this change if we're
not careful, because while cost_index() contains some logic to take into
account the caching effects associated with rescanning an index,
cost_seqscan() does not.

...Robert