Thread: match_unsorted_outer() vs. cost_nestloop()
In joinpath.c, match_unsorted_outer() considers materializing the inner
side of each nested loop if the inner path is not an index scan, bitmap
heap scan, tid scan, material path, function scan, CTE scan, or
worktable scan. In costsize.c, cost_nestloop() charges the startup cost
only once if the inner path is a hash path or material path; otherwise,
it charges it for every anticipated rescan.

It seems to me, perhaps naively, like the criteria used in these two
places are more different than they maybe should be. For example,
function scan nodes insert their results into a tuplestore so that
rescans get the same set of tuples, which is why we don't consider
inserting a materialize node over them in match_unsorted_outer() - but
I think that also means that we oughtn't to be counting the startup
cost for every rescan.

I'm not exactly sure which ones should match or not match. Hash paths,
maybe, shouldn't. I believe the reason why we don't count the startup
cost of the hash path over again is that we're assuming it's
attributable to the cost of building the hash table, which only needs
to be done once. I don't think that's 100% accurate, because the hash
path could have inherited some of that cost from its underlying paths.
At any rate, it's conceivable that materializing could be enough
cheaper than repeating the join that a materialize node makes sense.

Thoughts?

...Robert
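For reference, here is a rough paraphrase of the two checks being
compared - a simplified sketch, not the actual source; the node-tag and
variable names follow the tree of that era but are abridged:

    /* match_unsorted_outer(): worth sticking a Material node on top? */
    consider_material =
        !(IsA(inner_cheapest_total, IndexPath) ||
          IsA(inner_cheapest_total, BitmapHeapPath) ||
          IsA(inner_cheapest_total, TidPath) ||
          IsA(inner_cheapest_total, MaterialPath) ||
          inner_cheapest_total->pathtype == T_FunctionScan ||
          inner_cheapest_total->pathtype == T_CteScan ||
          inner_cheapest_total->pathtype == T_WorkTableScan);

    /* cost_nestloop(): does the inner startup cost repeat per rescan? */
    if (IsA(inner_path, MaterialPath) || IsA(inner_path, HashPath))
        ;                       /* charge inner startup cost only once */
    else
        run_cost += (outer_path_rows - 1) * inner_path->startup_cost;

Note the mismatch Robert describes: the first test exempts the
tuplestore-backed scan types from materialization, while the second
still charges them full startup cost on every rescan.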
Robert Haas <robertmhaas@gmail.com> writes:
> In joinpath.c, match_unsorted_outer() considers materializing the
> inner side of each nested loop if the inner path is not an index scan,
> bitmap heap scan, tid scan, material path, function scan, CTE scan, or
> worktable scan. In costsize.c, cost_nestloop() charges the startup
> cost only once if the inner path is a hash path or material path;
> otherwise, it charges it for every anticipated rescan.

> It seems to me, perhaps naively, like the criteria used in these two
> places are more different than they maybe should be.

They are considering totally different effects, so I'm not sure I
follow that conclusion.

I'll certainly concede that the costing of materialize plans is rather
bogus --- it's been a long time since materialize behaved the way
cost_material thinks it does (ie, read the whole input before handing
anything back). But our cost model doesn't have a way to represent the
true value of a materialize node, which is that a re-read is a lot
cheaper than the original fetch. I've occasionally tried to think of a
way to deal with that without introducing a lot of extra calculations
and complexity everywhere else ...

        regards, tom lane
On Fri, Sep 4, 2009 at 9:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> In joinpath.c, match_unsorted_outer() considers materializing the
>> inner side of each nested loop if the inner path is not an index scan,
>> bitmap heap scan, tid scan, material path, function scan, CTE scan, or
>> worktable scan. In costsize.c, cost_nestloop() charges the startup
>> cost only once if the inner path is a hash path or material path;
>> otherwise, it charges it for every anticipated rescan.
>
>> It seems to me, perhaps naively, like the criteria used in these two
>> places are more different than they maybe should be.
>
> They are considering totally different effects, so I'm not sure I
> follow that conclusion.
>
> I'll certainly concede that the costing of materialize plans is rather
> bogus --- it's been a long time since materialize behaved the way
> cost_material thinks it does (ie, read the whole input before handing
> anything back). But our cost model doesn't have a way to represent the
> true value of a materialize node, which is that a re-read is a lot
> cheaper than the original fetch. I've occasionally tried to think of a
> way to deal with that without introducing a lot of extra calculations
> and complexity everywhere else ...

I noticed that, too, and I don't know what to do about it either. I
guess my point is that for node types that dump their output into a
tuplestore anyway, it doesn't seem like cost_nestloop() should charge
n * the startup cost. I believe that at least function, CTE, and
worktable scans fall into this category. No?

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> I guess my point is that for node types that dump their output into a
> tuplestore anyway, it doesn't seem like cost_nestloop() should charge
> n * the startup cost. I believe that at least function, CTE, and
> worktable scans fall into this category. No?

Yeah, probably. The comment is correct as is:

 * their sum.  What's not so clear is whether the inner path's
 * startup_cost must be paid again on each rescan of the inner path.  This
 * is not true if the inner path is materialized or is a hashjoin, but
 * probably is true otherwise.

What's not correct is the code's expansion of "is materialized" as "is
a MaterialPath".

However, I'm not sure it's worth just adding these other
tuplestore-using types to the list. We really ought to think a bit
harder about representing the difference between initial scan cost and
rescan cost.

It might be sufficient to have cost_nestloop just hardwire the
knowledge that certain inner path types have a different behavior here
--- that is, for a rescan there is zero start cost and some very low
per-tuple cost, independent of the path's nominal cost values (which
would now be defined as always the costs for the first scan). And maybe
the same in cost_mergejoin. Offhand I don't think anyplace else really
needs to think about rescan costs.

I think this would be enough to deal with the issue for those plan
types that materialize their output, because they all have about the
same runtime behavior in this regard. What gets more exciting is if
you'd like to model other effects this way --- for example, the one
that rescanning an indexscan is probably a lot cheaper than the
original fetch because of caching effects. But we already have that
sort of thing accounted for (to some extent anyway) elsewhere, so I
think we can probably ignore it here.

        regards, tom lane
I wrote:
> It might be sufficient to have cost_nestloop just hardwire the knowledge
> that certain inner path types have a different behavior here --- that
> is, for a rescan there is zero start cost and some very low per-tuple
> cost, independent of the path's nominal cost values (which would now
> be defined as always the costs for the first scan). And maybe the same
> in cost_mergejoin. Offhand I don't think anyplace else really needs to
> think about rescan costs.

After thinking about that a bit more, I think the best way might be to
create a "cost_rescan" function that is given a Path and returns the
startup cost and total cost to be assumed for a rescan of this Path. It
would know about the special behavior of MaterialPath and the other
tuplestore-using plan types, and for everything else would just return
the path's regular costs.

Alternatively we could create a cost_foo_rescan() function paralleling
each cost_foo() function, but given the small number of distinct
behaviors I think that would be fairly redundant and hard to maintain.

        regards, tom lane
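A minimal sketch of what such a cost_rescan function might look like,
assuming the behavior described above (zero rescan startup cost and a
nominal per-tuple charge for the tuplestore-backed node types); this is
an illustration of the idea, not the function as eventually committed:

    static void
    cost_rescan(Path *path,
                Cost *rescan_startup_cost, Cost *rescan_total_cost)
    {
        switch (path->pathtype)
        {
            case T_FunctionScan:
            case T_CteScan:
            case T_WorkTableScan:
            case T_Material:
                /*
                 * These node types re-read a tuplestore on rescan: no
                 * startup cost, just a cheap per-tuple fetch charge.
                 */
                *rescan_startup_cost = 0;
                *rescan_total_cost = cpu_tuple_cost * path->parent->rows;
                break;
            default:
                /* everything else repeats its nominal first-scan costs */
                *rescan_startup_cost = path->startup_cost;
                *rescan_total_cost = path->total_cost;
                break;
        }
    }

cost_nestloop (and cost_mergejoin) would then call this once per inner
path and multiply the rescan costs by the expected number of rescans,
instead of special-casing MaterialPath and HashPath inline.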
On Sat, Sep 5, 2009 at 8:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> It might be sufficient to have cost_nestloop just hardwire the knowledge
>> that certain inner path types have a different behavior here --- that
>> is, for a rescan there is zero start cost and some very low per-tuple
>> cost, independent of the path's nominal cost values (which would now
>> be defined as always the costs for the first scan). And maybe the same
>> in cost_mergejoin. Offhand I don't think anyplace else really needs to
>> think about rescan costs.
>
> After thinking about that a bit more, I think the best way might be
> to create a "cost_rescan" function that is given a Path and returns
> the startup cost and total cost to be assumed for a rescan of this Path.
> It would know about the special behavior of MaterialPath and the other
> tuplestore-using plan types, and for everything else would just return
> the path's regular costs.
>
> Alternatively we could create a cost_foo_rescan() function paralleling
> each cost_foo() function, but given the small number of distinct
> behaviors I think that would be fairly redundant and hard to maintain.

That sounds very reasonable.

There's another sort of interesting point here, too, relating to
dealing with imperfect statistics. When costing a nested loop, the
number of times we expect to rescan the inner side is equal to the
estimated number of outer rows. If that estimate is > 1,
materialization will often look like a winner, provided that it's not
estimated to overflow work_mem, because we'll say, oh, look, all these
rescans get much cheaper, and all it costs us is a little bookkeeping
overhead. But if the estimate is <= 1, materialization will certainly
look like a loser, because there are no rescans to make cheaper and we
still have the bookkeeping overhead.

But let's say the estimate is off. If the real number of outer rows
turns out to be 0, then materialization costs nothing, because we'll
never evaluate the inner side at all. And if it turns out to be 2 or
more, then materialization probably wins, as long as it doesn't spill
to disk. Only if the number of outer rows turns out to be exactly 1
does materialization lose - and even then it's pretty cheap, unless, of
course, it spills to disk. So maybe we should consider FORCING
materialization to be used in certain cases, something like this
(sketched in code below):

1. If the inner path is something that's relatively cheap to rescan
(already materialized or an index scan), then try only the
non-materialized path. This is what we already do now.

2. Otherwise, estimate the memory cost of materializing the inner side.
If it seems that it will fit within work_mem, then try the materialized
path ONLY, on the assumption that there's far more upside than
downside.

3. Otherwise, try both plans (?). Likely both of them are going to be
pretty bad...

It's also worth noting that the comment here doesn't really match the
code. "If the cheapest inner path is a join or seqscan, we should
consider materializing it." But that's not what the code actually tests
- instead, it checks that the cheapest inner path is NOT an index scan,
bitmap heap scan, tid scan, material path, function scan, CTE scan, or
worktable scan. Maybe that works out to the same thing, but it's not
obvious.

And, by the way, is the algorithm proposed in the comment sensible
anyway? Under what circumstances would it make sense to materialize a
sequential scan? It seems to me that if the relation is small enough to
fit in memory, then materializing it won't save much - and if it's not,
then materialization is going to be slow, too. I guess if you have
work_mem very large relative to shared buffers, then you might get the
planner to think that it can cache it with materialization but not
without materialization, but I don't think that the real world can ever
work that way.

I don't think I've ever seen the planner choose to materialize a
seqscan, but it might become much more common with this change if we're
not careful, because while cost_index() contains some logic to take
into account the caching effects associated with rescanning an index,
cost_seqscan() does not.

...Robert
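A rough sketch of the three-case heuristic proposed above. The enum and
helper names here are invented for illustration; none of them exist in
PostgreSQL:

    /* Hypothetical sketch of the proposed heuristic. */
    typedef enum
    {
        TRY_PLAIN_ONLY,             /* case 1 */
        TRY_MATERIALIZED_ONLY,      /* case 2 */
        TRY_BOTH                    /* case 3 */
    } InnerPlanChoice;

    static InnerPlanChoice
    choose_inner_plans(Path *inner_path, double inner_bytes)
    {
        /* 1. Cheap to rescan already: don't bother with a Material node */
        if (path_is_cheap_to_rescan(inner_path))    /* hypothetical helper */
            return TRY_PLAIN_ONLY;

        /*
         * 2. Tuplestore expected to fit in work_mem: big upside, tiny
         *    downside, so consider only the materialized plan.
         */
        if (inner_bytes <= work_mem * 1024L)
            return TRY_MATERIALIZED_ONLY;

        /* 3. Would spill to disk: cost both and pick the lesser evil */
        return TRY_BOTH;
    }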
Robert Haas <robertmhaas@gmail.com> writes:
> And, by the way, is the algorithm proposed in the comment sensible
> anyway? Under what circumstances would it make sense to materialize a
> sequential scan?

Expensive filter conditions, for example.

I've occasionally wondered if this code isn't outright wrong anyway:
when you consider the costs of checking tuple visibility and the costs
involved in access to a shared buffer, it's possible that copying
tuples to a local materialization store would be a win for rescans in
any case. (Of course it's a lot easier to credit that concept when the
store doesn't spill to disk.) Given the basic bogosity of the costing
rules I wasn't eager to mess with it; in fact I think we deliberately
tweaked things in this area to prevent materialization, because
otherwise the planner *always* wanted to materialize and that didn't
seem to be a win. But now that we have a plan for a less obviously
broken costing approach, maybe we should open the floodgates and allow
materialization to be considered for any inner path that doesn't
materialize itself already.

        regards, tom lane
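To make the expensive-filter case concrete, a back-of-envelope
comparison with made-up numbers, using the default cpu_tuple_cost
(0.01) and cpu_operator_cost (0.0025) and ignoring I/O:

    /*
     * Hypothetical numbers: a seqscan over 100,000 rows with a filter
     * costing ~100 operator evaluations per row, of which 1,000 rows
     * survive.  Rescanning the seqscan re-runs the filter on every
     * row; rescanning a Material node re-reads only the survivors.
     */
    double seqscan_rescan  = 100000 * (0.01 + 100 * 0.0025);  /* = 26000 */
    double material_rescan = 1000 * 0.01;                     /* = 10 */

With a filter that expensive, the materialized rescan is cheaper by
three orders of magnitude, even before counting visibility checks and
shared-buffer access.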
On Sep 6, 2009, at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> And, by the way, is the algorithm proposed in the comment sensible
>> anyway? Under what circumstances would it make sense to materialize a
>> sequential scan?
>
> Expensive filter conditions, for example.

Ah, right. Yeah, that could be a big win.

> I've occasionally wondered if this code isn't outright wrong anyway:
> when you consider the costs of checking tuple visibility and the costs
> involved in access to a shared buffer, it's possible that copying
> tuples to a local materialization store would be a win for rescans in
> any case. (Of course it's a lot easier to credit that concept when the
> store doesn't spill to disk.) Given the basic bogosity of the costing
> rules I wasn't eager to mess with it; in fact I think we deliberately
> tweaked things in this area to prevent materialization, because
> otherwise the planner *always* wanted to materialize and that didn't
> seem to be a win. But now that we have a plan for a less obviously
> broken costing approach, maybe we should open the floodgates and allow
> materialization to be considered for any inner path that doesn't
> materialize itself already.

Maybe. I think some experimentation will be required. We also have to
be aware of effects on planning time; match_unsorted_outer() is, AIR, a
significant part of the CPU cost of planning large join problems.

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> On Sep 6, 2009, at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... But now that we have a plan for a less obviously broken costing
>> approach, maybe we should open the floodgates and allow
>> materialization to be considered for any inner path that doesn't
>> materialize itself already

> Maybe. I think some experimentation will be required. We also have
> to be aware of effects on planning time; match_unsorted_outer() is,
> AIR, a significant part of the CPU cost of planning large join
> problems.

I've committed some changes pursuant to this discussion. It may be that
match_unsorted_outer gets a bit slower, but I'm not too worried about
that. My experience is that the code that tries different mergejoin
options eats way more cycles than the nestloop code does.

        regards, tom lane