Re: planner missing a trick for foreign tables w/OR conditions - Mailing list pgsql-hackers

From Robert Haas
Subject Re: planner missing a trick for foreign tables w/OR conditions
Date
Msg-id CA+Tgmoa_bhRSVxAnu-rv1FK_0BtDB-GwDXwU6epzArFVpHABaA@mail.gmail.com
Whole thread Raw
In response to Re: planner missing a trick for foreign tables w/OR conditions  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: planner missing a trick for foreign tables w/OR conditions
Re: planner missing a trick for foreign tables w/OR conditions
List pgsql-hackers
On Tue, Dec 17, 2013 at 12:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I had been thinking it would fall down if there are several OR conditions
> affecting different collections of rels, but after going through the math
> again, I'm now thinking I was wrong and it does in fact work out.  As you
> say, we do depend on all paths generating the same rows, but since the
> extracted single-rel quals are inserted as plain baserestrictinfo quals,
> that'll be true.

OK.

> A bigger potential objection is that we're opening ourselves to larger
> problems with estimation failures due to correlated qual conditions, but
> again I'm finding that the math doesn't bear that out.  It's reasonable
> to assume that our estimate for the extracted qual will be better than
> our estimate for the OR as a whole, so our adjusted size estimates for
> the filtered base relations are probably OK.  And the adjustment to the
> OR clause selectivity means that the size estimate for the join comes
> out exactly the same.  We'll actually be better off than with what is
> likely to happen now, which is that people manually extract the simplified
> condition and insert it into the query explicitly.  PG doesn't realize
> that that's redundant and so will underestimate the join size.

I had not thought of that, but it seems like a valid point.

> So at this point I'm pretty much talked into it.  We could eliminate the
> dependence on indexes entirely, and replace this code with a step that
> simply tries to pull single-base-relation quals out of ORs wherever it can
> find one.  You could argue that the produced quals would sometimes not be
> worth testing for, but we could apply a heuristic that says to forget it
> unless the estimated selectivity of the extracted qual is less than,
> I dunno, 0.5 maybe.

This is an area where I think things are very different from local and
remote tables.  For a local table, the cost of transmitting a row from
one plan node to another is basically zero.  For a remote table, it's
potentially far higher, although unfortunately it's hard to know
exactly.  But if the qual is cheap to evaluate, and we're getting back
a lot of rows, I suspect even eliminating 5-10% of them could work out
to a win.  With a local table, 50% sounds like a reasonable number.

Another point to ponder is that there could be places where this
actually changes the plan significantly for the better.  Pre-filtering
one side of a join might, for example, reduce the amount of data on
one side to the point where a hash join is chosen over some other
strategy.  I don't know that this will actually help all that many
people but the best case is pretty dramatic for those it does help:
the partial qual might be almost as selective as the join condition
itself.

>  (I wonder if it'd be worth inserting a check that
> there's not already a manually-generated equivalent clause, too ...)

Sounds a little too clever IMHO.

> A very nice thing about this is we could do this step ahead of relation
> size estimate setting and thus remove the redundant work that currently
> happens in set_plain_rel_size when the optimization fires.  Which is
> another aspect of the current code that's a hack, so getting rid of it
> would be a net reduction in hackiness.

I'm not sure that would save anything measurable performance-wise, but
the hackiness reduction would be nice to have.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Extension Templates S03E11
Next
From: Josh Berkus
Date:
Subject: Re: Extension Templates S03E11