Re: Extending constraint exclusion for implied constraints/conditions - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: Extending constraint exclusion for implied constraints/conditions |
Date | |
Msg-id | CAFjFpRdK27_D9_ahPfLVpLjog1njr+eKix_23sQaVALDXa7G4w@mail.gmail.com Whole thread Raw |
In response to | Re: Extending constraint exclusion for implied constraints/conditions (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Extending constraint exclusion for implied constraints/conditions
|
List | pgsql-hackers |
On Mon, Jul 7, 2014 at 7:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:I doubt it. The extra code isn't the problem so much, it's the extra
> Right now, constraint exclusion code looks only at the provided conditions.
> If we want avoid table scan based on constraints in the above example, it
> will need to look at the implied conditions as well. E.g. val2 < 30 AND val
> = val2 => val < 30. Then the constraint exclusion can see that val < 30 AND
> val > 30 are contradictory and infer that the result is going to be empty.
> We will need to add information about the transitive inferences between
> operators. Can we do that in PostgreSQL? Will the benefits be worth the
> code, that needs to be added?
planning cycles spent trying to make proofs that will usually fail.
What you propose will create a combinatorial explosion in the number
of proof paths to be considered.
I can understand that it will create combinatorial explosion in the number of predicates that need to be examined by the constraint exclusion. I do not understand where come the paths gets involved here. The constraint exclusion kicks in before paths are created
220 static void
221 set_rel_size(PlannerInfo *root, RelOptInfo *rel,
222 Index rti, RangeTblEntry *rte)
223 {
224 if (rel->reloptkind == RELOPT_BASEREL &&
225 relation_excluded_by_constraints(root, rel, rte))
226 {
227 /*
228 * We proved we don't need to scan the rel via constraint exclusion,
229 * so set up a single dummy path for it. Here we only check this for
230 * regular baserels; if it's an otherrel, CE was already checked in
231 * set_append_rel_pathlist().
232 *
233 * In this case, we go ahead and set up the relation's path right away
234 * instead of leaving it for set_rel_pathlist to do. This is because
235 * we don't have a convention for marking a rel as dummy except by
236 * assigning a dummy path to it.
237 */
238 set_dummy_rel_pathlist(rel);
239 }
220 static void
221 set_rel_size(PlannerInfo *root, RelOptInfo *rel,
222 Index rti, RangeTblEntry *rte)
223 {
224 if (rel->reloptkind == RELOPT_BASEREL &&
225 relation_excluded_by_constraints(root, rel, rte))
226 {
227 /*
228 * We proved we don't need to scan the rel via constraint exclusion,
229 * so set up a single dummy path for it. Here we only check this for
230 * regular baserels; if it's an otherrel, CE was already checked in
231 * set_append_rel_pathlist().
232 *
233 * In this case, we go ahead and set up the relation's path right away
234 * instead of leaving it for set_rel_pathlist to do. This is because
235 * we don't have a convention for marking a rel as dummy except by
236 * assigning a dummy path to it.
237 */
238 set_dummy_rel_pathlist(rel);
239 }
and does not create paths for relations excluded by constraints
295 /*
296 * set_rel_pathlist
297 * Build access paths for a base relation
298 */
299 static void
300 set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
301 Index rti, RangeTblEntry *rte)
302 {
303 if (IS_DUMMY_REL(rel))
304 {
305 /* We already proved the relation empty, so nothing more to do */
306 }
295 /*
296 * set_rel_pathlist
297 * Build access paths for a base relation
298 */
299 static void
300 set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
301 Index rti, RangeTblEntry *rte)
302 {
303 if (IS_DUMMY_REL(rel))
304 {
305 /* We already proved the relation empty, so nothing more to do */
306 }
Same in the case of join in mark_join_rel()
663 * JOIN_INNER.
664 */
665 switch (sjinfo->jointype)
666 {
667 case JOIN_INNER:
668 if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
669 restriction_is_constant_false(restrictlist, false))
670 {
671 mark_dummy_rel(joinrel);
672 break;
673 }
674 add_paths_to_joinrel(root, joinrel, rel1, rel2,
675 JOIN_INNER, sjinfo,
676 restrictlist);
677 add_paths_to_joinrel(root, joinrel, rel2, rel1,
678 JOIN_INNER, sjinfo,
679 restrictlist);
680 break;
663 * JOIN_INNER.
664 */
665 switch (sjinfo->jointype)
666 {
667 case JOIN_INNER:
668 if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
669 restriction_is_constant_false(restrictlist, false))
670 {
671 mark_dummy_rel(joinrel);
672 break;
673 }
674 add_paths_to_joinrel(root, joinrel, rel1, rel2,
675 JOIN_INNER, sjinfo,
676 restrictlist);
677 add_paths_to_joinrel(root, joinrel, rel2, rel1,
678 JOIN_INNER, sjinfo,
679 restrictlist);
680 break;
BTW, on a side note, I noticed, we use mark_dummy_rel() for joinrels and set_dummy_rel_pathlist() for baserel. Shouldn't we be using the same function everywhere for the same functionality (e.g. adding an append path with no child-paths).
For larger relations, the time spent in constraint exclusion might be lesser than the time taken by actual table/index scan and that to when such a scan is not going to produce any rows. So, we might want to apply the technique only when the estimated number of rows/pages are above a certain threshold and may be when the GUC is ON (like we do it today).
... and applying constraint exclusion to join relations would make that
> I can see some more benefits. We can use it to eliminate joins where the
> constraints on joining relations may cause the join to be empty e.g.
problem even worse.
The same case goes with joins, where potentially, the crossproduct of two tables is huge.
regards, tom lane
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
pgsql-hackers by date: