Home > mailing lists

Re: Extending constraint exclusion for implied constraints/conditions - Mailing list pgsql-hackers

From	Ashutosh Bapat
Subject	Re: Extending constraint exclusion for implied constraints/conditions
Date	July 8, 2014 07:54:33
Msg-id	CAFjFpRdK27_D9_ahPfLVpLjog1njr+eKix_23sQaVALDXa7G4w@mail.gmail.com Whole thread Raw
In response to	Re: Extending constraint exclusion for implied constraints/conditions (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Extending constraint exclusion for implied constraints/conditions
List	pgsql-hackers

Tree view

On Mon, Jul 7, 2014 at 7:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
> Right now, constraint exclusion code looks only at the provided conditions.
> If we want avoid table scan based on constraints in the above example, it
> will need to look at the implied conditions as well. E.g. val2 < 30 AND val
> = val2 => val < 30. Then the constraint exclusion can see that val < 30 AND
> val > 30 are contradictory and infer that the result is going to be empty.
> We will need to add information about the transitive inferences between
> operators. Can we do that in PostgreSQL? Will the benefits be worth the
> code, that needs to be added?

I doubt it. The extra code isn't the problem so much, it's the extra
planning cycles spent trying to make proofs that will usually fail.
What you propose will create a combinatorial explosion in the number
of proof paths to be considered.

I can understand that it will create combinatorial explosion in the number of predicates that need to be examined by the constraint exclusion. I do not understand where come the paths gets involved here. The constraint exclusion kicks in before paths are created

220 static void
221 set_rel_size(PlannerInfo *root, RelOptInfo *rel,
222              Index rti, RangeTblEntry *rte)
223 {
224     if (rel->reloptkind == RELOPT_BASEREL &&
225         relation_excluded_by_constraints(root, rel, rte))
226     {
227         /*
228          * We proved we don't need to scan the rel via constraint exclusion,
229          * so set up a single dummy path for it. Here we only check this for
230          * regular baserels; if it's an otherrel, CE was already checked in
231          * set_append_rel_pathlist().
232          *
233          * In this case, we go ahead and set up the relation's path right away
234          * instead of leaving it for set_rel_pathlist to do. This is because
235          * we don't have a convention for marking a rel as dummy except by
236          * assigning a dummy path to it.
237          */
238         set_dummy_rel_pathlist(rel);
239     }

and does not create paths for relations excluded by constraints

295 /*
296 * set_rel_pathlist
297 *    Build access paths for a base relation
298 */
299 static void
300 set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
301                  Index rti, RangeTblEntry *rte)
302 {
303     if (IS_DUMMY_REL(rel))
304     {
305         /* We already proved the relation empty, so nothing more to do */
306     }

Same in the case of join in mark_join_rel()

663      * JOIN_INNER.
664      */
665     switch (sjinfo->jointype)
666     {
667         case JOIN_INNER:
668             if (is_dummy_rel(rel1) || is_dummy_rel(rel2) ||
669                 restriction_is_constant_false(restrictlist, false))
670             {
671                 mark_dummy_rel(joinrel);
672                 break;
673             }
674             add_paths_to_joinrel(root, joinrel, rel1, rel2,
675                                  JOIN_INNER, sjinfo,
676                                  restrictlist);
677             add_paths_to_joinrel(root, joinrel, rel2, rel1,
678                                  JOIN_INNER, sjinfo,
679                                  restrictlist);
680             break;

BTW, on a side note, I noticed, we use mark_dummy_rel() for joinrels and set_dummy_rel_pathlist() for baserel. Shouldn't we be using the same function everywhere for the same functionality (e.g. adding an append path with no child-paths).

For larger relations, the time spent in constraint exclusion might be lesser than the time taken by actual table/index scan and that to when such a scan is not going to produce any rows. So, we might want to apply the technique only when the estimated number of rows/pages are above a certain threshold and may be when the GUC is ON (like we do it today).

> I can see some more benefits. We can use it to eliminate joins where the
> constraints on joining relations may cause the join to be empty e.g.

... and applying constraint exclusion to join relations would make that
problem even worse.

The same case goes with joins, where potentially, the crossproduct of two tables is huge.

regards, tom lane

Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pgsql-hackers by date:

From: Craig Ringer
Date: 08 July 2014, 07:50:09
Subject: Re: [PATCH] Improve bytea error messages

From: Ashoke
Date: 08 July 2014, 08:22:08
Subject: Modifying update_attstats of analyze.c for C Strings

Re: Extending constraint exclusion for implied constraints/conditions - Mailing list pgsql-hackers

Previous

Next