Extending constraint exclusion for implied constraints/conditions - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Extending constraint exclusion for implied constraints/conditions
Date
Msg-id CAFjFpRdZ3sQez36NBksEKFuzRWPyDjDrP221ejCdMOoKOZgGMQ@mail.gmail.com
Whole thread Raw
Responses Re: Extending constraint exclusion for implied constraints/conditions  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,
Here's a table I have
postgres=# \d+ tab1
                         Table "public.tab1"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 val    | integer |           | plain   |              |
 val2   | integer |           | plain   |              |
Check constraints:
    "tab1_val2_check" CHECK (val2 < 30)
    "tab1_val_check" CHECK (val > 30)

and I run following query on it,
postgres=# set constraint_exclusion to on;
SET
postgres=# explain verbose select * from tab1 where val = val2;
                         QUERY PLAN                         
-------------------------------------------------------------
 Seq Scan on public.tab1  (cost=0.00..36.75 rows=11 width=8)
   Output: val, val2
   Filter: (tab1.val = tab1.val2)
 
Given the constraints and the condition in WHERE clause, it can be inferred that the query will not return any row. However, PostgreSQL scans the table as seen in the plan.

Right now, constraint exclusion code looks only at the provided conditions. If we want avoid table scan based on constraints in the above example, it will need to look at the implied conditions as well. E.g. val2 < 30 AND val = val2 => val < 30. Then the constraint exclusion can see that val < 30 AND val > 30 are contradictory and infer that the result is going to be empty. We will need to add information about the transitive inferences between operators. Can we do that in PostgreSQL? Will the benefits be worth the code, that needs to be added?

I can see some more benefits. We can use it to eliminate joins where the constraints on joining relations may cause the join to be empty e.g.
postgres=# \d+ tab2
                         Table "public.tab2"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 val    | integer |           | plain   |              |
Check constraints:
    "tab2_val_check" CHECK (val < 30)

postgres=# \d+ tab1
                         Table "public.tab1"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 val    | integer |           | plain   |              |
Check constraints:
    "tab1_val_check" CHECK (val > 30)

and the query with join is -  select * from tab1, tab2 where tab1.val = tab2. val OR select * from tab1, tab2 where tab1.val > tab2.val.
 
This can be extended further for the case of join between two parent tables, where we will be eliminate joins between some pairs of children. There are limitations in this case though,
1. Benefits of this optimization will be visible in case of nested loop joins. Hash and Merge join implicitly eliminate the non-joining children.
2. We need a way to push join down append path, which PostgreSQL doesn't do today.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: 9.4 documentation: duplicate paragraph in logical decoding example
Next
From: Abhijit Menon-Sen
Date:
Subject: Re: 9.5 CF1