Re: Reduce "Var IS [NOT] NULL" quals during constant folding - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: Reduce "Var IS [NOT] NULL" quals during constant folding
Msg-id a5f93486-fcc8-45a5-a62e-86051fdd7142@gmail.com
In response to Re: Reduce "Var IS [NOT] NULL" quals during constant folding  (Richard Guo <guofenglinux@gmail.com>)
List pgsql-hackers
On 3/7/2025 02:30, Richard Guo wrote:
> On Wed, Jul 2, 2025 at 6:44 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> I apologise for the confusion in my previous message. I am not
>> suggesting that we postpone this. Instead, I would like an explanation
>> of why you believe that accessing the table statistics earlier could
>> negatively impact planner performance. As I mentioned before, I have
>> only envisioned rare instances where join eliminations may reduce the
>> number of relations and clause evaluations resulting in a constant.
> 
> I wonder how you arrived at the conclusion that these cases are rare.
> If they truly are, then why have we invested so much effort in
> optimizing for them?
There is no direct connection between effort and frequency; it primarily 
depends on personal desire. As you might find, much of the effort goes 
into convincing the community.
These specific cases should be rare from the Postgres perspective: the
planner's code is kept simple on the assumption that crafting the
appropriate query is the user's responsibility.

> 
> I also wonder why you think we should collect all catalog information
> at the very early stage of the planner, given that most of it is only
> used much later -- after RelOptInfos have been created.  If the goal
> is to avoid redundant catalog retrieval for the same relation in
> get_relation_info(), perhaps adding a caching mechanism within that
> function would be a more targeted solution.  I don't see a strong
> reason for moving get_relation_info() to the very beginning of the
> planner.
This indicates that there is still room for further exploration and 
discussion. For starters, the 'Redundant NullTest' issue is not the only 
concern. Additionally, Postgres applies pull-up transformations blindly,
without considering the cost model. However, each pull-up has its corner
cases, and in practice, we often see new complaints arise after a new
pull-up technique is committed. One possible solution I envision could
be to examine indexes and/or make rough initial estimates to avoid
problematic pull-up cases.

-- 
regards, Andrei Lepikhov


