Re: post-freeze damage control - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: post-freeze damage control
Msg-id 0c1cfa72-4fa4-4d98-a5e5-30c92e97ce63@postgrespro.ru
In response to Re: post-freeze damage control  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: post-freeze damage control
List pgsql-hackers
On 9/4/2024 12:55, Tom Lane wrote:
> Andrei Lepikhov <a.lepikhov@postgrespro.ru> writes:
>>> * I really, really dislike jamming this logic into prepqual.c,
>>> where it has no business being.  I note that it was shoved
>>> into process_duplicate_ors without even the courtesy of
>>> expanding the header comment:
> 
>> Yeah, I preferred to do it in parse_expr.c with the assumption of some
>> 'minimal' or 'canonical' tree form.
> 
> That seems quite the wrong direction to me.  AFAICS, the argument
> for making this transformation depends on being able to convert
> to an indexscan condition, so I would try to apply it even later,
> when we have a set of restriction conditions to apply to a particular
> baserel.  (This would weaken the argument that we need hashing
> rather than naive equal() tests even further, I think.)  Applying
> the transform to join quals seems unlikely to be a win.
Our first prototype did this job right at the stage of index path 
creation. Unfortunately, this approach was too narrow and expensive.
The most problematic cases we encountered came from BitmapOr paths: if
an incoming query has a large number of OR clauses, the optimizer
spends a lot of time (and memory) generating paths that are, in most
cases, useless. Imagine how much worse the situation becomes when we
multiply it by the number of partitions.
Another issue this transformation resolved: a shorter list of clauses
speeds up planning and sometimes makes cardinality estimation more
accurate.
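
To make the "shorter list of clauses" point concrete, here is a toy sketch of the grouping step. It is not planner code: the (column, constant) pairs and the name group_or_equalities are assumptions made up for illustration, standing in for the arguments of an OR clause.

```python
from collections import defaultdict


def group_or_equalities(or_args):
    """Collapse OR'ed `column = constant` terms into one value list per column.

    `or_args` is a list of (column, constant) pairs, a hypothetical stand-in
    for the arguments of an OR clause; it is not a real planner structure.
    """
    grouped = defaultdict(list)
    for column, constant in or_args:
        if constant not in grouped[column]:  # drop duplicate constants
            grouped[column].append(constant)
    # Each entry reads as `column = ANY (values)`; a single-value entry
    # would simply stay a plain equality.
    return dict(grouped)


or_args = [("a", 1), ("a", 2), ("b", 7), ("a", 1), ("a", 3)]
rewritten = group_or_equalities(or_args)
print(len(or_args), "OR arguments ->", len(rewritten), "clauses:", rewritten)
```

Five OR arguments shrink to two clauses here, so everything downstream (path generation, selectivity estimation) has a shorter list to walk.
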
Moreover, it helps even SeqScan: probing a hashed array for a value is
much faster than evaluating a long OR expression for each incoming
tuple.
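
A rough illustration of the per-tuple cost difference, again as a toy model rather than executor code (matches_or_chain and make_hashed_matcher are hypothetical names): the OR chain is checked term by term for every tuple, while the hashed form builds its table once and then does a single probe per tuple.

```python
def matches_or_chain(value, constants):
    """Evaluate `x = c1 OR x = c2 OR ...` term by term: O(n) per tuple."""
    for c in constants:
        if value == c:
            return True
    return False


def make_hashed_matcher(constants):
    """Build the hash table once, then probe it: O(1) expected per tuple."""
    table = frozenset(constants)
    return lambda value: value in table


constants = list(range(0, 2000, 2))  # 1000 even constants
matches_hashed = make_hashed_matcher(constants)

# Both forms must give the same answer for every "tuple".
tuples = range(5000)
same = all(matches_or_chain(t, constants) == matches_hashed(t) for t in tuples)
print("results agree:", same)
```
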

One more idea that I have set aside here is that the planner could
make use of quick clause hashing:
From time to time on the mailing list I see disputes over different
approaches to expression transformation/simplification/grouping, and
most of the time they come down to the problem of search complexity.
A clause hash could be a way to solve this, couldn't it?
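
A minimal sketch of what I mean by clause hashing, under loud assumptions: clauses are modelled as nested Python tuples, clause_hash is a hypothetical helper, and Python's `==` plays the role of a full structural equality test. The hash only narrows the candidates; the full comparison still confirms a match.

```python
def clause_hash(clause):
    """Structural hash of a clause modelled as nested tuples, e.g. ("=", "a", 1)."""
    return hash(clause)


def dedup_naive(clauses):
    """Compare each clause against everything kept so far: O(n^2) comparisons."""
    kept, comparisons = [], 0
    for c in clauses:
        duplicate = False
        for k in kept:
            comparisons += 1
            if c == k:
                duplicate = True
                break
        if not duplicate:
            kept.append(c)
    return kept, comparisons


def dedup_hashed(clauses):
    """Run the full comparison only against clauses with the same hash."""
    buckets, kept, comparisons = {}, [], 0
    for c in clauses:
        bucket = buckets.setdefault(clause_hash(c), [])
        duplicate = False
        for k in bucket:
            comparisons += 1
            if c == k:
                duplicate = True
                break
        if not duplicate:
            bucket.append(c)
            kept.append(c)
    return kept, comparisons


# 500 clauses, only 50 of them distinct.
clauses = [("=", "a", i % 50) for i in range(500)]
naive_kept, naive_cmp = dedup_naive(clauses)
hashed_kept, hashed_cmp = dedup_hashed(clauses)
print("full comparisons, naive vs hashed:", naive_cmp, hashed_cmp)
```

Both variants keep the same clauses in the same order; the difference is only in how many full comparisons are needed to get there.
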

-- 
regards,
Andrei Lepikhov
Postgres Professional



