> On May 1, 2025, at 16:33, Richard Guo <guofenglinux@gmail.com> wrote:
>
> Here is the patchset that implements this optimization. 0001 moves
> the expansion of virtual generated columns to occur before sublink
> pull-up. 0002 introduces a new function, preprocess_relation_rtes,
> which scans the rangetable for relation RTEs and performs inh flag
> updates and virtual generated column expansion in a single loop, so
> that only one table_open/table_close call is required for each
> relation. 0003 collects NOT NULL attribute information for each
> relation within the same loop, stores it in a relation OID based hash
> table, and uses this information to reduce NullTest quals during
> constant folding.
>
> I think the code now more closely resembles the phase 1 and phase 2
> described earlier: it collects all required early-stage catalog
> information within a single loop over the rangetable, allowing each
> relation to be opened and closed only once. It also avoids the
> has_subclass() call along the way.
>
> Thanks
> Richard
>
<v4-0001-Expand-virtual-generated-columns-before-sublink-p.patch>
<v4-0002-Centralize-collection-of-catalog-info-needed-earl.patch>
<v4-0003-Reduce-Var-IS-NOT-NULL-quals-during-constant-fold.patch>
Hi,
I've been following the v4 patches (focusing on 1 and 2 for now). Patch 2's preprocess_relation_rtes is a nice
improvement for efficiently gathering early catalog info, such as inh and attgenerated definitions, in one pass.
However, Patch 1 needs to add expansion calls inside specific pull-up functions (like convert_EXISTS_sublink_to_join)
because the main expansion work was moved before pull_up_sublinks.
Could we perhaps simplify this? What if preprocess_relation_rtes only collected the attgenerated definitions (storing
them, maybe in a hashtable like the one planned for attnotnull in Patch 3), but didn't perform the actual expansion
(Var replacement)?
Then we could perform the actual expansion (Var replacement) in a separate, single, global step later on, perhaps
after pull_up_sublinks (closer to the original timing), or maybe even later still, for instance after
flatten_simple_union_all, once the main query structure including pulled-up subqueries/links has stabilized. A unified
expansion after the major structural changes seems cleaner, though I'm not sure yet which position is best.
This might avoid the need for the extra expansion calls within convert_EXISTS_sublink_to_join, etc., keeping the
information gathering separate from the expression transformation and potentially making the overall flow a bit cleaner.
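To make the idea a bit more concrete, here is a toy sketch in plain C. It is not PostgreSQL code and does not use any
planner APIs: GenInfoTable, ToyVar, collect_gen_info and expand_all_vars are made-up names, the "hash table" is just
a small array, and expressions are plain strings. It only illustrates the split I have in mind: phase 1 records the
generation expressions per relation, and a single later pass does the Var replacement.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RELS  8
#define MAX_ATTRS 8

/* Hypothetical per-relation info gathered in phase 1 (stand-in for a hash entry). */
typedef struct RelGenInfo
{
    unsigned    relid;                  /* toy relation identifier */
    const char *genexpr[MAX_ATTRS];     /* NULL if the column is not virtual generated */
} RelGenInfo;

typedef struct GenInfoTable
{
    int         nrels;
    RelGenInfo  rels[MAX_RELS];
} GenInfoTable;

/* Toy stand-in for a Var node referencing (relid, attno). */
typedef struct ToyVar
{
    unsigned    relid;
    int         attno;
    const char *expanded;               /* NULL until replaced in phase 2 */
} ToyVar;

/* Phase 1: only record the generation expression; no Var replacement here. */
static void
collect_gen_info(GenInfoTable *tab, unsigned relid, int attno, const char *expr)
{
    RelGenInfo *info = NULL;

    assert(attno >= 0 && attno < MAX_ATTRS);
    for (int i = 0; i < tab->nrels; i++)
    {
        if (tab->rels[i].relid == relid)
            info = &tab->rels[i];
    }
    if (info == NULL)
    {
        /* First time we see this relation: create an empty entry. */
        assert(tab->nrels < MAX_RELS);
        info = &tab->rels[tab->nrels++];
        info->relid = relid;
        for (int a = 0; a < MAX_ATTRS; a++)
            info->genexpr[a] = NULL;
    }
    info->genexpr[attno] = expr;
}

/* Phase 2: one global pass over all Vars; returns how many were expanded. */
static int
expand_all_vars(const GenInfoTable *tab, ToyVar *vars, int nvars)
{
    int         nexpanded = 0;

    for (int v = 0; v < nvars; v++)
    {
        if (vars[v].attno < 0 || vars[v].attno >= MAX_ATTRS)
            continue;                   /* out of range for this toy model */
        for (int i = 0; i < tab->nrels; i++)
        {
            const RelGenInfo *info = &tab->rels[i];

            if (info->relid == vars[v].relid &&
                info->genexpr[vars[v].attno] != NULL)
            {
                vars[v].expanded = info->genexpr[vars[v].attno];
                nexpanded++;
                break;
            }
        }
    }
    return nexpanded;
}
```

In this toy model the pull-up steps would run between the two phases without needing any expansion calls of their own,
since every Var that survives restructuring is handled by the one pass at the end. Whether that holds up in the real
planner is exactly the open question above.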
Any thoughts?
Thanks,
Chengpeng Yan