Hi Tom & David & Bapat:
Thanks for your review so far. I want to summarize the open issues to help
our further discussion.
1. Shall we bypass the AggNode as well, using the same logic?
I think so, since the rules for bypassing an AggNode and a UniqueNode are exactly the same.
The difficulty of bypassing AggNode is that the current aggregation function call is closely
coupled with AggNode. In the past few days, I have made the aggregation call able to
run without AggNode (at least I tested sum (without a final function) and avg (with a final function)).
But there are still a few things to do, such as the ACL check, the anynull check and maybe more checks.
There is also some MemoryContext mess that needs to be fixed.
I still need some time for this goal, so I think its complexity deserves another thread
to discuss it. Any thoughts?
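To illustrate why the same rule applies, here is a minimal sketch. It uses SQLite through Python only as a convenient stand-in, not the PostgreSQL planner, and the table t and its columns are made up for the example. When the GROUP BY key is unique, every group has exactly one row, so sum and avg reduce to per-row expressions. The column v is declared NOT NULL, so the sketch says nothing about the anynull handling mentioned above.

```python
import sqlite3

# Hypothetical table: pk is unique, v is never NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (pk INTEGER PRIMARY KEY, v INTEGER NOT NULL);
    INSERT INTO t VALUES (1, 10), (2, 20), (3, 30);
""")

# Grouping on a unique key: each group contains a single row.
with_agg = conn.execute(
    "SELECT pk, sum(v), avg(v) FROM t GROUP BY pk ORDER BY pk"
).fetchall()

# The same result computed row by row, with no grouping step at all.
without_agg = conn.execute(
    "SELECT pk, v, v * 1.0 FROM t ORDER BY pk"
).fetchall()

print(with_agg == without_agg)
```

The aggregate still has to be evaluated once per row (including any final function), which is why the function call needs to work outside AggNode.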
2. Shall we use the UniquePath as David suggested?
Actually I am leaning toward this approach now. David, can you share more insights about the
benefits of UniquePath? Costing size should be one of them; another may be
changing the semi join to a normal join, as the current innerrel_is_unique does. Any others?
3. Can we make the rule more general?
Currently it requires that every relation yield a unique result. David & Bapat provided another example:
select m2.pk from m1, m2 where m1.pk = m2.non_unique_key;
That's interesting and not easy to handle in my current framework. This is another reason
I want to adopt the UniquePath framework.
Do we have any other rules to think about before implementing it?
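To make the example concrete, here is a small sketch, again using SQLite through Python only as a stand-in, with made-up rows. Since m1.pk is unique, each m2 row can match at most one m1 row, so m2.pk stays unique in the join result even though m1.pk does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE m1 (pk INTEGER PRIMARY KEY);
    CREATE TABLE m2 (pk INTEGER PRIMARY KEY, non_unique_key INTEGER);
    INSERT INTO m1 VALUES (1), (2);
    INSERT INTO m2 VALUES (10, 1), (11, 1), (12, 2), (13, 3);
""")

join_tail = "FROM m1, m2 WHERE m1.pk = m2.non_unique_key"

# m2.pk with and without DISTINCT: identical, so the DISTINCT is redundant.
plain = sorted(r[0] for r in conn.execute("SELECT m2.pk " + join_tail))
distinct = sorted(r[0] for r in conn.execute("SELECT DISTINCT m2.pk " + join_tail))

# m1.pk, by contrast, is duplicated by the join.
m1_side = sorted(r[0] for r in conn.execute("SELECT m1.pk " + join_tail))

print(plain, distinct, m1_side)
```

So a rule that only looks at whether every relation is unique on its own would miss this case: uniqueness here survives for one side of the join only.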
Thanks for your feedback.
This should be ok. The time spent in annotating a RelOptInfo about
uniqueness is not going to be a lot. But doing so would help generic
elimination of Distinct/Group/Unique operations. What is
UniquePathKey? I didn't find this in your patch or in the code.
This is a proposal from David, so it is not in the current patch/code :)
Regards
Andy Fan