On Sat, Oct 19, 2024 at 6:00 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> Generally, a hash value doesn't 100% guarantee the uniqueness of a node
> identification. Also, RelOptInfo corresponds to a subtree in the final
> plan, and sometimes, it takes work to find which node in the partially
> executed plan corresponds to this specific estimation on row number
> during selectivity estimation. Remember parameterised paths - you should
> attach some signature for each path. So, it is not a fully strict method.
> If you are interested, I can perhaps explain the method a little bit
> more at some meetup.

Yeah, I agree that this is not the best method. While it's true that
you could get a false match in case of a hash value collision, IMHO
the bigger problem is that it seems like an expensive way of
determining something that we really should know already. If the user
types the same query, mentioning the same relations, in the same
order, with the same constructs around them, it's hard to believe that
hashing is the cheapest way of matching up the old and new ones. I'm
not sure exactly what we should do instead, but it feels like we more
or less have this information during parsing and then we lose track of
it as the query goes through the rewrite and planning phases.

--
Robert Haas
EDB: http://www.enterprisedb.com