On Fri, Jul 29, 2022 at 12:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> It would not be profitable to flatten the range table before we've
> done remove_useless_joins. We'd end up with useless entries from
> subqueries that ultimately aren't there. We could perhaps do it
> after we finish that phase, but I don't really see the point: it
> wouldn't be better than what we do now, just the same work at a
> different time.

That's not quite my question, though. Why do we ever build a non-flat
range table in the first place? Like, instead of assigning indexes
relative to the current subquery level, why not just assign them
relative to the whole query from the start? It can't really be that
we've done it this way because of remove_useless_joins(), because
we've been building separate range tables and later flattening them
for longer than join removal has existed as a feature.

What bugs me is that it's very much not free. By building a bunch of
separate range tables and combining them later, we generate extra
work: we have to go back and adjust RT indexes after the fact. We pay
that overhead for every query, not just the ones that end up with some
unused entries in the range table. And why would it matter if we did
end up with some useless entries in the range table, anyway? If
there's some semantic difference, we could add a flag to mark those
entries as needing to be ignored, which seems way better than crawling
all over the whole tree adjusting RTIs everywhere.
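
To make the comparison concrete, here is a toy sketch in Python. It is not
PostgreSQL code: Var is reduced to two fields, and flatten() and
drop_by_flag() are made-up names used only for illustration. The first
function models building per-level range tables and renumbering afterwards;
the second models a single query-wide range table where a useless entry is
merely flagged.

```python
from dataclasses import dataclass


@dataclass
class Var:
    rti: int    # 1-based range table index
    attno: int  # column number within that relation


def flatten(levels):
    """Scheme 1: per-level range tables, combined afterwards.

    `levels` is a list of (rtable, vars) pairs, with each Var's rti
    relative to its own level. Every Var has to be visited and offset.
    """
    flat_rtable, flat_vars = [], []
    for rtable, level_vars in levels:
        offset = len(flat_rtable)  # entries already placed before this level
        flat_rtable.extend(rtable)
        flat_vars.extend(Var(v.rti + offset, v.attno) for v in level_vars)
    return flat_rtable, flat_vars


def drop_by_flag(rtable_len, useless_rtis):
    """Scheme 2: one query-wide range table, indexes assigned up front.

    A useless entry is only flagged; no Var is touched.
    """
    return [rti in useless_rtis for rti in range(1, rtable_len + 1)]


# Hypothetical two-level query: an outer level and one subquery.
outer = (["orders", "customers"], [Var(1, 2), Var(2, 1)])
sub = (["items"], [Var(1, 3)])

rtable, all_vars = flatten([outer, sub])
ignored = drop_by_flag(len(rtable), {2})
```

In the first scheme the renumbering pass touches every Var in every query,
whether or not anything was removed. In the second, the only extra state is
one boolean per range table entry.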
I don't really expect that we're ever going to change this -- and
certainly not on this thread. The idea of running around and replacing
RT indexes all over the tree is deeply embedded in the system. But are
we really sure we want to add a second kind of index that we have to
run around and adjust at the same time?

If we are, so be it, I guess. It just looks really ugly and unnecessary to me.
--
Robert Haas
EDB: http://www.enterprisedb.com