Re: magical eref alias names - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: magical eref alias names |
Date | |
Msg-id | CA+TgmobkyBFyXzi_d6QtcM55zixYMwGLkgQRTpdphrE2a-5OUw@mail.gmail.com Whole thread Raw |
In response to | Re: magical eref alias names (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: magical eref alias names
|
List | pgsql-hackers |
On Thu, Jan 2, 2025 at 3:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Global uniqueness across the database (not single queries) would be > needed to prevent cases where different views use the same generated > names. The only way I can see to do that without nasty performance > costs is to use something like an OID counter. Which would mean > that rather than nice names like "union_1", "union_2", etc, you'd > soon be looking at "union_5846926". I don't think anyone would > find that to be an improvement on what we're doing now. Oh, I agree, but I don't see why anyone would care whether rel names are unique across different queries. When I mentioned global uniqueness, I meant unique within a query, like what set_rtable_names() does after the fact. > > If we had global uniqueness, or even per-subquery-level uniqueness, > > this would sort itself out somewhat nicely, I think. We would just end > > up with select_1 and select_2 or union_1 and union_2 or something like > > that, and I think that would be strictly better than the status quo > > where we do sometimes generate *SELECT* %d, but also sometimes just > > *SELECT* and other times unnamed_subquery, and also only ever *VALUES* > > and not *VALUES* %d. > > I'll concede that it'd be nicer. But I'm not convinced it'd be enough > nicer to justify the costs of changing. We've been doing it this way > for a long time, and AFAIR you're the first to complain about it. I'm not sure it's enough nicer to justify the cost of changing, either. I do think that "you're the first to complain about it" is not a convincing argument, here or in general. Every time somebody reports a new problem, they are the first to complain about it, but that does make the problem any less real. The reason that I got interested in this problem was because of the thread about allowing extensions to control planner behavior. I wrote some words about it over there, but it's been a while so those thoughts might not be entirely up to date. What I've found is that it's a lot easier to insert a hook or three to allow an extension to control the planner behavior than it is to get those hooks to do anything useful, and that's precisely because it is hard to find a useful way to identify a particular part of the query plan. If the query planner were a person sitting next to you at your computer, you could point at the screen with your finger and say "hey, do you see this part of the EXPLAIN plan right here? could you try making this a sequential scan rather than an index scan?" and the planner could say "sure, let me re-plan it that way and I'll show you how it turns out". Since the planner is incorporeal and cannot see your finger, you want to identify "this part of the EXPLAIN plan right here" in some other way, like with a name, but the names known at parsing time and planning time are not unique and need not match what shows up in the EXPLAIN output. Concretely, if the user creates a partitioned table called *VALUES* with several partitions and then creates another such table, also called *VALUES*, in a different schema, and then joins the two of them together; and then does UNION ALL with a subquery that actually uses VALUES (), and then somewhere includes a subquery that also queries one of the tables actually called *VALUES*, you cannot meaningfully use the label *VALUES* to identify one particular scan; and you can't use anything that appears in the EXPLAIN output for that purpose either because the next planning cycle won't have those labels available *during planning* when it needs to honor whatever plan-shape requests were made. Now, I'm firmly convinced that this is a real problem and worth solving, but let me be clear that I don't think the solution is anywhere on this thread, nor do I think that it is simple. My original proposal of getting rid of system-generated fake names isn't necessary, because you very helpfully pointed out that I can look at whether RTE->alias->aliasname exists to figure that out. Also, enforcing global uniqueness across the query at parse time wouldn't actually help, because when we apply rewriter rules we can introduce new relations into the query, and then when we expand inheritance hierarchies we can introduce even more new relations into the query; and what the user will care about if they want to modify "that portion of the plan right there" is what ultimately ends up in the plan, not what was in the query at parse time. So as far as I am concerned, this thread has served the useful purpose of making me smarter and I can't really see a way for it to do anything more than that; but if it's given you a clever idea for something that we could change in the code, I'm certainly happy to hear about that. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: