Re: magical eref alias names - Mailing list pgsql-hackers

From Robert Haas
Subject Re: magical eref alias names
Date
Msg-id CA+TgmoY1MpT=Wf_kAFD6R3oah1J=nkp0jyXd8Ry55d_ip0Gs6A@mail.gmail.com
Whole thread Raw
In response to Re: magical eref alias names  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: magical eref alias names
List pgsql-hackers
On Sun, Dec 22, 2024 at 12:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Hmm, OK, that's useful. But I guess I'm still puzzled about the theory
> > here. A name like *VALUES* doesn't seem like it was created with the
> > idea of referring to it from some other part of your query. I do take
> > your point that it works and somebody's probably relying on it, but I
> > don't think you'd pick a name that requires quoting if you were trying
> > to make it easy to use that name in the query.
>
> As you say, some of this is lost in the mists of time; but I think the
> idea was specifically that these names should *not* be easy to type,
> because we don't want them to conflict with any relation alias that
> the user is likely to pick.  I'm fairly sure that the SQL spec
> actually has verbiage to the effect of "the implementation shall
> select a name that does not conflict with any other name in the query".

OK, but picking names that the user probably won't use is neither
necessary nor sufficient to guarantee uniqueness.

> > You might possibly also
> > try to generate names that are easy for users to guess, and distinct.
>
> Yeah, per spec they should be distinct; but somebody didn't bother
> with that early on, and now we've ended up in a situation where
> ruleutils.c does it instead.  I'm not sure that that's terribly evil.
> In particular, in a situation where we're trying to show a plan for
> a query with inlined views, EXPLAIN would probably have to have code
> to unique-ify the names anyway --- there's no way we're going to make
> these nonce names globally unique, so the view(s) might contain names
> that conflict with each other or the outer query.

When you say "there's no way we're going to make these nonce names
globally unique," is that because you think it would be too costly
from a performance perspective (which was my concern) or that you
think it's flat-out impossible for some reason (which doesn't seem to
me to be true)?

> > Yeah, I don't really want to be the one to break somebody's query that
> > explicitly references "*VALUES*" or whatever. At least not without a
> > better reason than I currently have. If this were just a display
> > artifact I think finding some way to clean it up would be pretty
> > worthwhile, but I would need a better reason to break working SQL.
>
> So it seems like we're coming to the conclusion that we don't want
> to change things here?
>
> The reason I want to push for a conclusion is that the discussion
> about "*VALUES*" over at [1] is on hold pending some decision here.
> The v3 patch in that thread is set up in such a way that it improves
> EXPLAIN output without breaking any existing SQL, and that's what
> I'd use if we refrain from changing things here.  But it's a tad
> inconsistent, so if we did decide it was okay to break some edge
> cases, I'd reconsider.

I think that if we can solve multiple problems all at once by breaking
some edge cases, that's worth considering. There probably aren't that
many people who have queries that reference the "*VALUES*" alias, so
if we wanted to change that to "values" I don't see that as completely
out of the question. Somebody will probably be inconvenienced but if
it solves other problems maybe it's worth it. But I don't think that
change by itself really helps anything here.

> Perhaps a similar idea could be applied to the other cases where
> we invent aliases, but it's less obvious how.  For instance in
>
>     SELECT ... UNION SELECT ...
>
> there's no obvious place where we could get names for the
> two sub-SELECTs.

If we had global uniqueness, or even per-subquery-level uniqueness,
this would sort itself out somewhat nicely, I think. We would just end
up with select_1 and select_2 or union_1 and union_2 or something like
that, and I think that would be strictly better than the status quo
where we do sometimes generate *SELECT* %d, but also sometimes just
*SELECT* and other times unnamed_subquery, and also only ever *VALUES*
and not *VALUES* %d.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Álvaro Herrera
Date:
Subject: Re: Allow NOT VALID foreign key constraints on partitioned tables.
Next
From: Robert Haas
Date:
Subject: Re: Alias of VALUES RTE in explain plan