Re: magical eref alias names - Mailing list pgsql-hackers

From Tom Lane
Subject Re: magical eref alias names
Date
Msg-id 1050903.1734889532@sss.pgh.pa.us
Whole thread Raw
In response to Re: magical eref alias names  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
[ this thread seems to have stalled out, but we need to resolve it ]

Robert Haas <robertmhaas@gmail.com> writes:
>>> What I'm unhappy about is not being able to tell the difference
>>> between a name that was invented by or at least meaningful to the user
>>> and one that isn't.

>> You can already tell that, by looking to see whether
>> RTE->alias->aliasname exists.  eref is meant to be the resolved
>> name-to-use not the user's original input.

> Hmm, OK, that's useful. But I guess I'm still puzzled about the theory
> here. A name like *VALUES* doesn't seem like it was created with the
> idea of referring to it from some other part of your query. I do take
> your point that it works and somebody's probably relying on it, but I
> don't think you'd pick a name that requires quoting if you were trying
> to make it easy to use that name in the query.

As you say, some of this is lost in the mists of time; but I think the
idea was specifically that these names should *not* be easy to type,
because we don't want them to conflict with any relation alias that
the user is likely to pick.  I'm fairly sure that the SQL spec
actually has verbiage to the effect of "the implementation shall
select a name that does not conflict with any other name in the query".

> You might possibly also
> try to generate names that are easy for users to guess, and distinct.

Yeah, per spec they should be distinct; but somebody didn't bother
with that early on, and now we've ended up in a situation where
ruleutils.c does it instead.  I'm not sure that that's terribly evil.
In particular, in a situation where we're trying to show a plan for
a query with inlined views, EXPLAIN would probably have to have code
to unique-ify the names anyway --- there's no way we're going to make
these nonce names globally unique, so the view(s) might contain names
that conflict with each other or the outer query.

>> ... We're working with a ton of history here,
>> and I'm not really convinced that change will be change for the
>> better.

> Yeah, I don't really want to be the one to break somebody's query that
> explicitly references "*VALUES*" or whatever. At least not without a
> better reason than I currently have. If this were just a display
> artifact I think finding some way to clean it up would be pretty
> worthwhile, but I would need a better reason to break working SQL.

So it seems like we're coming to the conclusion that we don't want
to change things here?

The reason I want to push for a conclusion is that the discussion
about "*VALUES*" over at [1] is on hold pending some decision here.
The v3 patch in that thread is set up in such a way that it improves
EXPLAIN output without breaking any existing SQL, and that's what
I'd use if we refrain from changing things here.  But it's a tad
inconsistent, so if we did decide it was okay to break some edge
cases, I'd reconsider.

The reason that patch can (mostly) assign some other names to
"*VALUES*" without breaking things is that we treat a VALUES
clause as an implicit sub-select:

    SELECT * FROM (VALUES (1,2),...) v;

is parsed as though it were more or less

    SELECT * FROM (SELECT * FROM VALUES (1,2),... AS "*VALUES*") AS v;

and the "*VALUES*" alias is not referenceable except within that
implicit sub-SELECT.  The only place anybody notices it is in
EXPLAIN, which has to print the base relation alias because "v"
is usually gone due to flattening.  So we can change the base
relation alias to "v" without breaking anything in the original
query (except in some very weird edge cases), and then EXPLAIN
will talk about "v" not "*VALUES*".

Perhaps a similar idea could be applied to the other cases where
we invent aliases, but it's less obvious how.  For instance in

    SELECT ... UNION SELECT ...

there's no obvious place where we could get names for the
two sub-SELECTs.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CAExHW5sh28_gwQP4%3DX4i4kMsAYaoSi3tsNse3LaTihV%3DeWuTcA%40mail.gmail.com



pgsql-hackers by date:

Previous
From: Marcos Pegoraro
Date:
Subject: Re: Document NULL
Next
From: "David G. Johnston"
Date:
Subject: Re: Document NULL