Re: Controlling changes in plpgsql variable resolution - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Controlling changes in plpgsql variable resolution |
Date | |
Msg-id | 603c8f070910181253s7d5f9e61gd401715149c3f678@mail.gmail.com Whole thread Raw |
In response to | Controlling changes in plpgsql variable resolution (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Controlling changes in plpgsql variable resolution
|
List | pgsql-hackers |
On Sun, Oct 18, 2009 at 1:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > As most of you will recall, plpgsql currently acts as though identifiers > in SQL queries should be resolved first as plpgsql variable names, and > only failing that do they get processed as names of the query. The > plpgsql parser rewrite that I'm working on will fix that for the > obviously-silly cases where a plpgsql variable is substituted for a > table name or some other non-scalar-variable identifier. However, what > should we do when a name could represent either a plpgsql variable > or a column of the query? Historically we've resolved it as the > plpgsql variable, but we've sure heard a lot of complaints about that. > Oracle's PL/SQL has the precedence the other way around: resolve first > as the query column, and only failing that as a PL variable. The Oracle > behavior is arguably less surprising because the query-provided names > belong to the nearer enclosing scope. I believe that we ought to move > to the Oracle behavior over time, but how do we get there from here? > Changing it is almost surely going to break a lot of people's functions, > and in rather subtle ways. > > I think there are basically three behaviors that we could offer: > > 1. Resolve ambiguous names as plpgsql (historical PG behavior) > 2. Resolve ambiguous names as query column (Oracle behavior) > 3. Throw error if name is ambiguous (useful for finding problems) > > (Another possibility is to throw a warning but proceed anyway. It would > be easy to do that if we proceed with the Oracle behavior, but *not* > easy if we proceed with the historical PG behavior. The reason is that > the code invoked by transformColumnRef may have already made some > side-effects on the query tree. We discussed the implicit-RTE behavior > yesterday, but there are other effects of a successful name lookup, > such as marking columns for privilege checking.) > > What I'm wondering about at the moment is which behaviors to offer and > how to control them. The obvious answer is "use a GUC" but that answer > scares me because of the ease with which switching between #1 and #2 > would break plpgsql functions. It's not out of the question that that > could even amount to a security problem. I could see using a GUC to > turn the error behavior (#3) on and off, but not to switch between #1 > and #2. > > Another possibility is to control it on a per-function basis by adding > some special syntax to plpgsql function bodies to say which behavior > to use. We could for instance extend the never-documented "#option" > syntax. This is pretty ugly and would be inconvenient to use too > --- if people have to go and add "#option something" to a function, > they might as well just fix whatever name conflicts it has instead. > > I'm not seeing any choice that seems likely to make everybody happy. > Any comments or ideas? If we just change the default behavior from #1 to #2, it's going to be insanely easy to dump a database using pg_dump for 8.4, restore into an 8.5 database, and end up with a function that does something different and broken. So I'm opposed to that plan, but amenable to any of the other options in varying degrees. I think it would make a fair amount of sense to make #3 the default behavior. If possible, I think we should try to engineer things so that using pg_dump 8.5 on an 8.4 database and restoring the result into an 8.5 database produces a function with identical semantics. ...Robert
pgsql-hackers by date: