Thread: Schemas vs. PostQUEL: resolving qualified identifiers
I'm starting to think about how to do SQL-compatible schemas in Postgres. One of the first issues that comes up is that Postgres has interpretations of qualified (dotted) names that conflict with what the standard says to do. This stuff is a hangover from Berkeley days and PostQUEL. Briefly, since there are no schemas in PG, anytime we have a dotted name we know that the first component *must* be a table name. Successive components are then resolved as either column or function names applied to whatever we have so far. For example, a.b.c is equivalent to c(a.b) if c is a function The equivalence works the other way too: you can write colname(tabname) and it will be taken as tabname.colname (though this seems to work only if tabname already has a rangetable entry). This is not going to fly together with SQL92, which wants a.b.c to mean schema a, table b, column c. I believe we can resolve the conflict without breaking any cases that are likely to be used much in practice. Here's my proposal: First, I'd like to allow the notation "table.*" to be used as a function argument, representing passing a whole-row value to a function (the function's argument would be declared as the table rowtype). Presently this is done by writing just the undecorated table name, but that is ambiguous if the same name is also used as a column name in the query. Also, we need this notation for use with qualified table names. Having done that, we can resolve names appearing in expressions thus: foo First try to resolve as unqualified column name; if not found, try to resolve as unqualified table name (in which case this is a whole-row value, equivalent to foo.*). foo.bar foo is an unqualified table name. Try first to resolve bar as a column name of foo; if not found, tryto resolve bar as a function taking the rowtype of foo. foo.bar.baz This refers to table bar in schema foo. baz is either a column or function name, as above. foo.bar.baz.quux Refers to table baz in schema bar in catalog foo. quux is either a column or function name, as above. foo.* foo is an unqualified table name; means whole-row value. foo.bar.* Whole row of table bar in schema foo. foo.bar.baz.* Whole row of table baz in schema bar in catalog foo. The first two of these rules are the same as current behavior; the next two are additions mandated by SQL92; the last three are a proposed extension to allow unambiguous reference to whole-row values. With these rules, we do not lose any functionality that we have now, but some PostQUEL-ish constructs will need to be rewritten into an equivalent form. PostQUEL-isms that still work: table.func equivalent to func(table.*) func(table) equivalent to func(table.*), as long as table name doesn't match any column name of the query PostQUEL-isms that no longer work: table.col.func Must rewrite as func(table.col) table.func1.func2 Must rewrite as, eg, func2(func1(table.*)) While there are a couple of examples of the above two constructs in the regression tests, I doubt they are used much in the real world. Another PostQUEL-ism that I would like to remove is col(tab) to mean tab.col; I'd prefer to restrict the function syntax to functions only. We could continue to support this for unqualified table names, but given the above rules it could not work for a qualified table name, eg, col(schema.tab) would be misinterpreted. Again, it seems unlikely that many people are using this, so we may as well try to regularize the syntax as much as we can. You may wonder why I'm still allowing the construction tab.func --- why not get rid of that too? Well, mainly because I'd like to support the Oracle-ish syntax for nextval: seqname.nextval. Together with the desire not to break existing code unnecessarily, that seems a good enough reason to leave it in. Comments? regards, tom lane
> PostQUEL-isms that no longer work: > > table.col.func Must rewrite as func(table.col) > > table.func1.func2 Must rewrite as, eg, func2(func1(table.*)) > > While there are a couple of examples of the above two constructs in the > regression tests, I doubt they are used much in the real world. Imho OK. I only need table.func1 . > Another PostQUEL-ism that I would like to remove is col(tab) to mean > tab.col; I'd prefer to restrict the function syntax to functions only. > We could continue to support this for unqualified table names, but > given the above rules it could not work for a qualified table name, > eg, col(schema.tab) would be misinterpreted. Again, it seems unlikely > that many people are using this, so we may as well try to regularize > the syntax as much as we can. Sounds ok too. Imho very hard to grok syntax anyway. What I do not like at all is the notion that "table" == "table".* . (IIRC there has already been some discussion where I objected to that.) "table" as function parameter imho passes an object of type "table" to the function. This involves type checking, and that the function only has one argument. "table".* to the contrary is not an object, but one object (one parameter) per table column. This is imho easier to understand, since select table.* also does it like that. Thus calling func(table.*) should imho rather be mapped to func (table.col1, table.col2 ...). Also remember that the standard now has room for nested types. create table atab ( x int, y int ); ( We use create table, standard iirc uses "create type".) create table btab ( position atab, info text ); How would you map btab.* here ? (x, y, info) ? Imho no. Imho it should map to (atab, info) thus you need the notion of passing an object anyway. Andreas
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: > What I do not like at all is the notion that "table" == "table".* . > (IIRC there has already been some discussion where I objected to that.) > "table" as function parameter imho passes an object of type "table" > to the function. This involves type checking, and that the function only > has one argument. > "table".* to the contrary is not an object, but one object (one parameter) > per table column. This is imho easier to understand, since select table.* > also does it like that. Thus calling func(table.*) should imho rather be > mapped to func (table.col1, table.col2 ...). Okay, but then how will you refer unambiguously to the rowtype object? If you write func(schema.tab) the system will misinterpret it as func(tab.col) --- which, in the worst case, might fail to fail because there actually is such a column. We have to make some compromises here. I'm not all that thrilled with foo.* for rowtype either, but you haven't offered a workable alternative. regards, tom lane
Tom Lane wrote: > > "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: > > What I do not like at all is the notion that "table" == "table".* . > > (IIRC there has already been some discussion where I objected to that.) > > > "table" as function parameter imho passes an object of type "table" > > to the function. This involves type checking, and that the function only > > has one argument. > > > "table".* to the contrary is not an object, but one object (one parameter) > > per table column. This is imho easier to understand, since select table.* > > also does it like that. Thus calling func(table.*) should imho rather be > > mapped to func (table.col1, table.col2 ...). > > Okay, but then how will you refer unambiguously to the rowtype object? > If you write func(schema.tab) the system will misinterpret it as > func(tab.col) --- which, in the worst case, might fail to fail because > there actually is such a column. We have to make some compromises here. > I'm not all that thrilled with foo.* for rowtype either, but you haven't > offered a workable alternative. > What about casting with the keyord ROW? func(ROW table) always refers to the row-type of table "table" even if there is a column called "table". -- Fernando Nasser Red Hat - Toronto E-Mail: fnasser@redhat.com 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9
Fernando Nasser <fnasser@redhat.com> writes: > Tom Lane wrote: >> Okay, but then how will you refer unambiguously to the rowtype object? > What about casting with the keyord ROW? > func(ROW table) > always refers to the row-type of table "table" even if there is > a column called "table". Strikes me as gratuituously different from the way everything else is done. We have .* and %ROWTYPE and so forth, and they're all suffixes. The closest analogy to your ROW syntax is CAST, but it doesn't alter the initial interpretation of its argument. I was toying with the notion of inventing some new notation liketable.** I don't like double-asterisk much, but maybe there's some other symbol we could use here? regards, tom lane
Tom Lane wrote: > > Fernando Nasser <fnasser@redhat.com> writes: > > Tom Lane wrote: > >> Okay, but then how will you refer unambiguously to the rowtype object? > > > What about casting with the keyord ROW? > > func(ROW table) > > always refers to the row-type of table "table" even if there is > > a column called "table". > > Strikes me as gratuituously different from the way everything else is > done. We have .* and %ROWTYPE and so forth, and they're all suffixes. > The closest analogy to your ROW syntax is CAST, but it doesn't alter the > initial interpretation of its argument. > I didn't mean literally that way, I just wanted to add a keyword for solving ambiguity (when there is one). You are right, it should be: func(table%ROWTYPE) -- Fernando Nasser Red Hat - Toronto E-Mail: fnasser@redhat.com 2323 Yonge Street, Suite #300 Toronto, Ontario M4P 2C9