Thread: Re: [GENERAL] Creating temp tables inside read only transactions
Guillaume Lelarge wrote [on pgsql-general]:
> On Thu, 2011-07-07 at 16:01 +0000, mike beeper wrote [on pgsql-general]:
>> I have a function that creates a temp table, populates it with results
>> during intermediate processing, and reads from it at the end. When the
>> transaction is marked as read only, it does not allow creation of a
>> temp table, even though there are no permanent writes to the db. Are
>> there any workarounds? The following block errors out.
>>
>> SET TRANSACTION ISOLATION LEVEL READ COMMITTED READ ONLY;
>> create temp table test(test int);
>
> When you create a temporary table, PostgreSQL needs to add rows in
> pg_class, pg_attribute, and probably other system catalogs. So there are
> writes, which aren't possible in a read-only transaction. Hence the
> error. And no, there is no workaround.

That sounds like a deficiency to overcome.

It should be possible for those system catalogs to be virtual: defined
like union views over similar immutable tables for the read-only database
plus mutable in-memory ones for the temporary tables.

Are there any plans in the works to do this?

On the other hand, if one can have lexical-scope tables (table-typed
routine variables) -- and I know Pg 8.4+ has named subqueries, which
handle a lot of cases where temp tables would otherwise be used -- I would
certainly expect those to work when you're dealing with a read-only
database.

-- Darren Duncan
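[Editor's note: the named-subquery route mentioned above needs no catalog
writes at all. A minimal sketch, using Python's bundled sqlite3 module so
it is self-contained; the "orders" schema is invented, and the same WITH
syntax works in PostgreSQL 8.4+.]

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 600), (1, 700), (2, 300)])

# WITH names the intermediate result for the duration of one statement;
# no rows are added to any catalog, so there is nothing for a read-only
# transaction to object to.
rows = con.execute("""
    WITH totals AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
    )
    SELECT customer_id FROM totals WHERE total > 1000
""").fetchall()
print(rows)  # [(1,)]
```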
On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:
> > When you create a temporary table, PostgreSQL needs to add rows in
> > pg_class, pg_attribute, and probably other system catalogs. So there
> > are writes, which aren't possible in a read-only transaction. Hence
> > the error. And no, there is no workaround.
>
> That sounds like a deficiency to overcome.
>
> It should be possible for those system catalogs to be virtual, defined
> like union views over similar immutable tables for the read-only
> database plus mutable in-memory ones for the temporary tables.

Ideally, yes: from a logical standpoint there are catalog entries that are
only interesting to one backend.

But that doesn't mean it's easy to do. Remember that catalog lookups (even
though most go through a cache) are a path that is important to
performance. Also, more complex catalog interpretations may introduce some
extra bootstrapping challenges.

> Are there any plans in the works to do this?

I don't think so. It sounds like some fairly major work for a
comparatively minor benefit.

Suggestions welcome, of course, to either make the work look more minor or
the benefits look more major ;)

Regards,
	Jeff Davis
Jeff Davis wrote:
> On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:
>>> When you create a temporary table, PostgreSQL needs to add rows in
>>> pg_class, pg_attribute, and probably other system catalogs. So there
>>> are writes, which aren't possible in a read-only transaction. Hence
>>> the error. And no, there is no workaround.
>> That sounds like a deficiency to overcome.
>>
>> It should be possible for those system catalogs to be virtual, defined
>> like union views over similar immutable tables for the read-only
>> database plus mutable in-memory ones for the temporary tables.
>
> Ideally, yes, from a logical standpoint there are catalog entries that
> are only interesting to one backend.
>
> But that doesn't mean it's easy to do. Remember that catalog lookups
> (even though most go through a cache) are a path that is important to
> performance. Also, more complex catalog interpretations may introduce
> some extra bootstrapping challenges.
>
>> Are there any plans in the works to do this?
>
> I don't think so. It sounds like some fairly major work for a
> comparatively minor benefit.
>
> Suggestions welcome, of course, to either make the work look more minor
> or the benefits look more major ;)

What I said before was a simplification; below I present my real proposal.

I think an even better way to support this would be based on Postgres
having support for directly using multiple databases within the same SQL
session at once, as if namespaces were another level deep: the first level
being the databases, the second level the schemas, and the third level the
schema objects.

Kind of like how the SQL standard defines its catalog/schema/object
namespaces.

This instead of needing to use federating or that contrib module to use
multiple Pg databases of the same cluster at once.

Under this scenario, we make the property of a database being read-only or
read-write for the current SQL session associated with a database rather
than with the whole SQL session. A given transaction can read from any
database but can only make changes to the ones that are not read-only.

Also, the proper way to do temporary tables would be to put them in
another database than the main one, where the whole other database has the
property of being temporary.

Under this scenario, there would be separate system catalogs for each
database, and so the ones for read-only databases are read-only, and the
ones for other databases aren't. Then the system catalog itself
fundamentally isn't more complicated, per database, and anything extra to
handle cross-database queries or whatever, if anything, is a separate
layer. Code that only deals with a single database at once would be an
optimized situation and perform no worse than it does now.

Furthermore, federating databases is done with the same interface, by
adding remote/foreign databases as extra databases at the top-level
namespace.

Fundamentally, a SQL session would be associated with a Pg server, not a
database managed by such. When one starts a SQL session, there are
initially no databases visible to them, and the top-level namespace is
empty. They then "mount" a database, similarly to how one mounts an OS
filesystem, by providing appropriate connection info: either just the
database name, or also user/pass, or also remote host, etc., as
applicable. These details are the difference between using a
local/same-Pg-cluster db or a remote/federated one, and the details also
say whether it is temporary or initially read-only, etc.

See also how SQLite works; this "mount" is analogous to their "attach".

Such a paradigm is also how my Muldis D language interfaces databases;
this is the most flexible, portable, extensible, optimizable, and elegant
approach I can think of.

-- Darren Duncan
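[Editor's note: SQLite's "attach", mentioned above, can be demonstrated
with Python's bundled sqlite3 module; file and table names here are
invented. One session addresses two independent databases through a
two-level <database>.<table> namespace, loosely like the "mount" proposal.]

```python
import sqlite3

# One session, two separate databases, addressed through a two-level
# namespace: <database>.<table>.
main = sqlite3.connect(":memory:")
main.execute("CREATE TABLE t (x INTEGER)")
main.execute("INSERT INTO t VALUES (1)")

# Attach a second, independent database under the name "aux".
main.execute("ATTACH DATABASE ':memory:' AS aux")
main.execute("CREATE TABLE aux.t (x INTEGER)")
main.execute("INSERT INTO aux.t VALUES (2)")

# A cross-database query spanning both namespaces at once.
rows = main.execute(
    "SELECT main.t.x, aux.t.x FROM main.t, aux.t"
).fetchall()
print(rows)  # [(1, 2)]
```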
On Thu, 2011-07-07 at 23:21 -0700, Darren Duncan wrote:
> I think an even better way to support this would be based on Postgres
> having support for directly using multiple databases within the same SQL
> session at once, as if namespaces were another level deep, the first
> level being the databases, the second level the schemas, and the third
> level the schema objects.
>
> Kind of like what the SQL standard defines its catalog/schema/object
> namespaces.
>
> This instead of needing to use federating or that contrib module to use
> multiple Pg databases of the same cluster at once.
>
> Under this scenario, we make the property of a database being read-only
> or read-write for the current SQL session associated with a database
> rather than the whole SQL session. A given transaction can read from any
> database but can only make changes to the ones not read-only.
>
> Also, the proper way to do temporary tables would be to put them in
> another database than the main one, where the whole other database has
> the property of being temporary.
>
> Under this scenario, there would be separate system catalogs for each
> database, and so the ones for read-only databases are read-only, and the
> ones for other databases aren't.
>
> Then the system catalog itself fundamentally isn't more complicated, per
> database, and anything extra to handle cross-database queries or
> whatever, if anything, is a separate layer. Code that only deals with a
> single database at once would be an optimized situation and perform no
> worse than it does now.

One challenge that jumps to mind here is that an Oid would need to become
a pair (catalog, oid). Even if the end result isn't much more complex,
getting there is not trivial.

> See also how SQLite works; this "mount" being analogous to their
> "attach".

I'm not sure SQLite is the best example. It has a radically different
architecture.

Regards,
	Jeff Davis
Jeff Davis wrote:
> On Thu, 2011-07-07 at 23:21 -0700, Darren Duncan wrote:
>> I think an even better way to support this would be based on Postgres
>> having support for directly using multiple databases within the same
>> SQL session at once, as if namespaces were another level deep, the
>> first level being the databases, the second level the schemas, and the
>> third level the schema objects.
<snip>
>> Then the system catalog itself fundamentally isn't more complicated,
>> per database, and anything extra to handle cross-database queries or
>> whatever, if anything, is a separate layer. Code that only deals with a
>> single database at once would be an optimized situation and perform no
>> worse than it does now.
>
> One challenge that jumps to mind here is that an Oid would need to
> become a pair (catalog, oid). Even if the end result isn't much more
> complex, getting there is not trivial.

Yes, but that would just be in-memory or in temporary places external to
every database. On disk, internal to a database, there would just be the
oid. In fact, another aspect of the database model I defined is that each
"database" is entirely self-contained; while you can do cross-database
queries, you don't have cross-database constraints, in the general case.

>> See also how SQLite works; this "mount" being analogous to their
>> "attach".
>
> I'm not sure SQLite is the best example. It has a radically different
> architecture.

Still, it's an example I know of where you can access several clearly
separable databases at once through a common namespace. While one might
argue this is a substitute for multiple-schema support, I don't, because
with multiple schemas you can have integrity constraints that cross
schemas. The namespaces issue is largely orthogonal to self-containment or
integrity in my model.

But look at Oracle too, at least as I understand it. Oracle supports
"CONNECT TO ... AUTHORIZE ..."/etc. SQL, meaning you can define what
databases you are accessing within the SQL session, rather than having to
do it externally. I assume that Oracle's features correspond somewhat to
my proposal, and so enable cross-database queries under the illusion that
several databases are one.

Suffice it to say, I have thought through my proposed model for years; one
of its (and Muldis D's) express purposes is providing a common normalized
paradigm that all the existing SQL DBMSs can map to with consistent
behavior, whether Oracle or SQLite, and I haven't stated all of it here (a
lot more is in my published language spec). Key mapping points are the
boundaries of a database's self-definability. And namespace nesting is
actually arbitrary-depth, so accounting for everything from no native
schema support to schema plus "package" namespace support.

-- Darren Duncan
On Fri, Jul 8, 2011 at 2:21 AM, Darren Duncan <darren@darrenduncan.net> wrote:
> I think an even better way to support this would be based on Postgres
> having support for directly using multiple databases within the same SQL
> session at once, as if namespaces were another level deep, the first
> level being the databases, the second level the schemas, and the third
> level the schema objects.
>
> Kind of like what the SQL standard defines its catalog/schema/object
> namespaces.
>
> This instead of needing to use federating or that contrib module to use
> multiple Pg databases of the same cluster at once.

But if that's what you want, just don't put your data in different
databases in the first place. That's what schemas are for.

If for some reason we needed to have tables that happened to be called
x.y.z and a.b.c accessible from a single SQL session, we could allow that
much more simply by allowing schemas to be nested. Then we could allow
arbitrary numbers of levels, not just three. The whole point of having
databases and schemas as separate objects is that they do different
things: schemas are just containers for names, allowing common access to
data, and databases are completely separate entities, allowing privilege
separation for (say) a multi-tenant hosting environment. We're not going
to throw out the latter concept just so people can use two dots in their
table names instead of one.

> Under this scenario, we make the property of a database being read-only
> or read-write for the current SQL session associated with a database
> rather than the whole SQL session. A given transaction can read from any
> database but can only make changes to the ones not read-only.
>
> Also, the proper way to do temporary tables would be to put them in
> another database than the main one, where the whole other database has
> the property of being temporary.
>
> Under this scenario, there would be separate system catalogs for each
> database, and so the ones for read-only databases are read-only, and the
> ones for other databases aren't.
>
> Then the system catalog itself fundamentally isn't more complicated, per
> database, and anything extra to handle cross-database queries or
> whatever, if anything, is a separate layer. Code that only deals with a
> single database at once would be an optimized situation and perform no
> worse than it does now.

I think you should make more of an effort to understand how the system
works now, and why, before proposing radical redesigns.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:
> But if that's what you want, just don't put your data in different
> databases in the first place. That's what schemas are for.
>
> If for some reason we needed to have tables that happened to be called
> x.y.z and a.b.c accessible from a single SQL session, we could allow
> that much more simply by allowing schemas to be nested. Then we could
> allow arbitrary numbers of levels, not just three. The whole point of
> having databases and schemas as separate objects is that they do
> different things: schemas are just containers for names, allowing
> common access to data, and databases are completely separate entities,
> allowing privilege separation for (say) a multi-tenant hosting
> environment. We're not going to throw out the latter concept just so
> people can use two dots in their table names instead of one.

I agree with what you're saying in general, including that schema
namespaces should be nestable to arbitrary levels. One dot or two isn't an
issue I have.

Dividing based on databases or on schemas is a big and important
distinction. I see the semantic purpose of using multiple databases as
allowing things to be completely independent and self-defined, where one
can understand the meaning of any one database in isolation. So one can
take each of the two databases and walk off with them in opposite
directions, and each can still be used and understood.

Whereas schemas are namespaces for organizing entities within a single
database, where any of those entities may be interdependent, such as
defining a data type in one schema and using it as the declared type of a
routine or table or constraint in another.

But just because you use multiple databases in order for them to be
independent, sometimes one still wants to use them together, and an
abstraction loosely like federating is useful here.

> I think you should make more of an effort to understand how the system
> works now, and why, before proposing radical redesigns.

Well yes, of course. But that will take time, and I think I already
understand enough about it to make some useful contributions in the
meantime. How much or what I already know may not always come across well.
If this bothers people then I can make more of an effort to reduce my
input until I have more solid things to back it up.

-- Darren Duncan
Robert Haas <robertmhaas@gmail.com> writes:
> If for some reason we needed to have tables that happened to be called
> x.y.z and a.b.c accessible from a single SQL session, we could allow
> that much more simply by allowing schemas to be nested. Then we could
> allow arbitrary numbers of levels, not just three.

FWIW, I actually tried to do that back when we first introduced schema
support (the fact that the code calls them namespaces and not schemas is a
leftover from that idea). It turns out to be a whole lot harder than it
sounds, because of the ambiguity you get about which name goes at what
level. A simple example of this: if you write "x.y" in a query, is that
meant to be table x's column y, or is it meant to be field y within a
composite column x of some table in the query? We've resolved that by
requiring you to write "(x).y" when you mean the latter, but it's not
exactly an intuitive or pleasant answer. In the same way, if namespaces
can be nested to different levels, it gets really messy to support
abbreviations of any sort --- but the SQL spec requires us to be able to
do so.

			regards, tom lane
On Fri, 2011-07-08 at 12:34 -0700, Darren Duncan wrote:
> Yes, but that would just be in-memory or in temporary places external to
> every database. On disk internal to a database there would just be the
> oid. In fact, another aspect of the database model I defined is that
> each "database" is entirely self-contained; while you can do
> cross-database queries, you don't have cross-database constraints, in
> the general case.

Yes, you can have a "local oid" and a "fully-qualified oid". It sounds
like it might take some effort (which is an understatement) to go through
the system and figure out which ones should be local and which ones should
be fully-qualified.

Regards,
	Jeff Davis
Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> If for some reason we needed to have tables that happened to be called
>> x.y.z and a.b.c accessible from a single SQL session, we could allow
>> that much more simply by allowing schemas to be nested. Then we could
>> allow arbitrary numbers of levels, not just three.
>
> FWIW, I actually tried to do that back when we first introduced schema
> support (the fact that the code calls them namespaces and not schemas
> is a leftover from that idea). It turns out to be a whole lot harder
> than it sounds, because of the ambiguity you get about which name goes
> at what level. A simple example of this is: if you write "x.y" in a
> query, is that meant to be table x's column y, or is it meant to be
> field y within a composite column x of some table in the query?
> We've resolved that by requiring you to write "(x).y" when you mean
> the latter, but it's not exactly an intuitive or pleasant answer.
> In the same way, if namespaces can be nested to different levels,
> it gets really messy to support abbreviations of any sort --- but
> the SQL spec requires us to be able to do so.

What if you used the context of the calling code and resolved in favor of
whatever match is closest to it? The problem is similar to name resolution
in general-purpose programming languages.

Basically, start looking in the lexical context for an "x" and if you find
one, use that; otherwise, assuming we're talking about referencing code
that lives in the database, such as a function, look at the innermost
schema containing the referencing code and see if it has a direct child
named "x"; otherwise go up one level to a parent schema, and so on until
you get to the top, and finding none by then, say it doesn't exist.

If there are several "x" in this search sequence, only use the first one,
regardless of whether it has a "y", to prevent bugs from too much
complexity. The same goes for just looking for an "x" by itself, in fact,
not just an "x.y".

For the case of calling code that doesn't live in the database, such as a
client-side query, I believe there are session variables like "current
schema" or such, and you can use this as the starting point for the search
for "x": looking first at what that schema directly contains, then its
parent, and so on.

Something like that. Or ignore what I said about starting in a lexical
context and do what you already do there, but keep what I said about the
relative order of schemas to search, only searching direct children of
ancestors of the current code's context schema, starting with the current
context.

You could also come up with some "relative name" syntax such as
filesystems support with their ../ and such, but that's further from
standard SQL.

-- Darren Duncan
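[Editor's note: a toy sketch of the search order proposed above, as a
resolver over nested namespaces. All structure and names are invented for
illustration, and it simplifies the proposal: it falls through on a
full-path miss rather than committing to the first head match.]

```python
# Namespaces as nested dicts; a leaf is any non-dict value.
CATALOG = {
    "x": {"y": {"z": {"foo": "x.y.z.foo"}}},
    "a": {"b": {"foo": "a.b.foo"}},
}

def lookup(ns, path):
    """Follow a dotted path down one namespace; None on a miss."""
    for part in path:
        if not isinstance(ns, dict) or part not in ns:
            return None
        ns = ns[part]
    return ns

def resolve(context, name):
    """Try the context schema first, then each enclosing level outward."""
    path = name.split(".")
    for depth in range(len(context), -1, -1):
        hit = lookup(lookup(CATALOG, context[:depth]), path)
        if hit is not None:
            return hit
    return None

# Code living in schema x.y sees "z.foo" without full qualification:
print(resolve(["x", "y"], "z.foo"))    # x.y.z.foo
# A top-level name still resolves from anywhere:
print(resolve(["x", "y"], "a.b.foo"))  # a.b.foo
```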
On Fri, 2011-07-08 at 21:04 -0700, Darren Duncan wrote:
> > I think you should make more of an effort to understand how the system
> > works now, and why, before proposing radical redesigns.
>
> Well yes, of course. But that will take time and I think I already
> understand enough about it to make some useful contributions in the
> meantime. How much or what I already know may not always come across
> well. If this bothers people then I can make more of an effort to
> reduce my input until I have more solid things to back them up.

I don't think anyone expects you to understand all the internal APIs in
postgres before you make a proposal. But we do expect you to look
critically at your own proposals with the status quo (i.e. existing code,
users, and standards) in mind. And that probably means poking at the code
a little to see if you find stumbling blocks, and asking questions to try
to trace out the shape of the project.

I'm hoping that we can learn a lot from your work on Muldis D. In
particular, the type system might be the most fertile ground -- you've
clearly done some interesting things there, and I think we've felt some
pressure to improve the type system from a number of different projects*.

Regards,
	Jeff Davis

* That being said, PostgreSQL's type system is actually very good.
Consider the sophisticated type infrastructure (or at least plumbing
around the type system) required to make KNN-GiST work, for instance.
On Fri, 2011-07-08 at 23:39 -0700, Darren Duncan wrote:
> What if you used the context of the calling code and resolved in favor
> of whatever match is closest to it? The problem is similar to name
> resolution in general-purpose programming languages.
>
> Basically, start looking in the lexical context for an "x" and if you
> find one, use that; otherwise, assuming we're talking about referencing
> code that lives in the database, such as a function, look at the
> innermost schema containing the referencing code and see if it has a
> direct child named "x"; otherwise go up one level to a parent schema,
> and so on until you get to the top, and finding none by then, say it
> doesn't exist.

This is an example of where data languages and normal programming
languages have a crucial difference.

With a data language, you have this problem:
1. An application uses a query referencing 'y.z.foo' that resolves to the
   internal object with fully-qualified name 'x.y.z.foo'.
2. An administrator creates object 'y.z.foo'.

Now, the application breaks all of a sudden.

In a normal programming language, if the schemas of the two "foo"s are
different, the compiler could probably catch the error. SQL really has no
hope of catching it, though.

PostgreSQL has this problem now in a couple of ways, but it's much easier
to grasp what you might be conflicting with. If you have multiple nested
levels to traverse and different queries using different levels of
qualification, it gets a little more messy and I think a mistake is more
likely.

Regards,
	Jeff Davis
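[Editor's note: Jeff's two-step breakage scenario can be made concrete
with a toy "closest match wins" resolver; all names are invented.]

```python
# Namespaces as nested dicts; resolution tries the code's own schema
# context first, then each enclosing level outward.
catalog = {"x": {"y": {"w": {}, "z": {"foo": "x.y.z.foo"}}}}

def lookup(ns, path):
    for part in path:
        if not isinstance(ns, dict) or part not in ns:
            return None
        ns = ns[part]
    return ns

def resolve(context, dotted):
    path = dotted.split(".")
    for depth in range(len(context), -1, -1):
        hit = lookup(lookup(catalog, context[:depth]), path)
        if hit is not None:
            return hit
    return None

# 1. Code living in schema x.y.w references "z.foo"; nothing matches
#    inside x.y.w itself, so the search climbs to x.y and finds it:
assert resolve(["x", "y", "w"], "z.foo") == "x.y.z.foo"

# 2. An administrator later creates x.y.w.z.foo, which is closer ...
catalog["x"]["y"]["w"]["z"] = {"foo": "x.y.w.z.foo"}

# 3. ... and the very same query now silently resolves differently:
assert resolve(["x", "y", "w"], "z.foo") == "x.y.w.z.foo"
```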
Jeff Davis wrote:
> On Fri, 2011-07-08 at 23:39 -0700, Darren Duncan wrote:
>> What if you used the context of the calling code and resolved in favor
>> of whatever match is closest to it? The problem is similar to name
>> resolution in general-purpose programming languages.
>>
>> Basically, start looking in the lexical context for an "x" and if you
>> find one, use that; otherwise, assuming we're talking about referencing
>> code that lives in the database, such as a function, look at the
>> innermost schema containing the referencing code and see if it has a
>> direct child named "x"; otherwise go up one level to a parent schema,
>> and so on until you get to the top, and finding none by then, say it
>> doesn't exist.
>
> This is an example of where data languages and normal programming
> languages have a crucial difference.
>
> With a data language, you have this problem:
> 1. An application uses a query referencing 'y.z.foo' that resolves to
>    the internal object with fully-qualified name 'x.y.z.foo'.
> 2. An administrator creates object 'y.z.foo'.
>
> Now, the application breaks all of a sudden.
>
> In a normal programming language, if the schemas of the two "foo"s are
> different, the compiler could probably catch the error. SQL really has
> no hope of catching it, though.
>
> PostgreSQL has this problem now in a couple of ways, but it's much
> easier to grasp what you might be conflicting with. If you have
> multiple nested levels to traverse and different queries using
> different levels of qualification, it gets a little more messy and I
> think a mistake is more likely.

Well, my search-path suggestion was based on Tom Lane's comment that "the
SQL spec requires us to be able to [support abbreviations]", and I
expected it would be syntactically and semantically backwards compatible
with how things work now.

FYI, with Muldis D, being more green-fields, there are no search paths in
the general case, and every entity reference is unambiguous because it has
to be fully qualified. However, I also support relative references, and in
fact require their use for references within the same database, which
carries a number of benefits, at the cost of being a few characters more
verbose than when using a search path. So introducing new things with the
same names in different namespaces won't break anything there, even if
they are "closer". It's essentially like navigating a Unix filesystem, but
with "." rather than "/".

So for example, if you had 2 sibling schemas "s1" and "s2", each with 2
functions "f1","f2" and a table "t", then s1.f1 would reference s1.f2 and
s1.t as sch.lib.f2 and sch.data.t respectively, while s1.f1 would refer to
the entities in s2 as sch.par.s2.lib.f1 and sch.par.s2.data.t and such (a
function can also refer to itself anonymously as "rtn" if it's recursive).
The "sch" is like "." in Unix and the "par" is like ".." in Unix. The
"data" is for data tables or views (and "cat" is for catalog
tables/views), while "lib" is for user-defined types, routines,
constraints, etc. (and "sys" is for built-in types and routines, but "sys"
may be omitted, and search paths exist just for built-ins). Synonyms are
also supported.

I don't expect you would adopt relative (fully-qualified) references,
because the syntax isn't in standard SQL (I think), but I did. Unless you
like them and can come up with a syntax that will fit into how SQL does
things.

-- Darren Duncan
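[Editor's note: the Unix-path analogy above maps directly onto filesystem
path resolution; a toy sketch using Python's posixpath, with the s1/s2
example names from the message. Real Muldis D resolution is of course
richer than this.]

```python
import posixpath

# Map the Muldis D step words onto filesystem-style steps:
#   sch -> "." (current schema), par -> ".." (parent schema).
STEP = {"sch": ".", "par": ".."}

def resolve(current_schema, ref):
    """Resolve a dotted relative reference against a schema path."""
    parts = [STEP.get(p, p) for p in ref.split(".")]
    return posixpath.normpath(posixpath.join(current_schema, *parts))

# From inside s1, as in the example above:
print(resolve("s1", "sch.lib.f2"))         # s1/lib/f2
print(resolve("s1", "sch.par.s2.lib.f1"))  # s2/lib/f1
```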
I like Darren's proposal. It is elegant.
> Date: Fri, 8 Jul 2011 18:38:59 +1200
> From: GavinFlower@archidevsys.co.nz
> To: darren@darrenduncan.net
> CC: pgsql@j-davis.com; guillaume@lelarge.info; mbeeper@hotmail.com; pgsql-general@postgresql.org; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
>
> On 08/07/11 18:21, Darren Duncan wrote:
> > Jeff Davis wrote:
> >> On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:
> >>>> When you create a temporary table, PostgreSQL needs to add rows in
> >>>> pg_class, pg_attribute, and probably other system catalogs. So
> >>>> there are
> >>>> writes, which aren't possible in a read-only transaction. Hence the
> >>>> error. And no, there is no workaround.
> >>> That sounds like a deficiency to overcome.
> >>>
> >>> It should be possible for those system catalogs to be virtual,
> >>> defined like union views over similar immutable tables for the
> >>> read-only database plus mutable in-memory ones for the temporary
> >>> tables.
> >>
> >> Ideally, yes, from a logical standpoint there are catalog entries that
> >> are only interesting to one backend.
> >>
> >> But that doesn't mean it's easy to do. Remember that catalog lookups
> >> (even though most go through a cache) are a path that is important to
> >> performance. Also, more complex catalog interpretations may introduce
> >> some extra bootstrapping challenges.
> >>
> >>> Are there any plans in the works to do this?
> >>
> >> I don't think so. It sounds like some fairly major work for a
> >> comparatively minor benefit.
> >>
> >> Suggestions welcome, of course, to either make the work look more minor
> >> or the benefits look more major ;)
> >
> > What I said before was a simplification; below I present my real
> > proposal.
> >
> > I think an even better way to support this would be based on
> > Postgres having support for directly using multiple databases within
> > the same SQL session at once, as if namespaces were another level
> > deep, the first level being the databases, the second level the
> > schemas, and the third level the schema objects.
> >
> > Kind of like what the SQL standard defines its catalog/schema/object
> > namespaces.
> >
> > This instead of needing to use federating or that contrib module to
> > use multiple Pg databases of the same cluster at once.
> >
> > Under this scenario, we make the property of a database being
> > read-only or read-write for the current SQL session associated with a
> > database rather than the whole SQL session. A given transaction can
> > read from any database but can only make changes to the ones not
> > read-only.
> >
> > Also, the proper way to do temporary tables would be to put them in
> > another database than the main one, where the whole other database has
> > the property of being temporary.
> >
> > Under this scenario, there would be separate system catalogs for each
> > database, and so the ones for read-only databases are read-only, and
> > the ones for other databases aren't.
> >
> > Then the system catalog itself fundamentally isn't more complicated,
> > per database, and anything extra to handle cross-database queries or
> > whatever, if anything, is a separate layer. Code that only deals with
> > a single database at once would be an optimized situation and perform
> > no worse than it does now.
> >
> > Furthermore, federating databases is done with the same interface, by
> > adding remote/foreign databases as extra databases at the top level
> > namespace.
> >
> > Fundamentally, a SQL session would be associated with a Pg server, not
> > a database managed by such. When one starts a SQL session, there are
> > initially no databases visible to them, and the top-level namespace is
> > empty.
> >
> > They then "mount" a database, similarly to how one mounts an OS
> > filesystem, by providing appropriate connection info, either just the
> > database name or also user/pass or also remote host etc as is
> > applicable, these details being the difference between using a
> > local/same-Pg-cluster db or a remote/federated one, and the details
> > also say whether it is temporary or initially read-only etc.
> >
> > See also how SQLite works; this "mount" being analogous to their
> > "attach".
> >
> > Such a paradigm is also how my Muldis D language interfaces databases;
> > this is the most flexible, portable, extensible, optimizable, and
> > elegant approach I can think of.
> >
> > -- Darren Duncan
> >
> I would suggest that the default action for psql would be as now,
> associate the session with a database in the name of the current O/S user.
>
> However, use a new psql flag, such as '-unattached' or '-N', to indicate
> that no database is to be attached when psql starts up.
>
> While I don't have a current need for what you propose, it does look
> interesting and potentially useful to me.
>
> > use multiple Pg databases of the same cluster at once.
> >
> > Under this scenario, we make the property of a database being
> > read-only or read-write for the current SQL session associated with a
> > database rather than the whole SQL session. A given transaction can
> > read from any database but can only make changes to the ones not
> > read-only.
> >
> > Also, the proper way to do temporary tables would be to put them in
> > another database than the main one, where the whole other database has
> > the property of being temporary.
> >
> > Under this scenario, there would be separate system catalogs for each
> > database, and so the ones for read-only databases are read-only, and
> > the ones for other databases aren't.
> >
> > Then the system catalog itself fundamentally isn't more complicated,
> > per database, and anything extra to handle cross-database queries or
> > whatever, if anything, is a separate layer. Code that only deals with
> > a single database at once would be an optimized situation and perform
> > no worse than it does now.
> >
> > Furthermore, federating databases is done with the same interface, by
> > adding remote/foreign databases as extra databases at the top level
> > namespace.
> >
> > Fundamentally, a SQL session would be associated with a Pg server, not
> > a database managed by such. When one starts a SQL session, there are
> > initially no databases visible to them, and the top-level namespace is
> > empty.
> >
> > They then "mount" a database, similarly to how one mounts an OS
> > filesystem, by providing appropriate connection info, either just the
> > database name or also user/pass or also remote host etc as is
> > applicable, these details being the difference between using a
> > local/same-Pg-cluster db or a remote/federated one, and the details
> > also say whether it is temporary or initially read-only etc.
> >
> > See also how SQLite works; this "mount" being analogous to their
> > "attach".
> >
> > Such a paradigm is also how my Muldis D language interfaces databases;
> > this is the most flexible, portable, extensible, optimizable, and
> > elegant approach I can think of.
> >
> > -- Darren Duncan
> >
> I would suggest that the default action for psql would be as now,
> associate the session with a database in the name of the current O/S user.
>
> However, use a new psql flag, such as '-unattached' or '-N', to indicate
> that no database is to be attached when psql starts up.
>
> While I don't have a current need for what you propose, it does look
> interesting and potentially useful to me.
>
On Fri, Jul 8, 2011 at 10:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> But if that's what you want, just don't put your data in different
> databases in the first place. That's what schemas are for.
Sadly, DBAs don't always have the ability to put all their data in one database, even if that is what schemas are for.
The ability to do cross-database (most likely cross-server as well) queries would address a lot of real-world problems.
-
Mike Nolan
nolan@tssi.com
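As a sketch for readers (not part of the original mail): the stock
workaround for cross-database queries at the time of this thread is
contrib's dblink, which runs the query over a second connection. The
database, table, and column names below are invented for illustration.

```sql
-- Query another database of the same cluster via contrib's dblink.
-- 'otherdb' and the customers table are illustrative assumptions.
CREATE EXTENSION dblink;  -- 9.1+; older releases load contrib's dblink.sql

SELECT id, name
FROM dblink('dbname=otherdb',
            'SELECT id, name FROM public.customers')
     AS t(id int, name text);  -- caller must declare the result row type
```

The explicit `AS t(...)` column definition list is required because
dblink returns an untyped record set.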
On Mon, Jul 11, 2011 at 12:01 AM, Michael Nolan <htfoot@gmail.com> wrote:
> On Fri, Jul 8, 2011 at 10:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> But if that's what you want, just don't put your data in different
>> databases in the first place. That's what schemas are for.
>
> Sadly, DBAs don't always have the ability to put all their data in one
> database, even if that is what schemas are for.
>
> The ability to do cross-database (most likely cross-server as well)
> queries would address a lot of real-world problems.

Sure, there's quite a lot of use to that sort of thing.

Once 9.1 is out, with the FDW (Foreign Data Wrapper) concept, it'll make
a lot of sense to add FDW methods for various sorts of access methods,
including accessing other PG instances, cross-server. We already have a
fine mechanism for this; no need to create some different mechanism out
of whole cloth.

Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
there's a pretty good reason NOT to support that, namely that this
breaks relational handling of tables. PostgreSQL is a *relational*
database system, hence it's preferable for structures to be relational,
as opposed to hierarchical, which is what any of the suggested nestings
are.

Having to write recursive queries just to look for fully qualified table
names is much more of a bug than it is a feature.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
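For context, the FDW approach Christopher mentions did eventually grow a
PostgreSQL-to-PostgreSQL wrapper, postgres_fdw, in a release after this
discussion (9.3). A minimal sketch, with server, credentials, and table
names all invented for illustration:

```sql
-- Cross-server access via a foreign data wrapper (postgres_fdw).
-- Host, database, user, and table names are hypothetical.
CREATE EXTENSION postgres_fdw;

CREATE SERVER other_pg
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'db2.example.com', dbname 'sales');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER other_pg
    OPTIONS (user 'report', password 'secret');

CREATE FOREIGN TABLE remote_orders (
    id    int,
    total numeric
) SERVER other_pg
  OPTIONS (schema_name 'public', table_name 'orders');

SELECT count(*) FROM remote_orders;  -- executed against the remote server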
Christopher Browne wrote:
> Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
> there's a pretty good reason NOT to support that, namely that this
> breaks relational handling of tables. PostgreSQL is a *relational*
> database system, hence it's preferable for structures to be
> relational, as opposed to hierarchical, which is what any of the
> suggested nestings are.

I won't argue with whether or not nested naming is a good idea, but I
will argue with your other comment about breaking relational handling.

A relational database is a database in which all data is kept in
relation-typed variables, which SQL calls tables, and you can perform
all queries and updates with just relation-valued expressions and
statements.

Organizing the tables into a multi-level namespace, either fixed-depth
or variable-depth, rather than using a flat namespace, does not make the
database any less relational, because the above definition and any
others still hold.

The "less relational" argument above is a red herring or distraction.
One can argue against namespace nesting just fine without saying that.

-- Darren Duncan
Christopher Browne wrote:
> Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
> there's a pretty good reason NOT to support that, namely that this
> breaks relational handling of tables. PostgreSQL is a *relational*
> database system, hence it's preferable for structures to be
> relational, as opposed to hierarchical, which is what any of the
> suggested nestings are.

> Organizing the tables into a multi-level namespace, either fixed-depth
> or variable-depth, rather than using a flat namespace, does not make
> the database any less relational, because the above definition and any
> others still hold.
>
> The "less relational" argument above is a red herring or distraction.
> One can argue against namespace nesting just fine without saying that.
>
> -- Darren Duncan
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I agree with Darren. One thought that came to my mind was to use a
different separator between two namespaces and/or between the database
identifier and the rest of the "path". Examples:

ns1!ns2.table OR database@ns1.table OR database@ns1!ns2.table

I've been following only some of the discussion but it seems that much
ambiguity would be lost by using different separators. Schemas
themselves are already non-standard so it isn't like we are constrained
here in what is chosen.

Just some quick thoughts I've had but haven't fully considered how they
would fit in to the existing setup. But is there any major reason why
choosing different separators would not work?

Also, within search_path, some form of wild-card selector would be
desirable: ns1!*. I'm not opposed to having to be explicit about the
search_path in order to avoid name collisions; though it would be nice
if VIEWS had some kind of "SET" syntax, like functions do, so that the
definer can specify the search_path that the view will resolve against.

David J.
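As an aside, a "!" separator can already be emulated today with quoted
identifiers, since "!" is legal inside a quoted name; the database sees
one flat schema name, but tools could interpret the convention. Schema
and table names below are invented for illustration.

```sql
-- Emulating a two-level namespace with a "!" inside a quoted schema name.
CREATE SCHEMA "app!billing";
CREATE TABLE "app!billing".invoices (id int PRIMARY KEY);

SET search_path = "app!billing";
SELECT * FROM invoices;               -- resolved via search_path
SELECT * FROM "app!billing".invoices; -- or fully qualified
```

The quoting is mandatory everywhere the name appears, which is part of
why a first-class separator is being discussed.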
On Jul 8, 2011, at 08:21, Darren Duncan wrote:
> Also, the proper way to do temporary tables would be to put them in
> another database than the main one, where the whole other database has
> the property of being temporary.

FWIW, Microsoft SQL Server does it that way, and as a result temporary
tables are severely restricted in a number of ways. For example, custom
datatypes defined in a non-temporary database cannot be used in
temporary table definitions, because datatypes may only be used within
the database they're defined in.

You can of course re-define the data type in the temporary database, but
then you obviously have to do so every time you start a new session,
because you start out with an empty tempdb.

best regards,
Florian Pflug
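A T-SQL sketch of the restriction Florian describes (type and table
names invented; behavior as commonly reported for SQL Server, where #temp
tables live in tempdb):

```sql
-- In the user database:
CREATE TYPE dbo.phone FROM varchar(20);

-- Fails: the temp table is created in tempdb, where dbo.phone
-- does not exist.
CREATE TABLE #contacts (p dbo.phone);

-- Workaround: re-create the type inside tempdb first - and, per the
-- mail above, repeat that whenever tempdb starts out empty.
```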
On Jul 11, 2011, at 07:08, Darren Duncan wrote:
> Christopher Browne wrote:
>> Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
>> there's a pretty good reason NOT to support that, namely that this
>> breaks relational handling of tables. PostgreSQL is a *relational*
>> database system, hence it's preferable for structures to be
>> relational, as opposed to hierarchical, which is what any of the
>> suggested nestings are.
>
> A relational database is a database in which all data is kept in
> relation-typed variables, which SQL calls tables, and you can perform
> all queries and updates with just relation-valued expressions and
> statements.
>
> Organizing the tables into a multi-level namespace, either fixed-depth
> or variable-depth, rather than using a flat namespace, does not make
> the database any less relational, because the above definition and
> any others still hold.

The point was not, I think, that tables aren't suddenly relations once
namespaces are nested, but that the data model of the dbms *itself*,
i.e. the data model that defines the relationship between namespaces,
types, columns, ... becomes harder to map to the relational model. For
example, if namespaces can be nested, you'll need to resort to recursive
SQL and/or arrays far more often if you inspect the structure of a
database.

Btw, another argument against nested namespaces is that it actually
doesn't buy you anything in SQL, even if you solve the parsing
ambiguities. In programming languages, namespaces not only prevent name
clashes, they also define the possible scopes in which to resolve
unqualified names. For example, if you do

  void f() { printf("outer"); }
  namespace a {
    void f() { printf("inner"); }
    namespace b {
      void g() { f(); }
    }
  }

in C++, then a::b::g() prints "inner". But in PostgreSQL, the scope in
which to resolve an unqualified function name is entirely determined by
the search_path setting, *not* by the scope of the object containing the
unqualified name.
Nested namespaces thus simply become a matter of syntax - i.e., whether
you can write a.b.c, or need to write "a.b".c.

best regards,
Florian Pflug
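A SQL sketch of the resolution behavior Florian describes: which f() an
unqualified call resolves to depends only on the session's search_path,
not on where the caller is defined. Schema and function names are
invented for illustration.

```sql
-- Two functions with the same unqualified name in different schemas.
CREATE SCHEMA a;
CREATE SCHEMA b;
CREATE FUNCTION a.f() RETURNS text LANGUAGE sql AS $$ SELECT 'a.f' $$;
CREATE FUNCTION b.f() RETURNS text LANGUAGE sql AS $$ SELECT 'b.f' $$;

SET search_path = a, b;
SELECT f();   -- resolves to a.f

SET search_path = b, a;
SELECT f();   -- now resolves to b.f
```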
On Mon, Jul 11, 2011 at 10:12 AM, Florian Pflug <fgp@phlo.org> wrote:
> On Jul11, 2011, at 07:08 , Darren Duncan wrote:
>> Christopher Browne wrote:
>>> Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
>>> there's a pretty good reason NOT to support that, namely that this
>>> breaks relational handling of tables. PostgreSQL is a *relational*
>>> database system, hence it's preferable for structures to be
>>> relational, as opposed to hierarchical, which is what any of the
>>> suggested nestings are.
>>
>> A relational database is a database in which all data is kept in
>> relation-typed variables, which SQL calls tables, and you can perform
>> all queries and updates with just relation-valued expressions and
>> statements.
>>
>> Organizing the tables into a multi-level namespace, either fixed-depth
>> or variable-depth, rather than using a flat namespace, does not make
>> the database any less relational, because the above definition and
>> any others still hold.
>
> The point was not, I think, that tables aren't suddenly relations once
> namespaces are nested, but that the data model of the dbms *itself*,
> i.e. the data model that defines the relationship between namespaces,
> types, columns, ... becomes harder to map to the relational model.

Just so. It's not that it suddenly "becomes no longer relational".

Rather, the argument is that "it was intentional for the structuring of
table naming to, itself, be relational," and changing that definitely
has some undesirable characteristics.

The need for recursive queries is the most obvious "undesirable", but
it's not the only undesirable thing, by any means.

Sure, there's some cool stuff that we can get out of nested namespaces,
but I think we'd pay a pretty big price for it, and it shouldn't be
treated as "obvious" that:
a) It's a good thing to do so,
b) It is desirable to do so,
c) There will be agreement to do so.

To the contrary, there are pretty good reasons to reject the idea.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
On Mon, Jul 11, 2011 at 10:12 AM, Florian Pflug <fgp@phlo.org> wrote:
> On Jul11, 2011, at 07:08 , Darren Duncan wrote:
>> Christopher Browne wrote:
>>> Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
>>> there's a pretty good reason NOT to support that, namely that this
>>> breaks relational handling of tables. PostgreSQL is a *relational*
>>> database system, hence it's preferable for structures to be
>>> relational, as opposed to hierarchical, which is what any of the
>>> suggested nestings are.

>> Rather, the argument is that "it was intentional for the structuring
>> of table naming to, itself, be relational," and changing that
>> definitely has some undesirable characteristics.

>> The need for recursive queries is the most obvious "undesirable", but
>> it's not the only undesirable thing, by any means.

I do not see how recursive queries (really iteration of records) even
enter the picture...

Right now I can emulate a hierarchical schema structure via a naming
scheme - for example "schemabase_sub1_sub2_etc". I am simply looking for
a formal way to do the above AND also tell the system that I want all
schemas under "schemabase" to be in the search path. Heck, I guess just
allowing for simple pattern matching in "search_path" would be useful in
this case regardless of the presence of an actual schema hierarchy.
Using "LIKE" syntax, say: "SET search_path TO schemabase_sub1_%" or
something similar. The only missing ability becomes a way for graphical
tools to represent the schema "hierarchy" using a tree structure with
multiple depths.

I can see how adding "." and ".." and relative paths would confuse the
issue; those are not necessary features of a multi-level schema depth.

The above, combined with a different separator for intra-level
namespace/schema delineation, would allow for an unambiguous way to
define and use a hierarchical schema with seemingly minimal invasion
into the current way of doing things.

You could almost implement it just by requiring a specific character to
act as the separator and then constructing the actual schema using
single-level literals and supporting functions that can convert them
into a hierarchy. In other words, the schema table would still only
contain one field with the full "parent!child" as opposed to
(schema, parent) with (VALUES ('parent', null), ('child', 'parent')).

In other words, if we use "!" as the separator, any schema named
"parent!child" could be stored and referenced as such, but then if you
run a "getChildren(parent)" function it would return "child" along with
any other schemas of the form "parent!%". In this case the "%" sign
could maybe match everything except "!", and the "*" symbol could be
used to match "!" as well.

I could give more examples but I hope the basic idea is obvious. The
main thing is that the namespace hierarchy usage is standardized in such
a way that pgAdmin and other GUI tools can rely on it for display
purposes, and that "search_path" can be constructed in a more compact
format so that every schema and sub-schema is still absolutely
referenced (you can even have the SET command resolve search_path at
execution time and then remain static, just like
"CREATE VIEW ... AS SELECT * FROM table").

David J.
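A sketch (not from the thread) of the "getChildren" supporting function
David describes, treating "!" inside flat schema names as a level
separator. The function name and schema names are invented; "%" escaping
in the caller's pattern is left aside for brevity.

```sql
-- Return the direct children of a "!"-separated schema name, e.g. with
-- schemas "app!billing" and "app!billing!archive" defined,
-- get_children('app') would list only "app!billing".
CREATE FUNCTION get_children(parent text) RETURNS SETOF text
LANGUAGE sql STABLE AS $$
    SELECT nspname::text
    FROM pg_catalog.pg_namespace
    WHERE nspname LIKE parent || '!%'
      -- direct children only: no further "!" after the parent prefix
      AND position('!' IN substr(nspname, length(parent) + 2)) = 0;
$$;
```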
I will put my support for David Johnston's proposal, in principle,
though minor details of syntax could be changed if using "!" conflicts
with something. -- Darren Duncan

David Johnston wrote:
> I do not see how recursive queries (really iteration of records) even
> enter the picture...
>
> Right now I can emulate a hierarchical schema structure via a naming
> scheme - for example "schemabase_sub1_sub2_etc". I am simply looking
> for a formal way to do the above AND also tell the system that I want
> all schemas under "schemabase" to be in the search path. Heck, I guess
> just allowing for simple pattern matching in "search_path" would be
> useful in this case regardless of the presence of an actual schema
> hierarchy. Using "LIKE" syntax, say: "SET search_path TO
> schemabase_sub1_%" or something similar. The only missing ability
> becomes a way for graphical tools to represent the schema "hierarchy"
> using a tree structure with multiple depths.
>
> I can see how adding "." and ".." and relative paths would confuse the
> issue; those are not necessary features of a multi-level schema depth.
>
> The above, combined with a different separator for intra-level
> namespace/schema delineation, would allow for an unambiguous way to
> define and use a hierarchical schema with seemingly minimal invasion
> into the current way of doing things. You could almost implement it
> just by requiring a specific character to act as the separator and
> then constructing the actual schema using single-level literals and
> supporting functions that can convert them into a hierarchy. In other
> words, the schema table would still only contain one field with the
> full "parent!child" as opposed to (schema, parent) with
> (VALUES ('parent', null), ('child', 'parent')).
>
> In other words, if we use "!" as the separator, any schema named
> "parent!child" could be stored and referenced as such, but then if you
> run a "getChildren(parent)" function it would return "child" along
> with any other schemas of the form "parent!%". In this case the "%"
> sign could maybe match everything except "!", and the "*" symbol could
> be used to match "!" as well.
>
> I could give more examples but I hope the basic idea is obvious. The
> main thing is that the namespace hierarchy usage is standardized in
> such a way that pgAdmin and other GUI tools can rely on it for display
> purposes, and that "search_path" can be constructed in a more compact
> format so that every schema and sub-schema is still absolutely
> referenced (you can even have the SET command resolve search_path at
> execution time and then remain static, just like
> "CREATE VIEW ... AS SELECT * FROM table").
>
> David J.
On Mon, Jul 11, 2011 at 12:49 PM, David Johnston <polobo@yahoo.com> wrote:
> I do not see how recursive queries (really iteration of records) even
> enter the picture...

I agree, FWIW. If the feature was that desirable, we could look at
questions of implementation to make recursion either unnecessary or at
least well managed.

> Right now I can emulate a hierarchical schema structure via a naming
> scheme - for example "schemabase_sub1_sub2_etc". I am simply looking
> for a formal way to do the above AND also tell the system that I want
> all schemas under "schemabase" to be in the search path. Heck, I guess
> just allowing for simple pattern matching in "search_path" would be
> useful in this case regardless of the presence of an actual schema
> hierarchy. Using "LIKE" syntax, say: "SET search_path TO
> schemabase_sub1_%" or something similar. The only missing ability
> becomes a way for graphical tools to represent the schema "hierarchy"
> using a tree structure with multiple depths.

Right. Semantically myapp_schemaname_subschemaname is no less
hierarchical than myapp.schemaname.subschemaname. The larger issue is
that of potential ambiguity wrt cross-database references (I don't have
a lot of experience reading the SQL standards, but seeing how different
db's implement cross-db references suggests that the standards
contemplate semantic meaning to depth of the namespace).

> I can see how adding "." and ".." and relative paths would confuse the
> issue; those are not necessary features of a multi-level schema depth.
>
> The above, combined with a different separator for intra-level
> namespace/schema delineation, would allow for an unambiguous way to
> define and use a hierarchical schema with seemingly minimal invasion
> into the current way of doing things. You could almost implement it
> just by requiring a specific character to act as the separator and
> then constructing the actual schema using single-level literals and
> supporting functions that can convert them into a hierarchy. In other
> words, the schema table would still only contain one field with the
> full "parent!child" as opposed to (schema, parent) with
> (VALUES ('parent', null), ('child', 'parent')).
>
> In other words, if we use "!" as the separator, any schema named
> "parent!child" could be stored and referenced as such, but then if you
> run a "getChildren(parent)" function it would return "child" along
> with any other schemas of the form "parent!%". In this case the "%"
> sign could maybe match everything except "!", and the "*" symbol could
> be used to match "!" as well.

Agreed that this would be helpful. I would personally have a lot of use
for this sort of feature, particularly with managing large numbers of
stored procedures. Right now I am using a double underscore, which is
error-prone.

Best Wishes,
Chris Travers
On Jul 11, 2011, at 21:49, David Johnston wrote:
> Right now I can emulate a hierarchical schema structure via a naming
> scheme - for example "schemabase_sub1_sub2_etc". I am simply looking
> for a formal way to do the above AND also tell the system that I want
> all schemas under "schemabase" to be in the search path. Heck, I guess
> just allowing for simple pattern matching in "search_path" would be
> useful in this case regardless of the presence of an actual schema
> hierarchy. Using "LIKE" syntax, say: "SET search_path TO
> schemabase_sub1_%" or something similar.

create function set_searchpath_expand(v_pattern text) returns void as $$
declare
    v_searchpath text;
begin
    select string_agg(quote_ident(nspname), ',')
        into v_searchpath
        from pg_catalog.pg_namespace
        where nspname like v_pattern;
    execute 'set search_path = ' || v_searchpath;
end
$$ language plpgsql;

best regards,
Florian Pflug
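Hypothetical usage of a function like Florian's (schema names invented).
Note that LIKE treats "_" as a single-character wildcard, so literal
underscores in the pattern should be escaped:

```sql
-- Expand all schemas matching the pattern into search_path.
SELECT set_searchpath_expand('schemabase\_sub1\_%');
SHOW search_path;  -- the matching schemas, comma-separated
```

One caveat worth noting: if no schema matches, string_agg returns NULL
and the EXECUTE would fail, so a production version would want to handle
the empty case.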
On 12 Jul 2011, at 3:25, Chris Travers wrote:
>> Right now I can emulate a hierarchical schema structure via a naming
>> scheme - for example "schemabase_sub1_sub2_etc". I am simply looking
>> for a formal way to do the above AND also tell the system that I want
>> all schemas under "schemabase" to be in the search path. Heck, I
>> guess just allowing for simple pattern matching in "search_path"
>> would be useful in this case regardless of the presence of an actual
>> schema hierarchy. Using "LIKE" syntax, say: "SET search_path TO
>> schemabase_sub1_%" or something similar. The only missing ability
>> becomes a way for graphical tools to represent the schema "hierarchy"
>> using a tree structure with multiple depths.
>
> Right. Semantically myapp_schemaname_subschemaname is no less
> hierarchical than myapp.schemaname.subschemaname.

It is. To the database, your first example is a single identifier, while
your second example is a sequential combination of three identifiers.
The second one contains explicit hierarchy, the first one does not.

It's quite possible that the fact that these identifiers have a sequence
is the biggest problem for implementing this in a relational database.
Relational databases work with sets after all, which have no explicit
sequence. With the introduction of recursive queries that's _possible_,
but as claimed earlier (and I tend to agree), for performance reasons it
is undesirable to apply this to system tables.

If we were talking about a _set_ of identifiers instead, without the
requirement of a hierarchy (eg. myapp.schemaname.subschemaname =
subschemaname.myapp.schemaname), implementation would probably be
easier/perform better.

That does have some interesting implications for incompletely specified
sets of namespaces; I'm not sure how desirable they are. What's cool is
that you can specify just a server hostname and a table name and (as
long as there's no ambiguity) that's sufficient.

Not so cool: if you use the above and someone clones the database on
said host, you've suddenly introduced ambiguity. Problems like that are
likely to turn up with incomplete identifier specifications though, just
a bit more likely to happen if you take the meaning of the sequence of
the identifiers out. Just a bit.

> The larger issue is that of potential ambiguity wrt cross-database
> references.

Not necessarily, if the reference to a remote database is unambiguously
recognisable as such, for example by using some kind of URI notation
(eg. dsn://user@remote-database1).

I'm also wondering how to handle this for multi-master replicated
environments, in view of load-balancing. Those remote database
references probably need to reference different databases depending on
which master they're running on?

From a security point-of-view I'd probably require a list of accessible
remote databases per server (so that people cannot just query any
database of their choice). That could also serve the load-balancing
scenario.

Alban Hertroys

--
If you can't see the forest for the trees, cut the trees and you'll see
there is no forest.