Thread: Creating temp tables inside read only transactions

Creating temp tables inside read only transactions

From
mike beeper
Date:
I have a function that creates a temp table, populates it with results during intermediate processing, and reads from it at the end.  When the transaction is marked as read only, it does not allow creation of a temp table, even though there are no permanent writes to the db.  Are there any workarounds? The following block errors out.

SET TRANSACTION ISOLATION LEVEL READ COMMITTED READ ONLY;
create temp table test(test int);
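
For reference, the same thing inside an explicit transaction block, with the error it produces (exact message text may vary by PostgreSQL version):

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED READ ONLY;
    create temp table test(test int);
    -- ERROR:  cannot execute CREATE TABLE in a read-only transaction
    ROLLBACK;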

Re: Creating temp tables inside read only transactions

From
Guillaume Lelarge
Date:
On Thu, 2011-07-07 at 16:01 +0000, mike beeper wrote:
> I have a function that creates a temp table, populate it with results
> during intermediate processing, and reads from it at the end.  When
> the transaction is marked as read only, it does not allow creation of
> temp table, even though there are no permanent writes to the db.  Are
> there any workarounds? The following block errors out.
>
> SET TRANSACTION ISOLATION LEVEL READ COMMITTED READ ONLY;
> create temp table test(test int);
>

When you create a temporary table, PostgreSQL needs to add rows in
pg_class, pg_attribute, and probably other system catalogs. So there are
writes, which aren't possible in a read-only transaction. Hence the
error. And no, there is no workaround.
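
You can see those catalog writes from a normal read-write session, for example
(catalog and column names as in a stock installation):

    CREATE TEMP TABLE test(test int);
    SELECT c.relname, n.nspname
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE c.relname = 'test'
       AND n.nspname LIKE 'pg_temp%';
    -- the temp table shows up as an ordinary row in pg_class,
    -- inside a per-backend pg_temp_N schema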


--
Guillaume
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com


Re: Creating temp tables inside read only transactions

From
Darren Duncan
Date:
Guillaume Lelarge wrote [on pgsql-general]:
> On Thu, 2011-07-07 at 16:01 +0000, mike beeper wrote [on pgsql-general]:
>> I have a function that creates a temp table, populate it with results
>> during intermediate processing, and reads from it at the end.  When
>> the transaction is marked as read only, it does not allow creation of
>> temp table, even though there are no permanent writes to the db.  Are
>> there any workarounds? The following block errors out.
>>
>> SET TRANSACTION ISOLATION LEVEL READ COMMITTED READ ONLY;
>> create temp table test(test int);
>
> When you create a temporary table, PostgreSQL needs to add rows in
> pg_class, pg_attribute, and probably other system catalogs. So there are
> writes, which aren't possible in a read-only transaction. Hence the
> error. And no, there is no workaround.

That sounds like a deficiency to overcome.

It should be possible for those system catalogs to be virtual, defined like
union views over similar immutable tables for the read-only database plus
mutable in-memory ones for the temporary tables.

Are there any plans in the works to do this?

On the other hand, if one can have lexical-scope tables (table-typed routine
variables), and given that Pg 8.4+ has named subqueries (WITH queries) which
handle a lot of the cases where temp tables would otherwise be used, I would
certainly expect both of those to work when you're dealing with a read-only
database.
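
For instance, roughly (hypothetical table and column names; WITH queries need
8.4 or later):

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED READ ONLY;
    WITH intermediate AS (
        SELECT customer_id, sum(amount) AS total
          FROM orders
         GROUP BY customer_id
    )
    SELECT count(*) FROM intermediate WHERE total > 100;
    COMMIT;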

-- Darren Duncan

Re: [HACKERS] Creating temp tables inside read only transactions

From
Jeff Davis
Date:
On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:
> > When you create a temporary table, PostgreSQL needs to add rows in
> > pg_class, pg_attribute, and probably other system catalogs. So there are
> > writes, which aren't possible in a read-only transaction. Hence the
> > error. And no, there is no workaround.
>
> That sounds like a deficiency to overcome.
>
> It should be possible for those system catalogs to be virtual, defined like
> union views over similar immutable tables for the read-only database plus
> mutable in-memory ones for the temporary tables.

Ideally, yes, from a logical standpoint there are catalog entries that
are only interesting to one backend.

But that doesn't mean it's easy to do. Remember that catalog lookups
(even though most go through a cache) are a path that is important to
performance. Also, more complex catalog interpretations may introduce
some extra bootstrapping challenges.

> Are there any plans in the works to do this?

I don't think so. It sounds like some fairly major work for a
comparatively minor benefit.

Suggestions welcome, of course, to either make the work look more minor
or the benefits look more major ;)

Regards,
    Jeff Davis


Re: [HACKERS] Creating temp tables inside read only transactions

From
Darren Duncan
Date:
Jeff Davis wrote:
> On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:
>>> When you create a temporary table, PostgreSQL needs to add rows in
>>> pg_class, pg_attribute, and probably other system catalogs. So there are
>>> writes, which aren't possible in a read-only transaction. Hence the
>>> error. And no, there is no workaround.
>> That sounds like a deficiency to overcome.
>>
>> It should be possible for those system catalogs to be virtual, defined like
>> union views over similar immutable tables for the read-only database plus
>> mutable in-memory ones for the temporary tables.
>
> Ideally, yes, from a logical standpoint there are catalog entries that
> are only interesting to one backend.
>
> But that doesn't mean it's easy to do. Remember that catalog lookups
> (even though most go through a cache) are a path that is important to
> performance. Also, more complex catalog interpretations may introduce
> some extra bootstrapping challenges.
>
>> Are there any plans in the works to do this?
>
> I don't think so. It sounds like some fairly major work for a
> comparatively minor benefit.
>
> Suggestions welcome, of course, to either make the work look more minor
> or the benefits look more major ;)

What I said before was a simplification; below I present my real proposal.

I think an even better way to support this would be based on Postgres having
support for directly using multiple databases within the same SQL session at
once, as if namespaces were another level deep, the first level being the
databases, the second level the schemas, and the third level the schema objects.

Kind of like how the SQL standard defines its catalog/schema/object namespaces.

This would be instead of needing to use federation or that contrib module to use
multiple Pg databases of the same cluster at once.

Under this scenario, we make the property of a database being read-only or
read-write for the current SQL session associated with a database rather than
the whole SQL session.  A given transaction can read from any database but can
only make changes to the ones not read-only.

Also, the proper way to do temporary tables would be to put them in another
database than the main one, where the whole other database has the property of
being temporary.

Under this scenario, there would be separate system catalogs for each database,
and so the ones for read-only databases are read-only, and the ones for other
databases aren't.

Then the system catalog itself fundamentally isn't more complicated, per
database, and anything extra to handle cross-database queries or whatever, if
anything, is a separate layer.  Code that only deals with a single database at
once would be an optimized situation and perform no worse than it does now.

Furthermore, federating databases is done with the same interface, by adding
remote/foreign databases as extra databases at the top level namespace.

Fundamentally, a SQL session would be associated with a Pg server, not a
database managed by such.  When one starts a SQL session, there are initially no
databases visible to them, and the top-level namespace is empty.

They then "mount" a database, similarly to how one mounts an OS filesystem, by
providing appropriate connection info, either just the database name or also
user/pass or also remote host etc as is applicable, these details being the
difference between using a local/same-Pg-cluster db or a remote/federated one,
and the details also say whether it is temporary or initially read-only etc.

See also how SQLite works; this "mount" being analogous to their "attach".

Such a paradigm is also how my Muldis D language interfaces with databases; this is
the most flexible, portable, extensible, optimizable, and elegant approach I can
think of.

-- Darren Duncan

Re: [HACKERS] Creating temp tables inside read only transactions

From
Jeff Davis
Date:
On Thu, 2011-07-07 at 23:21 -0700, Darren Duncan wrote:
> I think an even better way to support this is would be based on Postgres having
> support for directly using multiple databases within the same SQL session at
> once, as if namespaces were another level deep, the first level being the
> databases, the second level the schemas, and the third level the schema objects.
>
> Kind of like what the SQL standard defines its catalog/schema/object namespaces.
>
> This instead of needing to use federating or that contrib module to use multiple
> Pg databases of the same cluster at once.
>
> Under this scenario, we make the property of a database being read-only or
> read-write for the current SQL session associated with a database rather than
> the whole SQL session.  A given transaction can read from any database but can
> only make changes to the ones not read-only.
>
> Also, the proper way to do temporary tables would be to put them in another
> database than the main one, where the whole other database has the property of
> being temporary.
>
> Under this scenario, there would be separate system catalogs for each database,
> and so the ones for read-only databases are read-only, and the ones for other
> databases aren't.
>
> Then the system catalog itself fundamentally isn't more complicated, per
> database, and anything extra to handle cross-database queries or whatever, if
> anything, is a separate layer.  Code that only deals with a single database at
> once would be an optimized situation and perform no worse than it does now.

One challenge that jumps to mind here is that an Oid would need to
become a pair (catalog, oid). Even if the end result isn't much more
complex, getting there is not trivial.

> See also how SQLite works; this "mount" being analogous to their "attach".

I'm not sure SQLite is the best example. It has a radically different
architecture.

Regards,
    Jeff Davis


Re: [HACKERS] Creating temp tables inside read only transactions

From
mike beeper
Date:
I like Darren's proposal.  It is elegant.

> Date: Fri, 8 Jul 2011 18:38:59 +1200
> From: GavinFlower@archidevsys.co.nz
> To: darren@darrenduncan.net
> CC: pgsql@j-davis.com; guillaume@lelarge.info; mbeeper@hotmail.com; pgsql-general@postgresql.org; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] [GENERAL] Creating temp tables inside read only transactions
>
> On 08/07/11 18:21, Darren Duncan wrote:
> > Jeff Davis wrote:
> >> On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:
> >>>> When you create a temporary table, PostgreSQL needs to add rows in
> >>>> pg_class, pg_attribute, and probably other system catalogs. So
> >>>> there are
> >>>> writes, which aren't possible in a read-only transaction. Hence the
> >>>> error. And no, there is no workaround.
> >>> That sounds like a deficiency to overcome.
> >>>
> >>> It should be possible for those system catalogs to be virtual,
> >>> defined like union views over similar immutable tables for the
> >>> read-only database plus mutable in-memory ones for the temporary
> >>> tables.
> >>
> >> Ideally, yes, from a logical standpoint there are catalog entries that
> >> are only interesting to one backend.
> >>
> >> But that doesn't mean it's easy to do. Remember that catalog lookups
> >> (even though most go through a cache) are a path that is important to
> >> performance. Also, more complex catalog interpretations may introduce
> >> some extra bootstrapping challenges.
> >>
> >>> Are there any plans in the works to do this?
> >>
> >> I don't think so. It sounds like some fairly major work for a
> >> comparatively minor benefit.
> >>
> >> Suggestions welcome, of course, to either make the work look more minor
> >> or the benefits look more major ;)
> >
> > What I said before was a simplification; below I present my real
> > proposal.
> >
> > I think an even better way to support this is would be based on
> > Postgres having support for directly using multiple databases within
> > the same SQL session at once, as if namespaces were another level
> > deep, the first level being the databases, the second level the
> > schemas, and the third level the schema objects.
> >
> > Kind of like what the SQL standard defines its catalog/schema/object
> > namespaces.
> >
> > This instead of needing to use federating or that contrib module to
> > use multiple Pg databases of the same cluster at once.
> >
> > Under this scenario, we make the property of a database being
> > read-only or read-write for the current SQL session associated with a
> > database rather than the whole SQL session. A given transaction can
> > read from any database but can only make changes to the ones not
> > read-only.
> >
> > Also, the proper way to do temporary tables would be to put them in
> > another database than the main one, where the whole other database has
> > the property of being temporary.
> >
> > Under this scenario, there would be separate system catalogs for each
> > database, and so the ones for read-only databases are read-only, and
> > the ones for other databases aren't.
> >
> > Then the system catalog itself fundamentally isn't more complicated,
> > per database, and anything extra to handle cross-database queries or
> > whatever, if anything, is a separate layer. Code that only deals with
> > a single database at once would be an optimized situation and perform
> > no worse than it does now.
> >
> > Furthermore, federating databases is done with the same interface, by
> > adding remote/foreign databases as extra databases at the top level
> > namespace.
> >
> > Fundamentally, a SQL session would be associated with a Pg server, not
> > a database managed by such. When one starts a SQL session, there are
> > initially no databases visible to them, and the top-level namespace is
> > empty.
> >
> > They then "mount" a database, similarly to how one mounts an OS
> > filesystem, by providing appropriate connection info, either just the
> > database name or also user/pass or also remote host etc as is
> > applicable, these details being the difference between using a
> > local/same-Pg-cluster db or a remote/federated one, and the details
> > also say whether it is temporary or initially read-only etc.
> >
> > See also how SQLite works; this "mount" being analogous to their
> > "attach".
> >
> > Such a paradigm is also how my Muldis D language interfaces databases;
> > this is the most flexible, portable, extensible, optimizable, and
> > elegant approach I can think of.
> >
> > -- Darren Duncan
> >
> I would suggest that the default action for psql would be as now,
> associate the session with a database in the name of the current O/S user.
>
> However, use a new psql flag, such as '-unattached' or '-N', to indicate
> that no database is to be attached when psql starts up.
>
> While I don't have a current need for what you propose, it does look
> interesting and potentially useful to me.
>

Re: [HACKERS] Creating temp tables inside read only transactions

From
Gavin Flower
Date:
On 08/07/11 18:21, Darren Duncan wrote:
> Jeff Davis wrote:
>> On Thu, 2011-07-07 at 20:56 -0700, Darren Duncan wrote:
>>>> When you create a temporary table, PostgreSQL needs to add rows in
>>>> pg_class, pg_attribute, and probably other system catalogs. So
>>>> there are
>>>> writes, which aren't possible in a read-only transaction. Hence the
>>>> error. And no, there is no workaround.
>>> That sounds like a deficiency to overcome.
>>>
>>> It should be possible for those system catalogs to be virtual,
>>> defined like union views over similar immutable tables for the
>>> read-only database plus mutable in-memory ones for the temporary
>>> tables.
>>
>> Ideally, yes, from a logical standpoint there are catalog entries that
>> are only interesting to one backend.
>>
>> But that doesn't mean it's easy to do. Remember that catalog lookups
>> (even though most go through a cache) are a path that is important to
>> performance. Also, more complex catalog interpretations may introduce
>> some extra bootstrapping challenges.
>>
>>> Are there any plans in the works to do this?
>>
>> I don't think so. It sounds like some fairly major work for a
>> comparatively minor benefit.
>>
>> Suggestions welcome, of course, to either make the work look more minor
>> or the benefits look more major ;)
>
> What I said before was a simplification; below I present my real
> proposal.
>
> I think an even better way to support this is would be based on
> Postgres having support for directly using multiple databases within
> the same SQL session at once, as if namespaces were another level
> deep, the first level being the databases, the second level the
> schemas, and the third level the schema objects.
>
> Kind of like what the SQL standard defines its catalog/schema/object
> namespaces.
>
> This instead of needing to use federating or that contrib module to
> use multiple Pg databases of the same cluster at once.
>
> Under this scenario, we make the property of a database being
> read-only or read-write for the current SQL session associated with a
> database rather than the whole SQL session.  A given transaction can
> read from any database but can only make changes to the ones not
> read-only.
>
> Also, the proper way to do temporary tables would be to put them in
> another database than the main one, where the whole other database has
> the property of being temporary.
>
> Under this scenario, there would be separate system catalogs for each
> database, and so the ones for read-only databases are read-only, and
> the ones for other databases aren't.
>
> Then the system catalog itself fundamentally isn't more complicated,
> per database, and anything extra to handle cross-database queries or
> whatever, if anything, is a separate layer.  Code that only deals with
> a single database at once would be an optimized situation and perform
> no worse than it does now.
>
> Furthermore, federating databases is done with the same interface, by
> adding remote/foreign databases as extra databases at the top level
> namespace.
>
> Fundamentally, a SQL session would be associated with a Pg server, not
> a database managed by such.  When one starts a SQL session, there are
> initially no databases visible to them, and the top-level namespace is
> empty.
>
> They then "mount" a database, similarly to how one mounts an OS
> filesystem, by providing appropriate connection info, either just the
> database name or also user/pass or also remote host etc as is
> applicable, these details being the difference between using a
> local/same-Pg-cluster db or a remote/federated one, and the details
> also say whether it is temporary or initially read-only etc.
>
> See also how SQLite works; this "mount" being analogous to their
> "attach".
>
> Such a paradigm is also how my Muldis D language interfaces databases;
> this is the most flexible, portable, extensible, optimizable, and
> elegant approach I can think of.
>
> -- Darren Duncan
>
I would suggest that the default action for psql would be as now:
associate the session with the database named after the current O/S user.

However, use a new psql flag, such as '-unattached' or '-N', to indicate
that no database is to be attached when psql starts up.

While I don't have a current need for what you propose, it does look
interesting and potentially useful to me.


Re: [HACKERS] Creating temp tables inside read only transactions

From
Darren Duncan
Date:
Jeff Davis wrote:
> On Thu, 2011-07-07 at 23:21 -0700, Darren Duncan wrote:
>> I think an even better way to support this is would be based on Postgres having
>> support for directly using multiple databases within the same SQL session at
>> once, as if namespaces were another level deep, the first level being the
>> databases, the second level the schemas, and the third level the schema objects.
<snip>
>> Then the system catalog itself fundamentally isn't more complicated, per
>> database, and anything extra to handle cross-database queries or whatever, if
>> anything, is a separate layer.  Code that only deals with a single database at
>> once would be an optimized situation and perform no worse than it does now.
>
> One challenge that jumps to mind here is that an Oid would need to
> become a pair (catalog, oid). Even if the end result isn't much more
> complex, getting there is not trivial.

Yes, but that would just be in-memory or in temporary places external to every
database.  On disk internal to a database there would just be the oid.  In fact,
another aspect of the database model I defined is that each "database" is
entirely self-contained; while you can do cross-database queries, you don't have
cross-database constraints, in the general case.

>> See also how SQLite works; this "mount" being analogous to their "attach".
>
> I'm not sure SQLite is the best example. It has a radically different
> architecture.

Still, it's an example I know of where you can access several clearly separable
databases at once through a common namespace.  While one might argue this is a
substitute for multiple schema support, I don't because with multiple schemas
you can have integrity constraints that cross schemas.  The namespaces issue is
largely orthogonal to self-containment or integrity in my model.

But look at Oracle too, at least as I understand it.

Oracle supports "CONNECT TO ... AUTHORIZE ..."/etc SQL, meaning you can define
what databases you are accessing within the SQL session, rather than having to
do it externally.  I assume that Oracle's features correspond somewhat to my
proposal, and so enable cross-database queries in the illusion that several
databases are one.

Suffice it to say, I have thought through my proposed model for years, one of
its (and Muldis D's) express purposes being to provide a common normalized
paradigm that all the existing SQL DBMSs can map to with consistent behavior,
whether Oracle or SQLite, and I haven't stated all of it here (a lot more is in
my published language spec).  Key mapping points are the boundaries of a
database's self-definability.  And namespace nesting is actually
arbitrary-depth, accounting for everything from no native schema support to
schema plus "package" namespace support.

-- Darren Duncan

Re: [HACKERS] Creating temp tables inside read only transactions

From
Robert Haas
Date:
On Fri, Jul 8, 2011 at 2:21 AM, Darren Duncan <darren@darrenduncan.net> wrote:
> I think an even better way to support this is would be based on Postgres
> having support for directly using multiple databases within the same SQL
> session at once, as if namespaces were another level deep, the first level
> being the databases, the second level the schemas, and the third level the
> schema objects.
>
> Kind of like what the SQL standard defines its catalog/schema/object
> namespaces.
>
> This instead of needing to use federating or that contrib module to use
> multiple Pg databases of the same cluster at once.

But if that's what you want, just don't put your data in different
databases in the first place.  That's what schemas are for.

If for some reason we needed to have tables that happened to be called
x.y.z and a.b.c accessible from a single SQL session, we could allow
that much more simply by allowing schemas to be nested.  Then we could
allow arbitrary numbers of levels, not just three.  The whole point of
having databases and schemas as separate objects is that they do
different things: schemas are just containers for names, allowing
common access to data, and databases are completely separate entities,
allowing privilege separation for (say) a multi-tenant hosting
environment.  We're not going to throw out the latter concept just so
people can use two dots in their table names instead of one.

> Under this scenario, we make the property of a database being read-only or
> read-write for the current SQL session associated with a database rather
> than the whole SQL session.  A given transaction can read from any database
> but can only make changes to the ones not read-only.
>
> Also, the proper way to do temporary tables would be to put them in another
> database than the main one, where the whole other database has the property
> of being temporary.
>
> Under this scenario, there would be separate system catalogs for each
> database, and so the ones for read-only databases are read-only, and the
> ones for other databases aren't.
>
> Then the system catalog itself fundamentally isn't more complicated, per
> database, and anything extra to handle cross-database queries or whatever,
> if anything, is a separate layer.  Code that only deals with a single
> database at once would be an optimized situation and perform no worse than
> it does now.

I think you should make more of an effort to understand how the system
works now, and why, before proposing radical redesigns.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Creating temp tables inside read only transactions

From
Darren Duncan
Date:
Robert Haas wrote:
> But if that's what you want, just don't put your data in different
> databases in the first place.  That's what schemas are for.
>
> If for some reason we needed to have tables that happened to be called
> x.y.z and a.b.c accessible from a single SQL session, we could allow
> that much more simply by allowing schemas to be nested.  Then we could
> allow arbitrary numbers of levels, not just three.  The whole point of
> having databases and schemas as separate objects is that they do
> different things: schemas are just containers for names, allowing
> common access to data, and databases are completely separate entities,
> allowing privilege separation for (say) a multi-tenant hosting
> environment.  We're not going to throw out the latter concept just so
> people can use two dots in their table names instead of one.

I agree with what you're saying in general and that schema namespaces should be
nestable to arbitrary levels.  One dot or two isn't an issue I have.

Dividing based on databases or on schemas is a big and important distinction.

I see that the semantic purpose of using multiple databases is to allow things
to be completely independent and self-defined, where one can understand the
meaning of any one database in isolation.  So one could take two such databases
and walk off with them in opposite directions, and each could still be used and
understood.

Whereas, schemas are namespaces for organizing entities within a single database
where any of those entities may be interdependent, such as defining a data type
in one schema and using it as the declared type with a routine or table or
constraint in another.

But even when one uses multiple databases in order for them to be independent,
sometimes one still wants to use them together, and an abstraction loosely like
federation is useful here.

> I think you should make more of an effort to understand how the system
> works now, and why, before proposing radical redesigns.

Well yes, of course.  But that will take time and I think I already understand
enough about it to make some useful contributions in the meantime.  How much or
what I already know may not always come across well.  If this bothers people
then I can make more of an effort to reduce my input until I have more solid
things to back them up.

-- Darren Duncan

Re: [HACKERS] Creating temp tables inside read only transactions

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> If for some reason we needed to have tables that happened to be called
> x.y.z and a.b.c accessible from a single SQL session, we could allow
> that much more simply by allowing schemas to be nested.  Then we could
> allow arbitrary numbers of levels, not just three.

FWIW, I actually tried to do that back when we first introduced schema
support (the fact that the code calls them namespaces and not schemas
is a leftover from that idea).  It turns out to be a whole lot harder
than it sounds, because of the ambiguity you get about which name goes
at what level.  A simple example of this is: if you write "x.y" in a
query, is that meant to be table x's column y, or is it meant to be
field y within a composite column x of some table in the query?
We've resolved that by requiring you to write "(x).y" when you mean
the latter, but it's not exactly an intuitive or pleasant answer.
In the same way, if namespaces can be nested to different levels,
it gets really messy to support abbreviations of any sort --- but
the SQL spec requires us to be able to do so.
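
For concreteness, a small sketch of the existing rule (illustrative names only):

    CREATE TYPE pt AS (x float8, y float8);
    CREATE TABLE t (x pt);
    CREATE TABLE x (y float8);

    SELECT x.y   FROM x;    -- column y of table x
    SELECT (x).y FROM t;    -- field y of t's composite column x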

            regards, tom lane

Re: [HACKERS] Creating temp tables inside read only transactions

From
Jeff Davis
Date:
On Fri, 2011-07-08 at 12:34 -0700, Darren Duncan wrote:
> Yes, but that would just be in-memory or in temporary places external to every
> database.  On disk internal to a database there would just be the oid.  In fact,
> another aspect of the database model I defined is that each "database" is
> entirely self-contained; while you can do cross-database queries, you don't have
> cross-database constraints, in the general case.

Yes, you can have a "local oid" and a "fully-qualified oid". It sounds
like it might take some effort (which is an understatement) to go
through the system and figure out which ones should be local and which
ones should be fully-qualified.

Regards,
    Jeff Davis


Re: [HACKERS] Creating temp tables inside read only transactions

From
Darren Duncan
Date:
Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> If for some reason we needed to have tables that happened to be called
>> x.y.z and a.b.c accessible from a single SQL session, we could allow
>> that much more simply by allowing schemas to be nested.  Then we could
>> allow arbitrary numbers of levels, not just three.
>
> FWIW, I actually tried to do that back when we first introduced schema
> support (the fact that the code calls them namespaces and not schemas
> is a leftover from that idea).  It turns out to be a whole lot harder
> than it sounds, because of the ambiguity you get about which name goes
> at what level.  A simple example of this is: if you write "x.y" in a
> query, is that meant to be table x's column y, or is it meant to be
> field y within a composite column x of some table in the query?
> We've resolved that by requiring you to write "(x).y" when you mean
> the latter, but it's not exactly an intuitive or pleasant answer.
> In the same way, if namespaces can be nested to different levels,
> it gets really messy to support abbreviations of any sort --- but
> the SQL spec requires us to be able to do so.

What if you used the context of the calling code and resolved in favor of
whatever match is closest to it?  The problem is analogous to name resolution
in general-purpose programming languages.

Basically start looking in the lexical context for an "x" and if you find one
use that; otherwise, assuming we're talking about referencing code that lives in
the database such as a function, look at the innermost schema containing the
referencing code and see if it has a direct child named "x"; otherwise go up one
level to a parent schema, and so on until you get to the top, and finding none
by then say it doesn't exist.

If there are several "x" in this search sequence, only use the first one
regardless of whether it has a "y", so to prevent bugs from too much complexity.
  Same for just looking for "x" by itself in fact, not just an "x.y".

For the case of calling code that doesn't live in the database such as a
client-side query, I believe there are session variables like "current schema"
or such, and you can use this as the starting point for the search for "x",
looking first at what that schema directly contains, and then its parent, and so on.

Something like that.

Or ignore what I said about starting in a lexical context and do what you
already do there, but keep what I said about relative order of schemas to
search, only searching direct children of ancestors of the current code's
context schema starting with the current context.

You could also come up with some "relative name" syntax such as filesystems
support with their ../ and such, but that's further from standard SQL.

-- Darren Duncan


Re: [HACKERS] Creating temp tables inside read only transactions

From
Jeff Davis
Date:
On Fri, 2011-07-08 at 21:04 -0700, Darren Duncan wrote:
> > I think you should make more of an effort to understand how the system
> > works now, and why, before proposing radical redesigns.
>
> Well yes, of course.  But that will take time and I think I already understand
> enough about it to make some useful contributions in the meantime.  How much or
> what I already know may not always come across well.  If this bothers people
> then I can make more of an effort to reduce my input until I have more solid
> things to back them up.

I don't think anyone expects you to understand all the internal APIs in
postgres before you make a proposal. But we do expect you to look
critically at your own proposals with the status quo (i.e. existing
code, users, and standards) in mind. And that probably means poking at
the code a little to see if you find stumbling blocks, and asking
questions to try to trace out the shape of the project.

I'm hoping that we can learn a lot from your work on Muldis D. In
particular, the type system might be the most fertile ground -- you've
clearly done some interesting things there, and I think we've felt some
pressure to improve the type system from a number of different
projects*.

Regards,
    Jeff Davis

* That being said, PostgreSQL's type system is actually very good.
Consider the sophisticated type infrastructure (or at least plumbing
around the type system) required to make KNN-GiST work, for instance.


Re: [HACKERS] Creating temp tables inside read only transactions

From
Craig Ringer
Date:
On 9/07/2011 11:27 AM, Robert Haas wrote:
> On Fri, Jul 8, 2011 at 2:21 AM, Darren Duncan<darren@darrenduncan.net>  wrote:
>> I think an even better way to support this is would be based on Postgres
>> having support for directly using multiple databases within the same SQL
>> session at once, as if namespaces were another level deep, the first level
>> being the databases, the second level the schemas, and the third level the
>> schema objects.
>>
>> Kind of like what the SQL standard defines its catalog/schema/object
>> namespaces.
>>
>> This instead of needing to use federating or that contrib module to use
>> multiple Pg databases of the same cluster at once.
>
> But if that's what you want, just don't put your data in different
> databases in the first place.  That's what schemas are for.

I think the part missing from that is that Pg does not currently provide
a mechanism to "connect" directly to a schema within a particular
database. You can log in and set search_path, of course, but it's a wee
bit clumsy and I suspect lots of people just don't get that.
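
Today's closest approximations, as far as I know (role, database, and schema
names here are made up):

    psql "dbname=salesdb options=-csearch_path=reporting"

    -- or persistently, per role and database (9.0+):
    ALTER ROLE appuser IN DATABASE salesdb SET search_path = reporting;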

pg_hba.conf cannot control schema access, either, so access control
based on IP address range or allowing different kinds of auth for
different users cannot be controlled on a schema level.

Being able to "connect" to a "database"."schema" location and have Pg
connect to the database then auto-set the search_path would address many
if not all of the use cases for cross-database queries.

That said, if there's ever a facility to WAL certain databases
separately and/or have different replication for different databases
within the same cluster, I can easily see the need coming up for
"big-unimportant-unreplicated-database" needing to query stuff from
"small-vital-replicated-database". By then, though, SQL-MED should fill
that need quite well enough.

Being able to:

    psql dbname.schemaname

or

    jdbc:postgresql://localhost/dbname.schemaname/

would probably address most of the other use cases, and make it much
easier for people migrating from databases that support cross-DB
queries. Thoughts?

--
Craig Ringer

POST Newspapers
276 Onslow Rd, Shenton Park
Ph: 08 9381 3088     Fax: 08 9388 2258
ABN: 50 008 917 717
http://www.postnewspapers.com.au/

Re: [HACKERS] Creating temp tables inside read only transactions

From
Craig Ringer
Date:
On 9/07/2011 2:39 PM, Darren Duncan wrote:
> What if you used the context of the calling code and resolve in favor of
> whatever match is closest to it?

*BAD* idea IMHO.

It sounds attractive at first, but inevitably leads to problems where a
query used to work until someone creates a table/type/whatever that's
"closer" and suddenly things explode.

The existing search_path feature already suffers from issues like that,
and it's usually better to explicitly fully qualify names when you're
not just writing interactive code. The same potential foot-gun can be
exploited as an excellent and useful feature when you actually *want*
the same name to point to different things in different contexts, but in
general use it's more likely to create situations where change A breaks
unrelated query B.

--
Craig Ringer

POST Newspapers
276 Onslow Rd, Shenton Park
Ph: 08 9381 3088     Fax: 08 9388 2258
ABN: 50 008 917 717
http://www.postnewspapers.com.au/

Re: [HACKERS] Creating temp tables inside read only transactions

From
Jeff Davis
Date:
On Fri, 2011-07-08 at 23:39 -0700, Darren Duncan wrote:
> What if you used the context of the calling code and resolve in favor of
> whatever match is closest to it?  The problem is related to general-purpose
> programming languages.
>
> Basically start looking in the lexical context for an "x" and if you find one
> use that; otherwise, assuming we're talking about referencing code that lives in
> the database such as a function, look at the innermost schema containing the
> referencing code and see if it has a direct child named "x"; otherwise go up one
> level to a parent schema, and so on until you get to the top, and finding none
> by then say it doesn't exist.

This is an example of where data languages and normal programming
languages have a crucial difference.

With a data language, you have this problem:
 1. An application uses a query referencing 'y.z.foo' that resolves to an
internal object with fully-qualified name 'x.y.z.foo'.
 2. An administrator creates object 'y.z.foo'.

Now, the application breaks all of a sudden.

In a normal prgramming language, if the schema of the two "foo"s are
different, the compiler could probably catch the error. SQL really has
no hope of catching it though.

PostgreSQL has this problem now in a couple ways, but it's much easier
to grasp what you might be conflicting with. If you have multiple nested
levels to traverse and different queries using different levels of
qualification, it gets a little more messy and I think a mistake is more
likely.
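
The single-level version of that trap, for concreteness (schema and table
names are made up):

    CREATE TABLE public.widgets (id int);
    SET search_path = myapp, public;
    SELECT * FROM widgets;              -- resolves to public.widgets
    -- later, someone runs:
    CREATE SCHEMA myapp;
    CREATE TABLE myapp.widgets (id int);
    SELECT * FROM widgets;              -- now silently resolves to myapp.widgets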

Regards,
    Jeff Davis


Re: [HACKERS] Creating temp tables inside read only transactions

From
Darren Duncan
Date:
Jeff Davis wrote:
> On Fri, 2011-07-08 at 23:39 -0700, Darren Duncan wrote:
>> What if you used the context of the calling code and resolve in favor of
>> whatever match is closest to it?  The problem is related to general-purpose
>> programming languages.
>>
>> Basically start looking in the lexical context for an "x" and if you find one
>> use that; otherwise, assuming we're talking about referencing code that lives in
>> the database such as a function, look at the innermost schema containing the
>> referencing code and see if it has a direct child named "x"; otherwise go up one
>> level to a parent schema, and so on until you get to the top, and finding none
>> by then say it doesn't exist.
>
> This is an example of where data languages and normal programming
> languages have a crucial difference.
>
> With a data language, you have this problem:
>  1. An application uses a query referencing 'y.z.foo' that resolves to
> internal object with fully-qualified name 'x.y.z'.
>  2. An administrator creates object 'y.z.foo'.
>
> Now, the application breaks all of a sudden.
>
> In a normal prgramming language, if the schema of the two "foo"s are
> different, the compiler could probably catch the error. SQL really has
> no hope of catching it though.
>
> PostgreSQL has this problem now in a couple ways, but it's much easier
> to grasp what you might be conflicting with. If you have multiple nested
> levels to traverse and different queries using different levels of
> qualification, it gets a little more messy and I think a mistake is more
> likely.

Well, my search path suggestion was based on Tom Lane's comment that "the SQL
spec requires us to be able to [support abbreviations]" and I expected it would
be syntactically and semantically backwards compatible with how things work now.

FYI, with Muldis D, being more green fields, there are no search paths in the
general case, and every entity reference is unambiguous because it has to be
fully-qualified.

However, I also support relative references, and in fact require their use for
references within the same database, which carries a number of benefits, at the
cost of being a few characters more verbose than when using a search path.  So
introducing new things with the same names in different namespaces won't break
anything there, even if they are "closer".  Its essentially like navigating a
Unix filesystem but with "." rather than "/".

So for example, if you had 2 sibling schemas "s1" and "s2", each with 2
functions "f1","f2" and a table "t", then s1.f1 would reference s1.f2 and s1.t
as sch.lib.f2 and sch.data.t respectively, while s1.f1 would refer to the
entities in s2 as sch.par.s2.lib.f1 and sch.par.s2.data.t and such (a function
can also refer to itself anonymously as "rtn" if it's recursive).  The "sch" is
like "." in Unix and the "par" is like ".." in Unix.  The "data" is for data
tables or views (and "cat" is for catalog tables/views) while "lib" is for
user-defined types, routines, constraints, etc (and "sys" is for built-in types
and routines, but "sys" may be omitted and search paths exist just for
built-ins).  Synonyms are also supported.

I don't expect you would adopt relative (fully-qualified) references, because
the syntax isn't in standard SQL (I think), but I did.  Unless you like them and
can come up with a syntax that will fit into how SQL does things.

-- Darren Duncan

Re: [HACKERS] Creating temp tables inside read only transactions

From
Michael Nolan
Date:


On Fri, Jul 8, 2011 at 10:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> But if that's what you want, just don't put your data in different
> databases in the first place.  That's what schemas are for.

Sadly, DBAs don't always have the ability to put all their data in one database, even if that is what schemas are for.

The ability to do cross-database (most likely cross-server as well) queries would address a lot of real-world problems.
-
Mike Nolan
nolan@tssi.com

Re: [HACKERS] Creating temp tables inside read only transactions

From
John R Pierce
Date:
On 07/10/11 9:01 PM, Michael Nolan wrote:
> Sadly, DBAs don't always have the ability to put all their data in one
> database, even if that is what schemas are for.
>
> The ability to do cross-database (most likely cross-server as well)
> queries would address a lot of real-world problems.

For which we have dblink ...
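
For instance, roughly (dblink's record-returning form; the connection string
and column list are made up):

    CREATE EXTENSION dblink;   -- 9.1+; earlier releases install the contrib script
    SELECT *
      FROM dblink('dbname=otherdb', 'SELECT id, name FROM customers')
           AS t(id int, name text);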



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


Re: [HACKERS] Creating temp tables inside read only transactions

From
Christopher Browne
Date:
On Mon, Jul 11, 2011 at 12:01 AM, Michael Nolan <htfoot@gmail.com> wrote:
>
>
> On Fri, Jul 8, 2011 at 10:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> But if that's what you want, just don't put your data in different
>> databases in the first place.  That's what schemas are for.
>
> Sadly, DBAs don't always have the ability to put all their data in one
> database, even if that is what schemas are for.
>
> The ability to do cross-database (most likely cross-server as well) queries
> would address a lot of real-world problems.

Sure, there's quite a lot of use to that sort of thing.

Once 9.1 is out, with the FDW (Foreign Data Wrapper) concept, it'll
make a lot of sense to add FDW methods for various sorts of access
methods, including accessing other PG instances, cross-server.

We already have a fine mechanism for this; no need to create some
different mechanism out of whole cloth.
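
As a rough sketch of the SQL/MED shape, using the file_fdw wrapper that ships
in 9.1 contrib (a PG-to-PG wrapper would follow the same pattern; the names
here are invented):

    CREATE EXTENSION file_fdw;
    CREATE SERVER flat_files FOREIGN DATA WRAPPER file_fdw;
    CREATE FOREIGN TABLE ext_prices (item text, price numeric)
        SERVER flat_files
        OPTIONS (filename '/tmp/prices.csv', format 'csv');
    SELECT * FROM ext_prices WHERE price > 10;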

Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
there's a pretty good reason NOT to support that, namely that this
breaks relational handling of tables.  PostgreSQL is a *relational*
database system, hence it's preferable for structures to be
relational, as opposed to hierarchical, which is what any of the
suggested nestings are.

Having to write recursive queries just to look for fully qualified
table names is much more of a bug than it is a feature.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

Re: [HACKERS] Creating temp tables inside read only transactions

From
Darren Duncan
Date:
Christopher Browne wrote:
> Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
> there's a pretty good reason NOT to support that, namely that this
> breaks relational handling of tables.  PostgreSQL is a *relational*
> database system, hence it's preferable for structures to be
> relational, as opposed to hierarchical, which is what any of the
> suggested nestings are.

I won't argue with whether or not nested naming is a good idea, but I will argue
with your other comment about breaking relational handling.

A relational database is a database in which all data is kept in relation-typed
variables, which SQL calls tables, and you can perform all queries and updates
with just relation-valued expressions and statements.

Organizing the tables into a multi-level namespace, either fixed-depth or
variable-depth, rather than using a flat namespace, does not make the database
any less relational, because the above definition and any others still hold.

The "less relational" argument above is a red herring or distraction.  One can
argue against namespace nesting just fine without saying that.

-- Darren Duncan

Re: [HACKERS] Creating temp tables inside read only transactions

From
Florian Pflug
Date:
On Jul8, 2011, at 08:21 , Darren Duncan wrote:
> Also, the proper way to do temporary tables would be to put them in
> another database than the main one, where the whole other database
> has the property of being temporary.

FWIW, Microsoft SQL Server does it that way, and as a result temporary
tables are severely restricted in a number of ways.

For example, custom datatypes defined in a non-temporary database
cannot be used in temporary table definitions, because datatypes may
only be used within the database they're defined in. You can of course
re-define the data type in the temporary database, but then you obviously
have to do so every time you start a new session, because you start out
with an empty tempdb.

best regards,
Florian Pflug


Re: [HACKERS] Creating temp tables inside read only transactions

From
"David Johnston"
Date:
Christopher Browne wrote:
> Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
> there's a pretty good reason NOT to support that, namely that this
> breaks relational handling of tables.  PostgreSQL is a *relational*
> database system, hence it's preferable for structures to be
> relational, as opposed to hierarchical, which is what any of the
> suggested nestings are.

Organizing the tables into a multi-level namespace, either fixed-depth or variable-depth, rather than using a flat
namespace, does not make the database any less relational, because the above definition and any others still hold.

The "less relational" argument above is a red herring or distraction.  One can argue against namespace nesting just
fine without saying that.

-- Darren Duncan

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
I agree with Darren.

One thought that came to my mind was to use a different separator between two namespaces and/or between the database
identifier and the rest of the "path".
Examples:

ns1!ns2.table

OR

database@ns1.table

OR

database@ns1!ns2.table

I've been following only some of the discussion but it seems that much ambiguity would be lost by using different
separators. Schemas themselves are already non-standard so it isn't like we are constrained here in what is chosen. 

Just some quick thoughts I've had but haven't fully considered how they would fit into the existing setup.  But is
there any major reason why choosing different separators would not work?

Also, within search_path, some form of wild-card selector would be desirable:  ns1!*.  I'm not opposed to having to be
explicit about the search_path in order to avoid name collisions; though it would be nice if VIEWS had some kind of
"SET" syntax, like functions do, so that the definer can specify the search_path that the view will resolve against.

David J.





Re: [HACKERS] Creating temp tables inside read only transactions

From
Florian Pflug
Date:
On Jul11, 2011, at 07:08 , Darren Duncan wrote:
> Christopher Browne wrote:
>> Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
>> there's a pretty good reason NOT to support that, namely that this
>> breaks relational handling of tables.  PostgreSQL is a *relational*
>> database system, hence it's preferable for structures to be
>> relational, as opposed to hierarchical, which is what any of the
>> suggested nestings are.
>
> A relational database is a database in which all data is kept in
> relation-typed variables, which SQL calls tables, and you can perform
> all queries and updates with just relation-valued expressions and
> statements.
>
> Organizing the tables into a multi-level namespace, either fixed-depth
> or variable-depth, rather than using a flat namespace, does not make
> the database any less relational, because the above definition and
> any others still hold.

The point was not, I think, that tables aren't suddenly relations once
namespaces are nested, but that the data model of the dbms *itself*,
i.e. the data model that defines the relationships between namespaces,
types, columns, and so on, becomes harder to map to the relational model.

For example, if namespaces can be nested, you'll need to resort to
recursive SQL and/or arrays far more often if you inspect the structure
of a database.

Btw, another argument against nested namespaces is that it actually
doesn't buy you anything in SQL, even if you solve the parsing
ambiguities. In programming languages, namespaces not only prevent
name clashes, they also define the possible scopes in which to resolve
unqualified names. For example, if you do
  void f() { printf("outer"); }

  namespace a {
    void f() { printf("inner"); }

    namespace b {
      void g() {f();}
    }
  }
in C++, then a::b::g() prints "inner". But in PostgreSQL, the scope in
which to resolve unqualified function is entirely determined by the the
search_path setting, *not* by the scope of the object containing the
unqualified name. Nested namespaces thus simply become of matter of
syntax - i.e., whether you can write a.b.c, or need to write "a.b".c.

best regards,
Florian Pflug


Re: [HACKERS] Creating temp tables inside read only transactions

From
Christopher Browne
Date:
On Mon, Jul 11, 2011 at 10:12 AM, Florian Pflug <fgp@phlo.org> wrote:
> On Jul11, 2011, at 07:08 , Darren Duncan wrote:
>> Christopher Browne wrote:
>>> Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
>>> there's a pretty good reason NOT to support that, namely that this
>>> breaks relational handling of tables.  PostgreSQL is a *relational*
>>> database system, hence it's preferable for structures to be
>>> relational, as opposed to hierarchical, which is what any of the
>>> suggested nestings are.
>>
>> A relational database is a database in which all data is kept in
>> relation-typed variables, which SQL calls tables, and you can perform
>> all queries and updates with just relation-valued expressions and
>> statements.
>>
>> Organizing the tables into a multi-level namespace, either fixed-depth
>> or variable-depth, rather than using a flat namespace, does not make
>> the database any less relational, because the above definition and
>> any others still hold.
>
> The point was not, I think, that tables aren't suddenly relations once
> namespaces are nested, but that the data model of the dbms *itself*,
> i.e. the data model that defines the relationship between namespaces,
> types, columns, type, ... becomes harder to map to the relational model.

Just so.

It's not that it suddenly "becomes no longer relational".

Rather, the argument is that "it was intentional for the structuring
of table naming to, itself, be relational," and changing that
definitely has some undesirable characteristics.

The need for recursive queries is the most obvious "undesirable", but
it's not the only undesirable thing, by any means.

Sure, there's some cool stuff that we can get out of nested
namespaces, but I think we'd pay a pretty big price for it, and it
shouldn't be treated as "obvious" that:
a) It's a good thing to do so,
b) It is desirable to do so,
c) There will be agreement to do so.

To the contrary, there are pretty good reasons to reject the idea.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

Re: [HACKERS] Creating temp tables inside read only transactions

From
"David Johnston"
Date:
On Mon, Jul 11, 2011 at 10:12 AM, Florian Pflug <fgp@phlo.org> wrote:
> On Jul11, 2011, at 07:08 , Darren Duncan wrote:
>> Christopher Browne wrote:
>>> Vis-a-vis the attempt to do nested naming, that is "ns1.ns2.table1",
>>> there's a pretty good reason NOT to support that, namely that this
>>> breaks relational handling of tables.  PostgreSQL is a *relational*
>>> database system, hence it's preferable for structures to be
>>> relational, as opposed to hierarchical, which is what any of the
>>> suggested nestings are.

>>Rather, the argument is that "it was intentional for the structuring of
table naming to, itself, be relational," and changing that definitely has
some undesirable characteristics.

>>The need for recursive queries is the most obvious "undesirable", but it's
not the only undesirable thing, by any means.

I do not see how recursive queries (really iteration of records) even enter
the picture...

Right now I can emulate a hierarchical schema structure via a naming scheme
- for example  "schemabase_sub1_sub2_etc".  I am simply looking for a formal
way to do the above AND also tell the system that I want all schemas under
"schemabase" to be in the search path.  Heck, I guess just allowing for
simply pattern matching in "search_path" would be useful in this case
regardless of the presence of an actual schema hierarchy.  Using "LIKE"
syntax say: "SET search_path TO schemabase_sub1_%" or something similar.
The only missing ability becomes a way for graphical tools to represent the
schema "hierarchy" using a tree-structure with multiple depths.

I can see how adding "." and ".." and relative paths would confuse the issue
those are not necessary features of a multi-level schema depth.

The above, combined with a different separator for intra-level
namespace/schema delineation, would allow for an unambiguous way to define
and use a hierarchical schema with seemingly minimal invasion into the
current way of doing things. You could almost implement it just by requiring
a specific character to act as the separator and then construct the actual
schema using single-level literals and supporting functions that can convert
them into a hierarchy.  In other words, the schema table would still only
contain one field with the full "parent!child" as opposed to (schema,
parent) with (VALUES('parent',null),('child','parent')).

In other words, if we use "!" as the separator, any schema named
"parent!child"  could be stored and referenced as such but then if you run a
"getChildren(parent)" function it would return "child" along with any other
schemas of the form "parent!%".  In this case the "%" sign could maybe only
match everything except "!" and the "*" symbol could be used to match "!" as
well.

I could give more examples but I hope the basic idea is obvious.  The main
thing is that the namespace hierarchy usage is standardized in such a way
that pgAdmin and other GUI tools can rely on it for display purposes and
that "search_path" can be constructed in a more compact format so that every
schema and sub-schema is still absolutely referenced (you can even have the
SET command resolve search_path at execution time and then remain static,
just like "CREATE VIEW ... AS SELECT * FROM table").

David J.




Re: [HACKERS] Creating temp tables inside read only transactions

From
Darren Duncan
Date:
I will add my support to David Johnston's proposal, in principle, though minor
details of syntax could be changed if using "!" conflicts with something.

-- Darren Duncan

David Johnston wrote:
> [David Johnston's proposal snipped; it is quoted in full in the message above]


Re: [HACKERS] Creating temp tables inside read only transactions

From
Chris Travers
Date:
On Mon, Jul 11, 2011 at 12:49 PM, David Johnston <polobo@yahoo.com> wrote:

> I do not see how recursive queries (really iteration of records) even enter
> the picture...

I agree, FWIW.  If the feature were that desirable, we could look at
questions of implementation to make recursion either unnecessary or at
least well managed.
>
> Right now I can emulate a hierarchical schema structure via a naming scheme
> - for example  "schemabase_sub1_sub2_etc".  I am simply looking for a formal
> way to do the above AND also tell the system that I want all schemas under
> "schemabase" to be in the search path.  Heck, I guess just allowing for
> simply pattern matching in "search_path" would be useful in this case
> regardless of the presence of an actual schema hierarchy.  Using "LIKE"
> syntax say: "SET search_path TO schemabase_sub1_%" or something similar.
> The only missing ability becomes a way for graphical tools to represent the
> schema "hierarchy" using a tree-structure with multiple depths.

Right.  Semantically, myapp_schemaname_subschemaname is no less
hierarchical than myapp.schemaname.subschemaname.  The larger issue is
that of potential ambiguity with respect to cross-database references (I
don't have a lot of experience reading the SQL standards, but seeing how
different databases implement cross-database references suggests that the
standards attach semantic meaning to the depth of the namespace).

>
> I can see how adding "." and ".." and relative paths would confuse the issue
> those are not necessary features of a multi-level schema depth.
>
> The above, combined with a different separator for intra-level
> namespace/schema delineation, would allow for an unambiguous way to define
> and use a hierarchical schema with seemingly minimal invasion into the
> current way of doing things. You could almost implement it just by requiring
> a specific character to act as the separator and then construct the actual
> schema using single-level literals and supporting functions that can convert
> them into an hierarchy.  In other words, the schema table would still only
> contain one field with the full "parent!child" as opposed to (schema,
> parent) with (VALUES('parent',null),('child','parent')).
>
> In other words, if we use "!" as the separator, any schema named
> "parent!child"  could be stored and referenced as such but then if you run a
> "getChildren(parent)" function it would return "child" along with any other
> schemas of the form "parent!%".  In this case the "%" sign could maybe only
> match everything except "!" and the "*" symbol could be used to match "!" as
> well.

Agreed that this would be helpful.  I would personally have a lot of
use for this sort of feature, particularly for managing large numbers
of stored procedures.  Right now I am using a double underscore, which
is error-prone.
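
For illustration, two hypothetical function names using that double-underscore
convention; nothing enforces it, so dropping a single underscore silently puts
an object in a different pseudo-namespace:

create function payroll__tax__rate() returns numeric
  as $$ select 0.21 $$ language sql;

create function payroll__tax_rate() returns numeric   -- typo: one underscore short
  as $$ select 0.21 $$ language sql;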

Best Wishes,
Chris Travers

Re: [HACKERS] Creating temp tables inside read only transactions

From
Florian Pflug
Date:
On Jul11, 2011, at 21:49 , David Johnston wrote:
> Right now I can emulate a hierarchical schema structure via a naming scheme
> - for example  "schemabase_sub1_sub2_etc".  I am simply looking for a formal
> way to do the above AND also tell the system that I want all schemas under
> "schemabase" to be in the search path.  Heck, I guess just allowing for
> simply pattern matching in "search_path" would be useful in this case
> regardless of the presence of an actual schema hierarchy.  Using "LIKE"
> syntax say: "SET search_path TO schemabase_sub1_%" or something similar.

create function set_searchpath_expand(v_pattern text) returns void as $$
declare
  v_searchpath text;
begin
  -- Build a comma-separated list of all schema names matching the pattern.
  select string_agg(quote_ident(nspname), ',' order by nspname) into v_searchpath
    from pg_catalog.pg_namespace where nspname like v_pattern;
  -- Skip the EXECUTE when nothing matches; a NULL query string would error out.
  if v_searchpath is not null then
    execute 'set search_path = ' || v_searchpath;
  end if;
end
$$ language plpgsql;
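
A possible invocation, borrowing the underscore-based names from earlier in
the thread (note that "_" is itself a LIKE wildcard here, so the pattern
matches slightly more than the literal prefix):

select set_searchpath_expand('schemabase_sub1_%');
show search_path;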

best regards,
Florian Pflug


Re: [HACKERS] Creating temp tables inside read only transactions

From
Alban Hertroys
Date:
On 12 Jul 2011, at 3:25, Chris Travers wrote:

>> Right now I can emulate a hierarchical schema structure via a naming scheme
>> - for example  "schemabase_sub1_sub2_etc".  I am simply looking for a formal
>> way to do the above AND also tell the system that I want all schemas under
>> "schemabase" to be in the search path.  Heck, I guess just allowing for
>> simply pattern matching in "search_path" would be useful in this case
>> regardless of the presence of an actual schema hierarchy.  Using "LIKE"
>> syntax say: "SET search_path TO schemabase_sub1_%" or something similar.
>> The only missing ability becomes a way for graphical tools to represent the
>> schema "hierarchy" using a tree-structure with multiple depths.
>
> Right.  Semantically myapp_schemaname_subschemaname is no less
> hierarchical than myapp.schemaname.subschemaname.

It is. To the database, your first example is a single identifier, while your
second example is a sequential combination of three identifiers. The second
one contains explicit hierarchy, the first one does not.

It's quite possible that the fact that these identifiers have a sequence is
the biggest problem for implementing this in a relational database. Relational
databases work with sets after all, which have no explicit sequence. With the
introduction of recursive queries that's _possible_, but as claimed earlier
(and I tend to agree), for performance reasons it is undesirable to apply this
to system tables.

If we were talking about a _set_ of identifiers instead, without the
requirement of a hierarchy (e.g. myapp.schemaname.subschemaname =
subschemaname.myapp.schemaname), implementation would probably be easier and
perform better.

That does have some interesting implications for incompletely specified sets
of namespaces; I'm not sure how desirable they are.
What's cool is that you can specify just a server hostname and a table name
and (as long as there's no ambiguity) that's sufficient.
Not so cool: if you use the above and someone clones the database on said
host, you've suddenly introduced ambiguity.

Problems like that are likely to turn up with incomplete identifier
specifications anyway; they're just a bit more likely to happen if you take
the meaning of the sequence of the identifiers out. Just a bit.

> The larger issue is that of potential ambiguity wrt cross-database references.


Not necessarily, if the reference to a remote database is unambiguously
recognisable as such, for example by using some kind of URI notation
(e.g. dsn://user@remote-database1).
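
For comparison, a remote reference can already be spelled out unambiguously
today with the contrib dblink module; the connection parameters, table, and
column names below are made up for the sketch:

-- Requires the dblink contrib module in the local database.
select *
  from dblink('host=remote-database1 dbname=sales user=reporting',
              'select id, total from public.orders')
       as remote_orders(id int, total numeric);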

I'm also wondering how to handle this for multi-master replicated
environments, in view of load balancing. Those remote database references
probably need to reference different databases depending on which master
they're running on?

From a security point of view I'd probably require a list of accessible
remote databases per server (so that people cannot just query any database of
their choice). That could also serve the load-balancing scenario.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.

