Thread: Changing the concept of a DATABASE

Changing the concept of a DATABASE

From
Simon Riggs
Date:
On 21 May 2012 20:40, Stephen Frost <sfrost@snowman.net> wrote:

>> This is important. I like the idea of breaking down the barriers
>> between databases to allow it to be an option for one backend to
>> access tables in multiple databases. The current mechanism doesn't
>> actually prevent looking at data from other databases using internal
>> APIs, so full security doesn't exist. It's a very common user
>> requirement to wish to join tables stored in different databases,
>> which ought to be possible more cleanly with correct privileges.
>
> That's really a whole different ball of wax and I don't believe what
> Robert was proposing would actually allow that to happen due to the
> other database-level things which are needed to keep everything
> consistent...  That's my understanding, anyway.  I'd be happy as anyone
> if we could actually make it work, but isn't like the SysCache stuff per
> database?  Also, cross-database queries would actually make it more
> difficult to have per-database roles, which is one thing that I was
> hoping we might be able to work into this, though perhaps we could have
> a shared roles table and a per-database roles table and only 'global'
> roles would be able to issue cross-database queries..

So collecting a few requirements from various places:

* Ability to have a Role that can only access one Database

* Allow user info to be dumped with a database, to make a db
completely self-consistent

* Allow databases to be transportable

* Allow users to access tables in >1 database easily, with appropriate rights.


I don't see any reasons why these things would be against each other.

The main objectives are to make a Database a more easily used
administrative grouping. At present, people who use multiple Databases
face many problems - they aren't as separate as you'd like, but
neither can they be ignored when required.

The idea of "one main database per session" is fine, but wiring it so
closely into the backend has a few disadvantages, many of them weird
internal things.

Are there arguments against those requirements before we spend time on
design/thinking?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Changing the concept of a DATABASE

From
José Luis Tallón
Date:
On 22/05/12 11:46, Simon Riggs wrote:
> On 21 May 2012 20:40, Stephen Frost<sfrost@snowman.net>  wrote:
>>> This is important. I like the idea of breaking down the barriers
>>> between databases to allow it to be an option for one backend to
>>> access tables in multiple databases. The current mechanism doesn't
>>> actually prevent looking at data from other databases using internal
>>> APIs, so full security doesn't exist. It's a very common user
>>> requirement to wish to join tables stored in different databases,
>>> which ought to be possible more cleanly with correct privileges.
>> That's really a whole different ball of wax and I don't believe what
>> Robert was proposing would actually allow that to happen due to the
>> other database-level things which are needed to keep everything
>> consistent...  That's my understanding, anyway.  I'd be happy as anyone
>> if we could actually make it work, but isn't like the SysCache stuff per
>> database?  Also, cross-database queries would actually make it more
>> difficult to have per-database roles, which is one thing that I was
>> hoping we might be able to work into this, though perhaps we could have
>> a shared roles table and a per-database roles table and only 'global'
>> roles would be able to issue cross-database queries..

IMVHO:  s/database/schema/g does resolve many of the problems that you
were referring to... and 'dblink' should solve the rest, right?
Please, feel free to point out what I am (most probably) not considering
-- not experienced enough yet :)

On the other hand, the separation of databases allows what otherwise
would only be possible by using multiple instances of the database
server (à la Oracle, AFAIK ) -- save for resource management, but that
is another question whatsoever.

> So collecting a few requirements from various places:
>
> * Ability to have a Role that can only access one Database

Yes, please

> * Allow user info to be dumped with a database, to make a db
> completely self-consistent
+1

> * Allow databases to be transportable
+1. Ideally, the binary format could be make platform-independent, so
that a snapshot/rsync of the cluster can span architectures easily.
AFAIK, endianness-change is relatively cheap on current processors [1
ASM instruction?] and it's not like we are memory-mapping tuples anyway
(TOASTed values can certainly not be mapped), so it shouldn't be
noticeable performance-wise.

> * Allow users to access tables in>1 database easily, with appropriate rights.

See above, but I am probably wrong ...


> I don't see any reasons why these things would be against each other.

Look quite orthogonal to me.

> The main objectives are to make a Database a more easily used
> administrative grouping. At present, people who use multiple Databases
> face many problems - they aren't as separate as you'd like, but
> neither can they be ignored when required.
>
> The idea of "one main database per session" is fine, but wiring it so
> closely into the backend has a few disadvantages, many of them weird
> internal things.
>
> Are there arguments against those requirements before we spend time on
> design/thinking?

OTOH, the postmaster/cluster - session/database coupling looks to me
clean, simple... and seems to make the code simpler. This is can only be
good (but again, I don't know enough yet to be sure)

Regards,
    Jose Luis Tallon





Re: Changing the concept of a DATABASE

From
Simon Riggs
Date:
On 22 May 2012 12:05, José Luis Tallón <jltallon@nosys.es> wrote:

> IMVHO:  s/database/schema/g does resolve many of the problems that you were
> referring to... and 'dblink' should solve the rest, right?
> Please, feel free to point out what I am (most probably) not considering --
> not experienced enough yet :)

The choice of schema/database is an important one. If you get it
wrong, you are in major difficulty. In many cases schemas would be a
better choice, but not in all cases. So I'm interested in solving the
problems for people who have multiple databases on same server.

dblink is the only solution, but its very poor way to do this when we
have 2 databases on same server.

My thinking is that reaching out to multiple databases is actually
mostly easy, except in a few places where dbid is hardwired into the
backend.

> On the other hand, the separation of databases allows what otherwise would
> only be possible by using multiple instances of the database server (à la
> Oracle, AFAIK ) -- save for resource management, but that is another
> question whatsoever.

Separation of databases is fine. I have no intention to change that,
as long as the user wishes that.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Changing the concept of a DATABASE

From
Florian Pflug
Date:
On May22, 2012, at 11:46 , Simon Riggs wrote:
> * Ability to have a Role that can only access one Database
>
> * Allow user info to be dumped with a database, to make a db
> completely self-consistent

These two could be achieved by having database-local roles I think.

> * Allow databases to be transportable

That would be very useful, but extremely hard to do unless we switch to
per-database XID spaces. Or unless we're content with having databases
only be transportable after some special "MAKE TRANSPORTABLE" operation,
which would freeze all tuples and make the database read-only.

> * Allow users to access tables in >1 database easily, with appropriate rights.

That one I'm very sceptical about. In the long run, I think we want better
separation of databases, not less, and this requirement carries a huge risk
of standing in the way of that. Also, I think that once we integrate the postgres
FDW into core (that's the plan in the long run, right?), we're going to get
a good approximation of that essentially for free.

best regards,
Florian Pflug



Re: Changing the concept of a DATABASE

From
Simon Riggs
Date:
On 22 May 2012 12:35, Florian Pflug <fgp@phlo.org> wrote:

>> * Allow users to access tables in >1 database easily, with appropriate rights.
>
> That one I'm very sceptical about. In the long run, I think we want better
> separation of databases, not less, and this requirement carries a huge risk
> of standing in the way of that. Also, I think that once we integrate the postgres
> FDW into core (that's the plan in the long run, right?), we're going to get
> a good approximation of that essentially for free.

It's a poor approximation of it, free or not.

If it actually worked well, I'd be happy. It doesn't. No proper
transaction support, no session pool, poor planning etc etc. At best
its a band-aid, not something sufficiently good to recommend for
general production use.

FDWs are very good but aren't good enough for intra-database usage.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Changing the concept of a DATABASE

From
Robert Haas
Date:
On Tue, May 22, 2012 at 7:35 AM, Florian Pflug <fgp@phlo.org> wrote:
>> * Allow users to access tables in >1 database easily, with appropriate rights.
>
> That one I'm very sceptical about. In the long run, I think we want better
> separation of databases, not less, and this requirement carries a huge risk
> of standing in the way of that. Also, I think that once we integrate the postgres
> FDW into core (that's the plan in the long run, right?), we're going to get
> a good approximation of that essentially for free.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Changing the concept of a DATABASE

From
José Luis Tallón
Date:
On 22/05/12 13:24, Simon Riggs wrote:
> On 22 May 2012 12:05, José Luis Tallón<jltallon@nosys.es>  wrote:
>
>> IMVHO:  s/database/schema/g does resolve many of the problems that you were
>> referring to... and 'dblink' should solve the rest, right?
>> Please, feel free to point out what I am (most probably) not considering --
>> not experienced enough yet :)
> The choice of schema/database is an important one. If you get it
> wrong, you are in major difficulty. In many cases schemas would be a
> better choice, but not in all cases. So I'm interested in solving the
> problems for people who have multiple databases on same server.

Ok. Understood.
Thank you for the clarification

> dblink is the only solution, but its very poor way to do this when we
> have 2 databases on same server.
>
> My thinking is that reaching out to multiple databases is actually
> mostly easy, except in a few places where dbid is hardwired into the
> backend.

The only drawback I see is that it might weaken the separation.

Even though arguably a kludge, dblink could have a "shortcut" added,
whereby connections to another database within the same cluster would be
serviced directly within the backend, as opossed to opening a new db
connection. This is effectively a fastpath within dblink, which
optimizes a relatively common case while at the same time not loosing
generality.

>> On the other hand, the separation of databases allows what otherwise would
>> only be possible by using multiple instances of the database server (à la
>> Oracle, AFAIK ) -- save for resource management, but that is another
>> question whatsoever.
> Separation of databases is fine. I have no intention to change that,
> as long as the user wishes that.

Perfect.

Thanks,
    Jose Luis Tallon



Re: Changing the concept of a DATABASE

From
José Luis Tallón
Date:
On 22/05/12 13:47, Simon Riggs wrote:
> On 22 May 2012 12:35, Florian Pflug<fgp@phlo.org>  wrote:
>>> * Allow users to access tables in>1 database easily, with appropriate rights.
>> That one I'm very sceptical about. In the long run, I think we want better
>> separation of databases, not less, and this requirement carries a huge risk
>> of standing in the way of that. Also, I think that once we integrate the postgres
>> FDW into core (that's the plan in the long run, right?), we're going to get
>> a good approximation of that essentially for free.
> It's a poor approximation of it, free or not.
>
> If it actually worked well, I'd be happy. It doesn't. No proper
> transaction support, no session pool, poor planning etc etc. At best
> its a band-aid, not something sufficiently good to recommend for
> general production use.
- Transaction support: should be fixed, I guess.

- Session pool: is this really needed? I would it externally -- for 
example, by specifying a connection string to a pgPool/pgBouncer as 
opposed to directly to the db server.

- Planning: add a tunable specifying a higher cost (with a exception for 
cluster-local dblinks, which would have a lower cost), and the rest 
should be straightforward. Of course, planning would'nt be as accurate 
---we can't access the other db statistics in order to potentially 
rewrite conditions---, but I don't think that would be a proper approach 
(separation concerns, again)

> FDWs are very good but aren't good enough for intra-database usage.

The optimization I just proposed (plus the required fixes to FDW) might 
very well solve this, while providing useful enhancements for all users, 
whether they are accessing cluster-local databases or not (or even truly 
foreign datasources)


Regards,
    Jose Luis Tallon



Re: Changing the concept of a DATABASE

From
Florian Pflug
Date:
On May22, 2012, at 13:47 , Simon Riggs wrote:
> On 22 May 2012 12:35, Florian Pflug <fgp@phlo.org> wrote:
>>> * Allow users to access tables in >1 database easily, with appropriate rights.
>>
>> That one I'm very sceptical about. In the long run, I think we want better
>> separation of databases, not less, and this requirement carries a huge risk
>> of standing in the way of that. Also, I think that once we integrate the postgres
>> FDW into core (that's the plan in the long run, right?), we're going to get
>> a good approximation of that essentially for free.
>
> It's a poor approximation of it, free or not.
>
> If it actually worked well, I'd be happy. It doesn't. No proper
> transaction support, no session pool, poor planning etc etc. At best
> its a band-aid, not something sufficiently good to recommend for
> general production use.

These all sound fixable, though.

> FDWs are very good but aren't good enough for intra-database usage.

OTOH, cross-database queries would by design be limited to databases
within one cluster, whereas a FDW-based solution would not. I don't really see
the different between telling people "put all your tables into one database
if you need to access them from within one session" and "put all your databases
on one server if you need to access them from within one session".

Plus, the more tightly different databases in the same cluster are coupled,
the more people will setup one cluster per database for performance reasons.
Then, when they discovered they need inter-database queries after all, we'll
again have to tell them "well, then don't set things up the way you have".

If we want to make it easier for people to migrate from multiple databases
to a single database with multiple schemas, maybe we should look into allowing
nested schemas? AFAIK, the main reason not to do that are ambiguities in the
meaning of identifiers, which cross-database queries would have to deal with
also.

best regards,
Florian Pflug



Re: Changing the concept of a DATABASE

From
Andrew Dunstan
Date:

On 05/22/2012 07:56 AM, Robert Haas wrote:
> On Tue, May 22, 2012 at 7:35 AM, Florian Pflug<fgp@phlo.org>  wrote:
>>> * Allow users to access tables in>1 database easily, with appropriate rights.
>> That one I'm very sceptical about. In the long run, I think we want better
>> separation of databases, not less, and this requirement carries a huge risk
>> of standing in the way of that. Also, I think that once we integrate the postgres
>> FDW into core (that's the plan in the long run, right?), we're going to get
>> a good approximation of that essentially for free.
> +1.



That seems to be leaving aside the fact that we don't currently have any 
notion of how to allow FDWs to write the foreign tables.

What is more, isn't the postgres FDW about talking to any postgres 
source? If so, does it have special smarts for when we are talking to 
ourselves? And if it doesn't then it seems unlikely to be an acceptable 
substitute for allowing talking direct to a sibling database.

I'm not at all yet sold on Simon's plan, but I'm skeptical that an FDW 
would adequately meet the case if we wanted to go that way.

cheers

andrew


Per-Database Roles

From
Stephen Frost
Date:
* Simon Riggs (simon@2ndQuadrant.com) wrote:
> * Ability to have a Role that can only access one Database

Alright, I'd like to think about this one specifically and solicit
feedback on the idea that we keep the existing shared role tables but
add on additional tables for per-database roles.

In the past, I feel like we've been focused on the idea of moving all
roles to be per-database instead of per-cluster, which certainly has a
lot of problems associated with it, but in the end, I think people would
be really happy with some shared roles and some per-DB roles.

What would the semantics of that look like though?  Which is "preferred"
when you do a 'grant select' or 'grant role'?  Or do we just disallow
overlaps between per-DB roles and global roles?  If we don't allow
duplicates, I suspect a lot of the other questions suddenly become a lot
easier to deal with, but would that be too much of a restriction?  How
would you handle migrating an existing global role to a per-database
role?
Thanks,
    Stephen

Re: Changing the concept of a DATABASE

From
Robert Haas
Date:
On Tue, May 22, 2012 at 8:40 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> That seems to be leaving aside the fact that we don't currently have any
> notion of how to allow FDWs to write the foreign tables.
>
> What is more, isn't the postgres FDW about talking to any postgres source?
> If so, does it have special smarts for when we are talking to ourselves? And
> if it doesn't then it seems unlikely to be an acceptable substitute for
> allowing talking direct to a sibling database.
>
> I'm not at all yet sold on Simon's plan, but I'm skeptical that an FDW would
> adequately meet the case if we wanted to go that way.

Well, I don't think anyone is claiming that FDWs as they exist today
solve all of the problems in this area.  But I think that improving
FDWs is a more promising line of attack than trying to make backends
talk to multiple databases.  Doing the latter will require massive
surgery on the relcache, the catcache, most of the secondary catalog
caches, the ProcArray, and every portion of the backend that thinks an
OID uniquely identifies an SQL object.  Basically, they'd all need
database OID as an additional key field, which is undesirable for
performance reasons even if there were no issue of code churn.

So I'd rather see us put the effort into pgsql_fdw, which, as Florian
says, will also let us talk to a completely separate server.  If
you've got multiple databases in the same cluster and really need to
be doing queries across all of them, that's what schemas are
supposedly for.  Now I know that people feel that doesn't work as well
as it needs to, but there again I think it would be easier to fix
schemas than to make cross-database queries work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Changing the concept of a DATABASE

From
"Albe Laurenz"
Date:
Simon Riggs wrote:
> On 21 May 2012 20:40, Stephen Frost <sfrost@snowman.net> wrote:
>
>>> This is important. I like the idea of breaking down the barriers
>>> between databases to allow it to be an option for one backend to
>>> access tables in multiple databases.

> So collecting a few requirements from various places:
[...]
> * Allow users to access tables in >1 database easily, with appropriate
rights.

> The main objectives are to make a Database a more easily used
> administrative grouping. At present, people who use multiple Databases
> face many problems - they aren't as separate as you'd like, but
> neither can they be ignored when required.
>
> The idea of "one main database per session" is fine, but wiring it so
> closely into the backend has a few disadvantages, many of them weird
> internal things.
>
> Are there arguments against those requirements before we spend time on
> design/thinking?

From my perspective it is a great advantage that a user connected
to one database cannot access objects from a different database
without additional software, no matter what permissions he or she has
(short of superuser, who can do anything).

This enables us to have many different databases in one cluster
without having to worry that they can affect each other.

If you need different applications to routinely access each other's
tables, why not assign them to different schemas in one database?

For those cases where you absolutely need access to a different
database, you can use dblink or a foreign data wrapper (hopefully
in 9.3).

So -1 on that particular suggestion.

Yours,
Laurenz Albe


Re: Per-Database Roles

From
Thom Brown
Date:
On 22 May 2012 14:04, Stephen Frost <sfrost@snowman.net> wrote:

> What would the semantics of that look like though?  Which is "preferred"
> when you do a 'grant select' or 'grant role'?  Or do we just disallow
> overlaps between per-DB roles and global roles?  If we don't allow
> duplicates, I suspect a lot of the other questions suddenly become a lot
> easier to deal with, but would that be too much of a restriction?  How
> would you handle migrating an existing global role to a per-database
> role?

Perhaps:

CREATE [ GLOBAL | LOCAL ] ROLE name [ LIKE role_name ] [ [ WITH ]
option [ ... ] ]

Then:

CREATE LOCAL ROLE localrolename LIKE globalrolename;

REASSIGN OWNED BY globalrolename TO localrolename;

Conflicts would occur where localrolename matches an existing local
role name within the same database, or a global role name, but not a
local role name within another database.  The problem with this,
however, is that creating global roles would need conflict checks
against local roles in every database, unless a manifest of all local
roles were registered globally.

--
Thom


Re: Per-Database Roles

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> Conflicts would occur where localrolename matches an existing local
> role name within the same database, or a global role name, but not a
> local role name within another database.  The problem with this,
> however, is that creating global roles would need conflict checks
> against local roles in every database, unless a manifest of all local
> roles were registered globally.

Yeah.  The same type of issue arises for the roles' OIDs.  You'd really
want local and global roles to have nonconflicting OIDs, else it's
necessary to carry around an indication of which type each role is;
which would be more or less a show-stopper in terms of the number of
catalogs and internal APIs affected.  But I don't currently see any
nice way to guarantee that if each database has a private table of
local roles.

You could possibly make it work if all roles, local and global, are
stored in a single shared catalog.  But that seems pretty ugly.

BTW, I wonder whether this type of problem isn't also pretty fatal for
the sorts of hierarchical catalog structures we were speculating about
at PGCon.  When we were talking about that I was sort of assuming that
the more-closely-nested levels could just hide conflicting objects at
outer levels, but on reflection that seems a bit scary.
        regards, tom lane


Re: Per-Database Roles

From
Robert Haas
Date:
On Tue, May 22, 2012 at 9:37 AM, Thom Brown <thom@linux.com> wrote:
> On 22 May 2012 14:04, Stephen Frost <sfrost@snowman.net> wrote:
>> What would the semantics of that look like though?  Which is "preferred"
>> when you do a 'grant select' or 'grant role'?  Or do we just disallow
>> overlaps between per-DB roles and global roles?  If we don't allow
>> duplicates, I suspect a lot of the other questions suddenly become a lot
>> easier to deal with, but would that be too much of a restriction?  How
>> would you handle migrating an existing global role to a per-database
>> role?
>
> Perhaps:
>
> CREATE [ GLOBAL | LOCAL ] ROLE name [ LIKE role_name ] [ [ WITH ]
> option [ ... ] ]
>
> Then:
>
> CREATE LOCAL ROLE localrolename LIKE globalrolename;
>
> REASSIGN OWNED BY globalrolename TO localrolename;
>
> Conflicts would occur where localrolename matches an existing local
> role name within the same database, or a global role name, but not a
> local role name within another database.  The problem with this,
> however, is that creating global roles would need conflict checks
> against local roles in every database, unless a manifest of all local
> roles were registered globally.

There are race conditions to worry about, too.  In most cases, we rely
on the btree index machinery as a final backstop against duplicate
catalog entries.  But that doesn't work unless everything's in one
catalog, nor for anything more complicated than "this set of columns
taken together should be unique over every role".  Even if we were OK
with incurring the ugliness of storing per-database roles in a shared
catalog, the uniqueness constraint you'd want is something like "no
two roles can share the same name unless they have unequal database
OIDs neither of which is zero", which I don't believe we can enforce
via the btree machinery, at least not without an expression index that
won't work in a system catalog anyway.

In retrospect, I think the idea of shared catalogs was probably a bad
idea.  I think we should have made roles and tablespaces database
objects rather than shared objects, and come up with some ad-hoc
method of representing the set of available databases.  But that
decision seems to have been made sometime pre-1996, so the thought of
changing it now is pretty painful, but I can dream...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Per-Database Roles

From
Stephen Frost
Date:
* Thom Brown (thom@linux.com) wrote:
> Conflicts would occur where localrolename matches an existing local
> role name within the same database, or a global role name, but not a
> local role name within another database.  The problem with this,
> however, is that creating global roles would need conflict checks
> against local roles in every database, unless a manifest of all local
> roles were registered globally.

Hmm, right, that's a bit of a sticky point.  wrt your suggestion- it
works great if we don't allow duplicates, but then people have to accept
their role name is getting changed.  That said, perhaps that's not that
big of a deal, since you could rename it afterwards.

The issue with the constraints on other databases might actually be
cause enough to allow duplicates, just to avoid that issue..  We could
mirror the per-database roles into a shared space that isn't externally
visible, but at that point, maybe we should try to get RLS for the
catalogs instead, or just modify the views to only show roles which can
connect to this database.  That's not going to make them completely
transparent, but it might be enough for some use cases.
Thanks,
    Stephen

Re: Changing the concept of a DATABASE

From
Susanne Ebrecht
Date:
Am 22.05.2012 15:27, schrieb Albe Laurenz:
> If you need different applications to routinely access each other's
> tables, why not assign them to different schemas in one database?

The use case in my mind for accessing more databases is when you want to
access stuff different languages.

You only can set encoding / LC_Collate per database not per schema.

So for different languages you might need different databases to do 
correct sorting / indexing.

Susanne

-- 
Dipl. Inf. Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com



Re: Changing the concept of a DATABASE

From
Tom Lane
Date:
Susanne Ebrecht <susanne@2ndquadrant.com> writes:
> The use case in my mind for accessing more databases is when you want to
> access stuff different languages.

> You only can set encoding / LC_Collate per database not per schema.

> So for different languages you might need different databases to do 
> correct sorting / indexing.

Encoding yes, but since 9.1 we have pretty fine-grained control of
collation.  So I think this argument is a lot weaker than it used
to be.  It would only really apply if you have one of the corner
cases where utf8 doesn't work for you.
        regards, tom lane


Re: Per-Database Roles

From
Florian Pflug
Date:
On May22, 2012, at 16:09 , Tom Lane wrote:
> Thom Brown <thom@linux.com> writes:
>> Conflicts would occur where localrolename matches an existing local
>> role name within the same database, or a global role name, but not a
>> local role name within another database.  The problem with this,
>> however, is that creating global roles would need conflict checks
>> against local roles in every database, unless a manifest of all local
>> roles were registered globally.
> 
> Yeah.  The same type of issue arises for the roles' OIDs.  You'd really
> want local and global roles to have nonconflicting OIDs, else it's
> necessary to carry around an indication of which type each role is;
> which would be more or less a show-stopper in terms of the number of
> catalogs and internal APIs affected.  But I don't currently see any
> nice way to guarantee that if each database has a private table of
> local roles.

Maybe we could simply make all global role's OIDs even, and all local ones
odd, or something like that.

best regards,
Florian Pflug



Re: Changing the concept of a DATABASE

From
Susanne Ebrecht
Date:
Am 22.05.2012 17:42, schrieb Tom Lane:
> Encoding yes, but since 9.1 we have pretty fine-grained control of
> collation.  So I think this argument is a lot weaker than it used
> to be.  It would only really apply if you have one of the corner
> cases where utf8 doesn't work for you.

Yeah it got better - but it isn't perfect yet.

Maybe I am blind or 9.1 documentation has a bug - but according to the
documentation you can't change default collation per schema or per table.
You can set collation per column - but do you really want to set 
collation for
every single column of every single supported language in your 200+ tables
web tool?

That is a huge effort and a huge maintenance effort.

Usually you want to set the collation once per language schema. E.g. schema
russian gets Russian collation and schema British gets British collation 
and so on.

CREATE SCHEMA foo LC_COLLATE bar isn't supported so you went up a level and
do it by creating a database.

I would like to get default collation per schema / table in 9.2 or 9.3 
but that is my personal
wish,

Susanne


-- 
Dipl. Inf. Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com



Re: Per-Database Roles

From
Thom Brown
Date:
On 22 May 2012 16:57, Florian Pflug <fgp@phlo.org> wrote:
> On May22, 2012, at 16:09 , Tom Lane wrote:
>> Thom Brown <thom@linux.com> writes:
>>> Conflicts would occur where localrolename matches an existing local
>>> role name within the same database, or a global role name, but not a
>>> local role name within another database.  The problem with this,
>>> however, is that creating global roles would need conflict checks
>>> against local roles in every database, unless a manifest of all local
>>> roles were registered globally.
>>
>> Yeah.  The same type of issue arises for the roles' OIDs.  You'd really
>> want local and global roles to have nonconflicting OIDs, else it's
>> necessary to carry around an indication of which type each role is;
>> which would be more or less a show-stopper in terms of the number of
>> catalogs and internal APIs affected.  But I don't currently see any
>> nice way to guarantee that if each database has a private table of
>> local roles.
>
> Maybe we could simply make all global role's OIDs even, and all local ones
> odd, or something like that.

Wouldn't that instantly make all previous versions of database
clusters un-upgradable?

--
Thom


Re: Changing the concept of a DATABASE

From
Robert Haas
Date:
On Tue, May 22, 2012 at 12:00 PM, Susanne Ebrecht
<susanne@2ndquadrant.com> wrote:
> Usually you want to set the collation once per language schema. E.g. schema
> russian gets Russian collation and schema British gets British collation and
> so on.
>
> CREATE SCHEMA foo LC_COLLATE bar isn't supported so you went up a level and
> do it by creating a database.
>
> I would like to get default collation per schema / table in 9.2 or 9.3 but
> that is my personal
> wish,

Interesting idea.  Sort of like ALTER DEFAULT PRIVILEGES.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Changing the concept of a DATABASE

From
Florian Pflug
Date:
On May22, 2012, at 18:00 , Susanne Ebrecht wrote:
> Am 22.05.2012 17:42, schrieb Tom Lane:
>> Encoding yes, but since 9.1 we have pretty fine-grained control of
>> collation.  So I think this argument is a lot weaker than it used
>> to be.  It would only really apply if you have one of the corner
>> cases where utf8 doesn't work for you.
>
> Yeah it got better - but it isn't perfect yet.

Still, the whole reason that the encoding is a per-database property is
that we don't tag each string with its encoding. To support cross-database
queries, we'd have to do that, and could then just as well make the encoding
a per-column property, or at least so I think.

> Maybe I am blind or 9.1 documentation has a bug - but according to the
> documentation you can't change default collation per schema or per table.
> You can set collation per column - but do you really want to set collation for
> every single column of every single supported language in your 200+ tables
> web tool?
>
> That is a huge effort and a huge maintenance effort.

You could always write at pl/pgsql function which iterates over all columns
of type text or varchar within a schema and sets the desired collation, but

> I would like to get default collation per schema / table in 9.2 or 9.3 but
> that is my personal wish,

yeah, that'd definitely be nice.

best regards,
Florian Pflug



Re: Per-Database Roles

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> On 22 May 2012 16:57, Florian Pflug <fgp@phlo.org> wrote:
>> Maybe we could simply make all global role's OIDs even, and all local ones
>> odd, or something like that.

> Wouldn't that instantly make all previous versions of database
> clusters un-upgradable?

IIRC, pg_upgrade doesn't need to force role OIDs to be the same in the
new cluster, so we could get away with this trick for the specific case
of roles.  It wouldn't work for introducing local/global versions of
some other types of objects though.

Another objection is that it wouldn't scale up nicely to multiple levels
of catalog hierarchy.  But maybe local/global is enough.
        regards, tom lane


Re: Per-Database Roles

From
Florian Pflug
Date:
On May22, 2012, at 18:03 , Thom Brown wrote:
> On 22 May 2012 16:57, Florian Pflug <fgp@phlo.org> wrote:
>> On May22, 2012, at 16:09 , Tom Lane wrote:
>>> Thom Brown <thom@linux.com> writes:
>>>> Conflicts would occur where localrolename matches an existing local
>>>> role name within the same database, or a global role name, but not a
>>>> local role name within another database.  The problem with this,
>>>> however, is that creating global roles would need conflict checks
>>>> against local roles in every database, unless a manifest of all local
>>>> roles were registered globally.
>>>
>>> Yeah.  The same type of issue arises for the roles' OIDs.  You'd really
>>> want local and global roles to have nonconflicting OIDs, else it's
>>> necessary to carry around an indication of which type each role is;
>>> which would be more or less a show-stopper in terms of the number of
>>> catalogs and internal APIs affected.  But I don't currently see any
>>> nice way to guarantee that if each database has a private table of
>>> local roles.
>>
>> Maybe we could simply make all global role's OIDs even, and all local ones
>> odd, or something like that.
>
> Wouldn't that instantly make all previous versions of database
> clusters un-upgradable?

Only if pg_upgrade needs to preserve the OIDs of roles. I kinda hoped it
wouldn't, because role OIDs aren't usually stored in non-system tables.

Hm… thinking about this further, it'd actually be sufficient for all newly
allocated role OIDs to follow the odd/even rule, if we additionally check
for conflicts with existing global role OIDs when allocating the OID of a new
local role. Which is much, much easier than checking for conflicts when
allocating a global OIDs, because for that you'd have to check against the
local role OIDs within *all* databases, not just against one shared table.

best regards,
Florian Pflug



Re: Changing the concept of a DATABASE

From
Josh Berkus
Date:
> 1. Ability to have a Role that can only access one Database
> 
> 2. Allow user info to be dumped with a database, to make a db
> completely self-consistent
> 
> 3. Allow databases to be transportable
> 
> 4. Allow users to access tables in >1 database easily, with appropriate rights.

The last requirement seems completely contradictory to the other three.Either we're trying to make databases even more
isolatedas
 
multi-tenant Catalogs, or we're not.  Trying to do both at the same time
is failure-by-design.

Given that we offer schema as an alternative to multiple databases, and
users are starting to get used to them, I think that requirement (4) is
just a bad idea, and not worth pursuing, except in the context of pgsql_fdw.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Changing the concept of a DATABASE

From
Simon Riggs
Date:
On 22 May 2012 14:05, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, May 22, 2012 at 8:40 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> That seems to be leaving aside the fact that we don't currently have any
>> notion of how to allow FDWs to write the foreign tables.
>>
>> What is more, isn't the postgres FDW about talking to any postgres source?
>> If so, does it have special smarts for when we are talking to ourselves? And
>> if it doesn't then it seems unlikely to be an acceptable substitute for
>> allowing talking direct to a sibling database.
>>
>> I'm not at all yet sold on Simon's plan, but I'm skeptical that an FDW would
>> adequately meet the case if we wanted to go that way.
>
> Well, I don't think anyone is claiming that FDWs as they exist today
> solve all of the problems in this area.  But I think that improving
> FDWs is a more promising line of attack than trying to make backends
> talk to multiple databases.  Doing the latter will require massive
> surgery on the relcache, the catcache, most of the secondary catalog
> caches, the ProcArray, and every portion of the backend that thinks an
> OID uniquely identifies an SQL object.  Basically, they'd all need
> database OID as an additional key field, which is undesirable for
> performance reasons even if there were no issue of code churn.

Ack, part from the bit about OIDs no longer being unique. That might
be an upgrade issue but its obviously something we wouldn't allow if
we did that.

I'm not sure I see changing the caches as being massive surgery.
Perhaps we could just bypass them altogether.

> So I'd rather see us put the effort into pgsql_fdw, which, as Florian
> says, will also let us talk to a completely separate server.  If
> you've got multiple databases in the same cluster and really need to
> be doing queries across all of them, that's what schemas are
> supposedly for.  Now I know that people feel that doesn't work as well
> as it needs to, but there again I think it would be easier to fix
> schemas than to make cross-database queries work.

We're a very long way from making that work well. IMHO easily much
further than direct access.

If I have a customer with 1 database per user, how do they run a query
against 100 user tables? It would require 100 connections to the
database. Doing that would require roughly x100 the planning time and
x100 the connection delay. Larger SQL statements pass their results
between executor steps using libpq rather than direct calls.

I find it hard to believe FDWs will ever work sufficiently well to fix
those problems.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Per-Database Roles

From
Josh Berkus
Date:
Stephen,

> Which is "preferred"
> when you do a 'grant select' or 'grant role'?  

The local role is preferred, the same way we allow objects in the local
schema to overshadow objects in the global schema.

> Or do we just disallow
> overlaps between per-DB roles and global roles?  If we don't allow
> duplicates, I suspect a lot of the other questions suddenly become a lot
> easier to deal with, but would that be too much of a restriction?

The feature wouldn't be useful if we didn't allow conflicts between two
local role names.  However, we could prohibit conflicts between a local
role name and a global role name if it made the feature considerably
easier.  Users would find workarounds which weren't too arduous.

> How
> would you handle migrating an existing global role to a per-database
> role?

Again, I think it would be OK not handling it.  i.e., the user needs to
do the following:

1. create a new local role
2. reassign all the objects belonging to the global role to the local role
3. drop the global role
4. rename the local role

It'd be somewhat of a PITA, but I suspect that most people using the
"local roles" feature would recreate their databases from scratch
anyway.  And we could offer some sample scripts for the above on the
wiki and elsewhere.  Obviously, a more elegant migration command would
be ideal, but that could wait for the following PG release; we usually
follow the "make things possible first, and easy later" plan anyway.

Given that I'd love to have this feature, I'm trying to pare down its
requirements to a managable size.  Trying to do everything at once will
only result in the feature stalling until 10.5.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Changing the concept of a DATABASE

From
Simon Riggs
Date:
On 22 May 2012 18:18, Josh Berkus <josh@agliodbs.com> wrote:
>
>> 1. Ability to have a Role that can only access one Database
>>
>> 2. Allow user info to be dumped with a database, to make a db
>> completely self-consistent
>>
>> 3. Allow databases to be transportable
>>
>> 4. Allow users to access tables in >1 database easily, with appropriate rights.
>
> The last requirement seems completely contradictory to the other three.
>  Either we're trying to make databases even more isolated as
> multi-tenant Catalogs, or we're not.  Trying to do both at the same time
> is failure-by-design.

Why is it OK to allow somebody to access multiple schema in one query,
but not multiple databases? Are you arguing that schemas are also
broken?

I see no failure by design. I see an idea for greater ease of use
being discussed.

> Given that we offer schema as an alternative to multiple databases, and
> users are starting to get used to them, I think that requirement (4) is
> just a bad idea, and not worth pursuing,

Personally, I have long recommended that people use schemas. But
people do use databases and when they do they are pretty much screwed.
I brought this up as a way of improving our ease of use.

> except in the context of pgsql_fdw.

That is not a realistic option.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Changing the concept of a DATABASE

From
Josh Berkus
Date:
> If I have a customer with 1 database per user, how do they run a query
> against 100 user tables? It would require 100 connections to the
> database. Doing that would require roughly x100 the planning time and
> x100 the connection delay. Larger SQL statements pass their results
> between executor steps using libpq rather than direct calls.

Why is this hypothetical customer using separate databases?  This really
seems like a case of "doctor, it hurts when I do this".

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Changing the concept of a DATABASE

From
Simon Riggs
Date:
On 22 May 2012 18:35, Josh Berkus <josh@agliodbs.com> wrote:
>
>> If I have a customer with 1 database per user, how do they run a query
>> against 100 user tables? It would require 100 connections to the
>> database. Doing that would require roughly x100 the planning time and
>> x100 the connection delay. Larger SQL statements pass their results
>> between executor steps using libpq rather than direct calls.
>
> Why is this hypothetical customer using separate databases?  This really
> seems like a case of "doctor, it hurts when I do this".

Databases are great for separating things, but sometimes you want to
un-separate them in a practical way.

I'm surprised that you're so negative about an ease of use feature. I
had understood you cared about fixing problems experienced by our
developers.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Changing the concept of a DATABASE

From
Robert Haas
Date:
On Tue, May 22, 2012 at 1:27 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Ack, part from the bit about OIDs no longer being unique. That might
> be an upgrade issue but its obviously something we wouldn't allow if
> we did that.

And how exactly are you going to disallow that?  We currently enforce
the uniqueness of OIDs within a database by putting a unique index on
the relevant system catalog columns, but that clearly won't work as a
means of guaranteeing cross-database uniqueness.  Unless of course you
put all the catalog entries from all the databases into a single set
of catalogs; but then they're pretty much the same as the schemas we
already have.

> I'm not sure I see changing the caches as being massive surgery.
> Perhaps we could just bypass them altogether.

You're entitled to your opinion on this one, but we have those caches
for a very good reason: the system runs about 20,000 times slower
without them.

>> So I'd rather see us put the effort into pgsql_fdw, which, as Florian
>> says, will also let us talk to a completely separate server.  If
>> you've got multiple databases in the same cluster and really need to
>> be doing queries across all of them, that's what schemas are
>> supposedly for.  Now I know that people feel that doesn't work as well
>> as it needs to, but there again I think it would be easier to fix
>> schemas than to make cross-database queries work.
>
> We're a very long way from making that work well. IMHO easily much
> further than direct access.
>
> If I have a customer with 1 database per user, how do they run a query
> against 100 user tables? It would require 100 connections to the
> database. Doing that would require roughly x100 the planning time and
> x100 the connection delay. Larger SQL statements pass their results
> between executor steps using libpq rather than direct calls.
>
> I find it hard to believe FDWs will ever work sufficiently well to fix
> those problems.

If you have got a customer with one database per user, and yet you
want to run queries across all those tables, then you're using the
database system for something for which it is not designed, and it's
probably not going to work very well.  That sounds like a use case for
schemas, or maybe better, some kind of row-level security we don't
have yet - but hopefully will in 9.3, since KaiGai intends to work on
it.  Databases are designed to act as a firewall, so that somebody
using one database isn't affected by what happen in some other
database.  Unfortunately, because of shared catalogs, that's not
completely true, but that's the idea, and if anything we need to
isolate them more, not less.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Changing the concept of a DATABASE

From
Daniel Farina
Date:
On Tue, May 22, 2012 at 10:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 22 May 2012 18:35, Josh Berkus <josh@agliodbs.com> wrote:
>>
>>> If I have a customer with 1 database per user, how do they run a query
>>> against 100 user tables? It would require 100 connections to the
>>> database. Doing that would require roughly x100 the planning time and
>>> x100 the connection delay. Larger SQL statements pass their results
>>> between executor steps using libpq rather than direct calls.
>>
>> Why is this hypothetical customer using separate databases?  This really
>> seems like a case of "doctor, it hurts when I do this".
>
> Databases are great for separating things, but sometimes you want to
> un-separate them in a practical way.

In my experience, these un-separations are (thankfully) relieved of
the requirement of consistency between databases, and so the contract
is much more favorable.

The planning time problem is quite hard.

However, I think the connection-delay one is not as hard a one to
answer: I think multiplexed protocols are going to become the norm in
the near future (they have been a pretty uncontested part of the SPDY
protocol, for example, after flow control was added) and have a number
of useful properties, and it may be time to consider how we're going
to divorce the notion of one socket implies exactly one backend.

--
fdr


Re: Changing the concept of a DATABASE

From
Josh Berkus
Date:
> Why is it OK to allow somebody to access multiple schema in one query,
> but not multiple databases? Are you arguing that schemas are also
> broken?

Because the purpose of a database is to be a Catalog, i.e. an isolated
container, which is not the purpose of schemas.  To the extent to which
we can make the isolation a reality (instead of "lossy isolation" the
way it is now), we can enable many multitenant hosting designs which
aren't currently possible.  However, pursuing interdatabase queries at
the same time we try to add new isolation features is a doomed effort.

For example, if we created local database users (see other thread), then
what would happen if a local user tries to execute a cross-database
query?  If we enable physical migration of a database to a new Postgres
instance, what happens to standing multi-database views?  If
interdatabase queries are allowed, how do I, as a hosting operator, make
sure that users can't even see the other databases on the system?

> I see no failure by design. I see an idea for greater ease of use
> being discussed.

You can't attempt mutually contradictory requirements and expect to
succeed, or to improve ease of use.  You can't ride two horses,
especially if they're going in opposite directions.

> Personally, I have long recommended that people use schemas. But
> people do use databases and when they do they are pretty much screwed.
> I brought this up as a way of improving our ease of use.

I'm not arguing that we don't have users who would like interdatabase
queries, especially when they port applications from MySQL or MSSQL.  We
have a lot of such users.  However, we *also* have a lot of users who
would like to treat separate databases as virtual private instances of
Postgres, and there's no way to satisfy both goals. We have to choose
one route or the other.

I personally see the isolation case as the more necessary because there
are several workarounds for the "inter-database queries" issue, but
nothing for the "multitenant catalog" case.  Also, treating databases as
catalogs is more consistent with our historical approach, and will thus
break less backwards compatibility.

Maybe interdatabase queries are more useful/desired than catalog
features.  If that's the case, then we need to pursue them and abandon
efforts like per-database users. But we have to make a choice.

An alternative idea -- and one which could be deployed a lot faster --
is to come up with a tool which makes it easy to migrate an entire
database into a separate schema or set of schemas in an existing
database.   And improvements to manage schema visility/path better, I
suppose.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Changing the concept of a DATABASE

From
Daniel Farina
Date:
On Tue, May 22, 2012 at 10:56 AM, Josh Berkus <josh@agliodbs.com> wrote:
> I'm not arguing that we don't have users who would like interdatabase
> queries, especially when they port applications from MySQL or MSSQL.  We
> have a lot of such users.  However, we *also* have a lot of users who
> would like to treat separate databases as virtual private instances of
> Postgres, and there's no way to satisfy both goals. We have to choose
> one route or the other.

I think the idea that a physical machine where catalogs are physically
(shared-everything) co-located is one that will not stand for long as
part of a useful contract between a user and the database.  I'd really
like to avoid an extra tier of functionality that exists only for
databases that happen to land on the same physical machine.

I think any inter-database feature should work identically between two
databases across a network as two machines on one machine/cluster.

Transparent optimizations to deal with the special case of physical
co-location are not contrary to that contract, but I don't have a
sense of how important those optimizations would be before getting a
lot of the usability issues figured out.  Right now, it seems to me
that getting interdatabase usability feeling better is already pretty
hard.

--
fdr


Re: Per-Database Roles

From
Stephen Frost
Date:
* Josh Berkus (josh@agliodbs.com) wrote:
> The local role is preferred, the same way we allow objects in the local
> schema to overshadow objects in the global schema.

I would think we'd want the exact opposite.  I don't want my global
'postgres' user to be overwritten by some local one that the admin of
this particular DB created..

> The feature wouldn't be useful if we didn't allow conflicts between two
> local role names.  However, we could prohibit conflicts between a local
> role name and a global role name if it made the feature considerably
> easier.  Users would find workarounds which weren't too arduous.

Sorry, I was meaning between global space and local space.  Clearly we
must allow and handle cleanly overlaps between local spaces.

The issue with not allowing global spaces to overlap local ones is that
we'd have to check every local list when creating a global account;
that doesn't seem very easy to do.  On the flip side, allowing
duplicates between global and local would remove the need to check local
lists when creating global accounts, but would add complexity and could
lead to odd semantics when there is a duplicate.

> 1. create a new local role
> 2. reassign all the objects belonging to the global role to the local role
> 3. drop the global role
> 4. rename the local role

Right, that seems like it would work fine.

> It'd be somewhat of a PITA, but I suspect that most people using the
> "local roles" feature would recreate their databases from scratch
> anyway.  And we could offer some sample scripts for the above on the
> wiki and elsewhere.  Obviously, a more elegant migration command would
> be ideal, but that could wait for the following PG release; we usually
> follow the "make things possible first, and easy later" plan anyway.


Sure.

> Given that I'd love to have this feature, I'm trying to pare down its
> requirements to a managable size.  Trying to do everything at once will
> only result in the feature stalling until 10.5.

If you could help me work out the semantics and the high-level issues,
I'd love to spend time on this for 9.3...
Thanks,
    Stephen

Re: Changing the concept of a DATABASE

From
Simon Riggs
Date:
On 22 May 2012 18:56, Josh Berkus <josh@agliodbs.com> wrote:

> I'm not arguing that we don't have users who would like interdatabase
> queries, especially when they port applications from MySQL or MSSQL.  We
> have a lot of such users.

Lots and lots, yes.

> However, we *also* have a lot of users who
> would like to treat separate databases as virtual private instances of
> Postgres, and there's no way to satisfy both goals. We have to choose
> one route or the other.

That's only true if you try to satisfy both goals at once, which I'm
not suggesting. So I believe that proposition to be false.

However, given sufficient people speaking against it, I'll leave this idea.

Though I'd suggest that people on this thread spend a little quality
time with FDWs. It's a great direction but there's a long way to go
yet. Sorry to Laurenz, who's done a great job so far on the Oracle
FDW.

> I personally see the isolation case as the more necessary because there
> are several workarounds for the "inter-database queries" issue

I also want that, per my original post.

> An alternative idea -- and one which could be deployed a lot faster --
> is to come up with a tool which makes it easy to migrate an entire
> database into a separate schema or set of schemas in an existing
> database.   And improvements to manage schema visility/path better, I
> suppose.

Yes, it is possible to improve things there also.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Per-Database Roles

From
Florian Pflug
Date:
On May22, 2012, at 22:35 , Stephen Frost wrote:
> * Josh Berkus (josh@agliodbs.com) wrote:
>> The local role is preferred, the same way we allow objects in the local
>> schema to overshadow objects in the global schema.
>
> I would think we'd want the exact opposite.  I don't want my global
> 'postgres' user to be overwritten by some local one that the admin of
> this particular DB created..

From a security POV, yup, you'd want global roles to take precedence.
But OTOH, you wouldn't want your application to suddenly break because
someone created a global role which shadows the local role you've created
a year ago.

Hm… thinking about this… No matter which takes precedence, we'd need some
way to explicitly specify global or local scope anyway. And we'd have to
prevent roles from being named in a way that conflicts with whatever
explicit specification we come up with, even if that causes pain for some
unlucky existing users. Avoiding these conflicts entirely is going to be
impossible, I fear, since we don't currently restrict role names in any way
AFAIK, and we store them in GUCs without any quoting.

So maybe we should just pick some qualification like prefixing local roles
with 'local.', forbid global roles from starting with 'local.', and be done
with it? Not the most elegant solution maybe, but it avoids surprises...

> The issue with not allowing global spaces to overlap local ones is that
> we'd have to check every local list when creating a global account;
> that doesn't seem very easy to do.

"Not very easy" is quite an understatement, I fear. "Very nearly impossible"
is more like it IMHO.

best regards,
Florian Pflug



Re: Per-Database Roles

From
Josh Berkus
Date:
> The issue with not allowing global spaces to overlap local ones is that
> we'd have to check every local list when creating a global account;
> that doesn't seem very easy to do.  On the flip side, allowing
> duplicates between global and local would remove the need to check local
> lists when creating global accounts, but would add complexity and could
> lead to odd semantics when there is a duplicate.

On the other hand, keep in mind that creating a global account can be
slow.  For anyone who has a huge multi-tenant setup with 200 database
each with their own local users, creating a new global account will be
an event which occurs once or twice a year.  Just so that we don't pay
the same check cost for people who don't use local accounts.

> If you could help me work out the semantics and the high-level issues,
> I'd love to spend time on this for 9.3...

Syntax seems simple: CREATE LOCAL ROLE ...

For that matter, let's keep other things simple:

1. local roles can inherit only from other local roles
2. global roles can inherit only from other global roles
3. only a global role can be a database owner


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Changing the concept of a DATABASE

From
Josh Berkus
Date:
> That's only true if you try to satisfy both goals at once, which I'm
> not suggesting. So I believe that proposition to be false.

Oh, ok.  Per your original email and follow-up arguments, you seemed to
be doing just that.

>> An alternative idea -- and one which could be deployed a lot faster --
>> is to come up with a tool which makes it easy to migrate an entire
>> database into a separate schema or set of schemas in an existing
>> database.   And improvements to manage schema visility/path better, I
>> suppose.
> 
> Yes, it is possible to improve things there also.

Feh, and nested schema, for that matter.  So, there's a fair bit of work
wherever we bite it off.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Per-Database Roles

From
Christopher Browne
Date:
On Tue, May 22, 2012 at 4:35 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Josh Berkus (josh@agliodbs.com) wrote:
>> The local role is preferred, the same way we allow objects in the local
>> schema to overshadow objects in the global schema.
>
> I would think we'd want the exact opposite.  I don't want my global
> 'postgres' user to be overwritten by some local one that the admin of
> this particular DB created..

In object-orientedness, the usual behaviour is for more specific
methods to override the more generic ones, which is reasonable.  I'm
not certain which direction is more to be preferred, whether:
a) To consider the global user as a default, to be overridden if possible, or
b) To consider the local user as the default, to be overridden if possible.
They're both tenable positions.

But I think I agree with Stephen, that what's desirable, with global
users, is to use them as the override.  They're gonna be expensive,
you should get something for the price :-).
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


Re: Per-Database Roles

From
Stephen Frost
Date:
On May 22, 2012, at 12:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Another objection is that it wouldn't scale up nicely to multiple levels
> of catalog hierarchy.  But maybe local/global is enough.

That would be a huge improvement and this wouldn't get in the way of
any solution to the global oid conflict issue coming along in the
future..

We still have the issue of name conflict between the global roles and
the local roles, right?  Unless we allow those to overlap and pick
whatever is "closest"? Then there is the question of role membership
and if we would allow that to go between local and global or what the
semantics of that would look like..

Thanks,

Stephen


Re: Per-Database Roles

From
Florian Pflug
Date:
On May23, 2012, at 00:21 , Josh Berkus wrote:
>> If you could help me work out the semantics and the high-level issues,
>> I'd love to spend time on this for 9.3...
> 
> Syntax seems simple: CREATE LOCAL ROLE ...
> 
> For that matter, let's keep other things simple:
> 
> 1. local roles can inherit only from other local roles
> 2. global roles can inherit only from other global roles
> 3. only a global role can be a database owner

+1 to (2) and (3).

If we do (1), we need to allow granting privileges on global objects
(think tablespaces) to local roles. Otherwise, how would you make a
tablespace usable to a certain local role? That, however, seems
difficult, because we wouldn't want local role OIDs to appear in
pg_tablespace's spacl column.

If we, OTOH, allowed local roles to inherit from global roles, we
could still forbit such GRANTs. One would then simply have to create a
global proxy role which has the necessary privileges, and which local roles
would inherit from instead of being granted the privileges directly.

Allowing (1) doesn't seem that hard, because it just requires us to
be able to store dependencies of local roles on global roles, which
pg_shdepend already (almost, we'd probably need to add deptype
SHARED_DEPENDENCY_MEMBER) support.

best regards,
Florian Pflug



Re: Changing the concept of a DATABASE

From
Peter Eisentraut
Date:
On tis, 2012-05-22 at 18:00 +0200, Susanne Ebrecht wrote:
> CREATE SCHEMA foo LC_COLLATE bar isn't supported so you went up a
> level and do it by creating a database.
> 
> I would like to get default collation per schema / table in 9.2 or 9.3
> but that is my personal wish,

Another way I've been thinking about is to have a session collation.
Then you don't have to decide about it when the tables are created, but
the application or user interface can decide it at run time without
having to hardcode it into all the queries.

This would effectively take the place of being able to say SET
lc_collate at run time.



Re: Per-Database Roles

From
Stephen Frost
Date:
On May 22, 2012, at 12:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Another objection is that it wouldn't scale up nicely to multiple levels
> of catalog hierarchy.  But maybe local/global is enough.

That would be a huge improvement and this wouldn't get in the way of
any solution to the global oid conflict issue coming along in the
future..

We still have the issue of name conflict between the global roles and
the local roles, right?  Unless we allow those to overlap and pick
whatever is "closest"? Then there is the question of role membership
and if we would allow that to go between local and global or what the
semantics of that would look like..

Thanks,

Stephen

Re: Changing the concept of a DATABASE

From
Josh Berkus
Date:
Simon,

> However, given sufficient people speaking against it, I'll leave this idea.

Well, I *will* point out that you have your work cut out for you on 9.3
already ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Changing the concept of a DATABASE

From
Simon Riggs
Date:
On 23 May 2012 21:15, Josh Berkus <josh@agliodbs.com> wrote:

>> However, given sufficient people speaking against it, I'll leave this idea.
>
> Well, I *will* point out that you have your work cut out for you on 9.3
> already ...

Yes, we do. It would be best to conclude that things I do on hackers
relate in some way to those goals, even if it isn't immediately clear
how.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Changing the concept of a DATABASE

From
"David E. Wheeler"
Date:
On May 23, 2012, at 1:55 PM, Simon Riggs wrote:

>> Well, I *will* point out that you have your work cut out for you on 9.3
>> already ...
>
> Yes, we do. It would be best to conclude that things I do on hackers
> relate in some way to those goals, even if it isn't immediately clear
> how.

Simon moves in mysterious ways…

David



Re: Changing the concept of a DATABASE

From
Susanne Ebrecht
Date:
Am 22.05.2012 15:27, schrieb Albe Laurenz:
> If you need different applications to routinely access each other's 
> tables, why not assign them to different schemas in one database?

I just saw another use case here.

There are lots of offices / departments creating maps. Topography maps,
pipeline maps, nature conservancy (e.g. where are the nests from endangered
birds?), mineral resources, wire maps, street maps, bicycle / jogging maps,
tourists maps, tree maps, cadastral land register, and so on.

All this departments have their own databases for their own maps.
They only map their own stuff.

Towns / states / regions have a department where all these maps get 
collected.

You can go to your town and ask for weird maps today - e.g. a map with 
all jogging
routes and waste water pipes but without autobahns.

You could say that you have one database per layer.

As I said - I saw this construction in real world outside. I am pretty 
sure that other
states maybe have other solutions but the described solution exist.

Susanne

-- 
Dipl. Inf. Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com



Re: Changing the concept of a DATABASE

From
Florian Pflug
Date:
On May24, 2012, at 11:39 , Susanne Ebrecht wrote:
> There are lots of offices / departments creating maps. Topography maps,
> pipeline maps, nature conservancy (e.g. where are the nests from endangered
> birds?), mineral resources, wire maps, street maps, bicycle / jogging maps,
> tourists maps, tree maps, cadastral land register, and so on.
>
> All this departments have their own databases for their own maps.
> They only map their own stuff.
>
> Towns / states / regions have a department where all these maps get collected.

The question is, how do they get collected? If they use some home-grown replication,
they might just as well collect them into schemas instead of databases. The same is
possible with slony, I think. And if they use WAL-based replication, they have no
choice but to collect them in different clusters, so cross-database queries within
a cluster wouldn't help.

I think that you're right that reporting would one of the main use-cases for cross-
database queries. But reporting is also, I think, one of the main uses-cases for
WAL-based replication. So having cross-database queries with don't allow queries
across multiple replicas will leave quite a few people out in the cold.

best regards,
Florian Pflug



Re: Changing the concept of a DATABASE

From
Josh Berkus
Date:
> Yes, we do. It would be best to conclude that things I do on hackers
> relate in some way to those goals, even if it isn't immediately clear
> how.

See, now you've got me all curious.  How does inter-DB queries relate to
the New Replication?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Changing the concept of a DATABASE

From
Andres Freund
Date:
On Thursday, May 24, 2012 08:12:56 PM Josh Berkus wrote:
> > Yes, we do. It would be best to conclude that things I do on hackers
> > relate in some way to those goals, even if it isn't immediately clear
> > how.
> See, now you've got me all curious.  How does inter-DB queries relate to
> the New Replication?
Being able to apply changes from one process to multiple databases reduces 
implementation/runtime overhead...

Andres


Re: Per-Database Roles

From
Bruce Momjian
Date:
On Tue, May 22, 2012 at 10:19:12AM -0400, Robert Haas wrote:
> In retrospect, I think the idea of shared catalogs was probably a bad
> idea.  I think we should have made roles and tablespaces database
> objects rather than shared objects, and come up with some ad-hoc
> method of representing the set of available databases.  But that
> decision seems to have been made sometime pre-1996, so the thought of
> changing it now is pretty painful, but I can dream...

Yes, pre-1996.  I think the fact that authentication/user names appear
in pg_hba.conf really locked the user name idea into global objects, and
we have never really been able to make a dent in that.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Per-Database Roles

From
Robert Haas
Date:
On Thu, May 24, 2012 at 6:21 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Tue, May 22, 2012 at 10:19:12AM -0400, Robert Haas wrote:
>> In retrospect, I think the idea of shared catalogs was probably a bad
>> idea.  I think we should have made roles and tablespaces database
>> objects rather than shared objects, and come up with some ad-hoc
>> method of representing the set of available databases.  But that
>> decision seems to have been made sometime pre-1996, so the thought of
>> changing it now is pretty painful, but I can dream...
>
> Yes, pre-1996.  I think the fact that authentication/user names appear
> in pg_hba.conf really locked the user name idea into global objects, and
> we have never really been able to make a dent in that.

Eh?  Why would the presence of usernames in pg_hba.conf mean that they
have to be global objects?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Per-Database Roles

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, May 24, 2012 at 6:21 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > Yes, pre-1996.  I think the fact that authentication/user names appear
> > in pg_hba.conf really locked the user name idea into global objects, and
> > we have never really been able to make a dent in that.
>
> Eh?  Why would the presence of usernames in pg_hba.conf mean that they
> have to be global objects?

I havn't had a chance (yet) to look, but perhaps the current code
attempts to validate the role before figuring out what database is being
requested?  We'd have to essentially invert that, of course, for this..
One thing I was wondering about is if we're going to have an issue
supporting things like "tell me what databases exist" (psql -l), which
connect to the 'postgres' by default, for local-only roles.  I'm not
sure that I actually care, to be honest, but it's something to consider.
I don't think we should require users to create every local role also in
postgres, nor do I feel that we should allow connections to postgres by
any role, nor do I want to break tools which use 'postgres' to basically
get access to shared catalogs- but I don't see an immediate or easy
solution..
Thanks,
    Stephen

Re: Per-Database Roles

From
Bruce Momjian
Date:
On Fri, May 25, 2012 at 10:34:54PM -0400, Stephen Frost wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
> > On Thu, May 24, 2012 at 6:21 PM, Bruce Momjian <bruce@momjian.us> wrote:
> > > Yes, pre-1996.  I think the fact that authentication/user names appear
> > > in pg_hba.conf really locked the user name idea into global objects, and
> > > we have never really been able to make a dent in that.
> > 
> > Eh?  Why would the presence of usernames in pg_hba.conf mean that they
> > have to be global objects?
> 
> I havn't had a chance (yet) to look, but perhaps the current code
> attempts to validate the role before figuring out what database is being
> requested?  We'd have to essentially invert that, of course, for this..
> One thing I was wondering about is if we're going to have an issue
> supporting things like "tell me what databases exist" (psql -l), which
> connect to the 'postgres' by default, for local-only roles.  I'm not
> sure that I actually care, to be honest, but it's something to consider.
> I don't think we should require users to create every local role also in
> postgres, nor do I feel that we should allow connections to postgres by
> any role, nor do I want to break tools which use 'postgres' to basically
> get access to shared catalogs- but I don't see an immediate or easy
> solution..

Yes.  In a simple case, you have a username, you want to validate it
against LDAP or kerberos --- how do you partition the external
authentication tool based on database name?  Seems like an obvious
problem to me.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Per-Database Roles

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> Eh?  Why would the presence of usernames in pg_hba.conf mean that they
>> have to be global objects?

> I havn't had a chance (yet) to look, but perhaps the current code
> attempts to validate the role before figuring out what database is being
> requested?  We'd have to essentially invert that, of course, for this..

Even more to the point, what do you do when the "database" column is
"all", or a list of more than one database name?

It's possible that we could define this away by saying that only
globally known usernames can be listed in pg_hba.conf, but I think
we'll still have implementation problems with doing authentication
for per-database usernames.
        regards, tom lane


Re: Per-Database Roles

From
Peter Eisentraut
Date:
On tis, 2012-05-22 at 10:19 -0400, Robert Haas wrote:
> I think we should have made roles and tablespaces database
> objects rather than shared objects, 

User names are global objects in the SQL standard, which is part of the
reason that the current setup was never seriously challenged.



Re: Per-Database Roles

From
Robert Haas
Date:
On Sun, May 27, 2012 at 2:53 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On tis, 2012-05-22 at 10:19 -0400, Robert Haas wrote:
>> I think we should have made roles and tablespaces database
>> objects rather than shared objects,
>
> User names are global objects in the SQL standard, which is part of the
> reason that the current setup was never seriously challenged.

Does the SQL standard really discriminate between the database and the
cluster?  Wow.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Per-Database Roles

From
Robert Haas
Date:
On Fri, May 25, 2012 at 11:12 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Fri, May 25, 2012 at 10:34:54PM -0400, Stephen Frost wrote:
>> * Robert Haas (robertmhaas@gmail.com) wrote:
>> > On Thu, May 24, 2012 at 6:21 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> > > Yes, pre-1996.  I think the fact that authentication/user names appear
>> > > in pg_hba.conf really locked the user name idea into global objects, and
>> > > we have never really been able to make a dent in that.
>> >
>> > Eh?  Why would the presence of usernames in pg_hba.conf mean that they
>> > have to be global objects?
>>
>> I havn't had a chance (yet) to look, but perhaps the current code
>> attempts to validate the role before figuring out what database is being
>> requested?  We'd have to essentially invert that, of course, for this..
>> One thing I was wondering about is if we're going to have an issue
>> supporting things like "tell me what databases exist" (psql -l), which
>> connect to the 'postgres' by default, for local-only roles.  I'm not
>> sure that I actually care, to be honest, but it's something to consider.
>> I don't think we should require users to create every local role also in
>> postgres, nor do I feel that we should allow connections to postgres by
>> any role, nor do I want to break tools which use 'postgres' to basically
>> get access to shared catalogs- but I don't see an immediate or easy
>> solution..
>
> Yes.  In a simple case, you have a username, you want to validate it
> against LDAP or kerberos --- how do you partition the external
> authentication tool based on database name?  Seems like an obvious
> problem to me.

Well, when people try to connect to database "it", you set up
pg_hba.conf to point them at the Kerberos server, but when they try to
connect to database "sales", you just use MD5 for that.  Or whatever
your site policy happens to be.  I'm not seeing the problem;
pg_hba.conf already allows different authentication methods for
different databases.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company