Thread: Limit on number of users in postgresql?
I want to make a client application that logs onto a postgresql server with an individual postgresql user role for each user. This is as opposed to having one user role for the entire app, storing that password on the local client, and using the client app to do authentication against a table in the connected database. I'm wondering if there's a manageable limit on the number of users in postgresql. I could have conceivably thousands of different users for any particular app and multiple apps in the same server. I would distinguish between different users for different apps by an encoded user name like myapp_myname, which the individual app would treat as user myname. This seems like a persistent problem for a lot of database apps. Is there any simple way of dealing with this problem? A lot of apps store root passwords to the database locally, which is very insecure, especially on desktop applications. It seems like a big headache to declare users globally to the server instead of at the database level. Am I doing more work than I need to do? Thanks.
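For concreteness, the per-user scheme described above might look something like this (a sketch only; every name here is hypothetical):

CREATE ROLE myapp_myname LOGIN PASSWORD 'secret';  -- one server-wide role per app user

CREATE ROLE myapp_users;                 -- group role shared by all users of this app
GRANT myapp_users TO myapp_myname;
GRANT CONNECT ON DATABASE myapp_db TO myapp_users;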
Mark Walker <furface@omnicode.com> writes: > I'm wondering if there's a manageable limit on the number of users in > postgresql. I could have conceivably thousands of different users for > any particular app and multiple apps in the same server. Numbers in the low thousands are probably workable, as long as you aren't adding and removing users at a great rate. I think that the main bottleneck would be the "flat file" that's used to tell the postmaster about the set of valid users --- every time a user is added/dropped/changed, that file gets rewritten and then re-parsed by the postmaster. So you could eat a lot of overhead if you change users every few seconds or something like that. I'd suggest doing some testing to see if there are any unexpected bottlenecks, but I don't see a reason to reject the idea out of hand. (If you find any slownesses, report 'em, and we might be able to fix 'em.) regards, tom lane
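A quick way to run the test Tom suggests (a sketch, assuming PL/pgSQL is installed; the function and role names are made up). Run it as a superuser and drop the roles afterwards:

CREATE FUNCTION make_test_users(n integer) RETURNS void AS $$
BEGIN
  FOR i IN 1..n LOOP
    EXECUTE 'CREATE ROLE testuser_' || i || ' LOGIN';
  END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT make_test_users(5000);  -- time this, then also try adding and dropping roles
                               -- one at a time, since each committed change rewrites
                               -- the flat file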
Thanks Tom. You know I thought about this approach a little more. I don't think there's a simple answer to this security problem short of placing a proxy server application between the clients and the database. The problem with giving database role accounts to each and every user is that the users now have uncontrolled access to the database. That's not good either because I want to do fine-grained access control within my application. The more I think about it, the more I think a proxy app is necessary. It seems like a lot of work just for security issues, but basically most web-based database apps use this model, with the web application acting as a proxy between the database and the client. Thanks. Tom Lane wrote: > Mark Walker <furface@omnicode.com> writes: > >> I'm wondering if there's a manageable limit on the number of users in >> postgresql. I could have conceivably thousands of different users for >> any particular app and multiple apps in the same server. >> > > Numbers in the low thousands are probably workable, as long as you > aren't adding and removing users at a great rate. I think that the > main bottleneck would be the "flat file" that's used to tell the > postmaster about the set of valid users --- every time a user is > added/dropped/changed, that file gets rewritten and then re-parsed > by the postmaster. So you could eat a lot of overhead if you change > users every few seconds or something like that. > > I'd suggest doing some testing to see if there are any unexpected > bottlenecks, but I don't see a reason to reject the idea out of hand. > (If you find any slownesses, report 'em, and we might be able to fix > 'em.) > > regards, tom lane > >
Furface <furface@omnicode.com> wrote: > > Thanks Tom. You know I thought about this approach a little more. I > don't think there's a simple answer to this security problem short of > placing a proxy server application between the clients and the > database. The problem with giving database role accounts to each and > every user is that the users now have uncontrolled access to the > database. Ummm ... huh? PostgreSQL has a pretty nice security model that gives you a great deal of control over what users have access to: http://www.postgresql.org/docs/8.2/static/user-manag.html The only thing that's missing is row-level granularity. There's at least one project out there supporting that, and you can also simulate it with clever usage of stored procedures and the ability to run them with the permissions of the definer instead of the caller. -Bill
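A minimal sketch of the stored-procedure trick Bill mentions (the table and column names are hypothetical; note SESSION_USER rather than CURRENT_USER, because inside a SECURITY DEFINER function CURRENT_USER reports the function's owner):

CREATE FUNCTION my_timesheets() RETURNS SETOF timesheet AS $$
  SELECT * FROM timesheet WHERE username = SESSION_USER;
$$ LANGUAGE sql SECURITY DEFINER;

REVOKE ALL ON timesheet FROM PUBLIC;                  -- no direct table access
GRANT EXECUTE ON FUNCTION my_timesheets() TO PUBLIC;  -- access only via the function

A caller then does SELECT * FROM my_timesheets(); and sees only their own rows.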
OK, let me think. In my situation, I'm writing an accounting app. A typical situation would be that a standard user can update data in a timesheet while an administrator can approve the timesheet. If I gave the standard user access to the timesheet header table, they would also have access to an approval field, so I'd need something like "column" level security. I could put in a trigger where I check the permissions of someone attempting to set the approval column. Yes, I think that would probably work. I think you're correct. I can do pretty much what I want without developing a proxy server. That's what I think I'll do: my original plan of adding users with different roles. It still gives me the creeps, allowing lots of people direct access to my server. Thanks. Bill Moran wrote: > Furface <furface@omnicode.com> wrote: > >> Thanks Tom. You know I thought about this approach a little more. I >> don't think there's a simple answer to this security problem short of >> placing a proxy server application between the clients and the >> database. The problem with giving database role accounts to each and >> every user is that the users now have uncontrolled access to the >> database. >> > > Ummm ... huh? > > PostgreSQL has a pretty nice security model that gives you a great deal > of control over what users have access to: > http://www.postgresql.org/docs/8.2/static/user-manag.html > > The only thing that's missing is row-level granularity. There's at least > one project out there supporting that, and you can also simulate it with > clever usage of stored procedures and the ability to run them with the > permissions of the definer instead of the caller. > > -Bill > >
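A minimal sketch of that trigger idea (assuming a timesheet table with an approved column and an administrators group role called timesheet_admin; none of these names come from the thread):

CREATE FUNCTION guard_approval() RETURNS trigger AS $$
BEGIN
  IF NEW.approved IS DISTINCT FROM OLD.approved
     AND NOT pg_has_role(CURRENT_USER, 'timesheet_admin', 'member') THEN
    RAISE EXCEPTION 'only administrators may change approval status';
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER timesheet_approval_guard
  BEFORE UPDATE ON timesheet
  FOR EACH ROW EXECUTE PROCEDURE guard_approval();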
On Sun, Jan 28, 2007 at 01:21:09PM -0500, Bill Moran wrote: > The only thing that's missing is row-level granularity. There's at least > one project out there supporting that, and you can also simulate it with > clever usage of stored procedures and the ability to run them with the > permissions of the definer instead of the caller. You can also use rules to protect rows. E.g.

CREATE RULE atable__lock_user_insert AS
  ON INSERT TO atable
  WHERE CURRENT_USER != 'mysuper' AND new.username != CURRENT_USER
  DO INSTEAD NOTHING;

CREATE RULE atable__lock_user_update AS
  ON UPDATE TO atable
  WHERE CURRENT_USER != 'mysuper' AND old.username != CURRENT_USER
  DO INSTEAD NOTHING;

CREATE RULE atable__lock_user_delete AS
  ON DELETE TO atable
  WHERE CURRENT_USER != 'mysuper' AND old.username != CURRENT_USER
  DO INSTEAD NOTHING;

-- Ron Peterson https://www.yellowbank.com/
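Assuming roles alice and bob exist and have write access to atable (and a made-up note column), the effect is that tampering with someone else's rows silently does nothing:

SET SESSION AUTHORIZATION alice;  -- as a superuser, impersonate alice for testing
UPDATE atable SET note = 'x' WHERE username = 'bob';    -- UPDATE 0: rewritten to a no-op
UPDATE atable SET note = 'x' WHERE username = 'alice';  -- allowed

Note that rules like these only guard writes; to keep users from reading each other's rows you would still revoke SELECT on the table and hand out access through a view or a function.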
Tom Lane wrote:
>I think that the
>main bottleneck would be the "flat file" that's used to tell the
>postmaster about the set of valid users --- every time a user is
>added/dropped/changed, that file gets rewritten and then re-parsed
>by the postmaster. So you could eat a lot of overhead if you change
>users every few seconds or something like that.
I'm developing the same kind of application right now. We've been successful in programming per-user row-level security, and we're quite happy with it. Even if someone cracked our web server, he still could not do much with the database, which holds what really matters to us.
I've heard about the performance drawbacks, but never the exact reason for them.
What you describe, Tom (the flat file), sounds a bit strange to me. Aren't users stored in a table? (pg_catalog.pg_authid)
I guess those (system) tables are stored differently from normal ones?
Willy-Bas
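One common way to get that effect (a sketch only, assuming the table has a username column; not necessarily how Willy-Bas's project does it) is a view that filters on the connected user. The view is checked with the privileges of its owner, so users need no rights on the underlying table:

CREATE VIEW my_rows AS
  SELECT * FROM atable WHERE username = CURRENT_USER;

REVOKE ALL ON atable FROM PUBLIC;
GRANT SELECT ON my_rows TO PUBLIC;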
On second thought, you might be referring to pg_hba.conf?
That would not be applicable for a web service, since all users would connect through the same IP address.
"Willy-Bas Loos" <willybas@gmail.com> writes: > Tom Lane wrote: >> I think that the >> main bottleneck would be the "flat file" that's used to tell the >> postmaster about the set of valid users --- every time a user is >> added/dropped/changed, that file gets rewritten and then re-parsed >> by the postmaster. So you could eat a lot of overhead if you change >> users every few seconds or something like that. > What you describe Tom (flat file), sounds a bit strange to me. Aren't users > stored in a table? (pg_catalog.pg_authid) Yeah, but the postmaster can't read pg_authid, nor any other table, because it's not logically connected to the database. So any change to pg_authid gets copied to a "flat" ASCII-text file for the postmaster. There are a number of reasons for this arrangement, but they really boil down to keeping the postmaster at arm's length from the backends. If a backend goes nuts and scribbles all over shared memory before crashing, it may well manage to take down some other backends with it, but we don't want any risk that it'll take down the postmaster and thereby prevent automatic restart. Hence the postmaster is designed never to depend on valid contents in shared memory, and that means it can't do much of anything in the way of database access. This is also the reason that pg_hba.conf and friends are not tables as you might otherwise expect. regards, tom lane
In response to Tom Lane <tgl@sss.pgh.pa.us>: > "Willy-Bas Loos" <willybas@gmail.com> writes: > > Tom Lane wrote: > >> I think that the > >> main bottleneck would be the "flat file" that's used to tell the > >> postmaster about the set of valid users --- every time a user is > >> added/dropped/changed, that file gets rewritten and then re-parsed > >> by the postmaster. So you could eat a lot of overhead if you change > >> users every few seconds or something like that. > > > What you describe, Tom (the flat file), sounds a bit strange to me. Aren't users > > stored in a table? (pg_catalog.pg_authid) > > Yeah, but the postmaster can't read pg_authid, nor any other table, > because it's not logically connected to the database. So any change > to pg_authid gets copied to a "flat" ASCII-text file for the postmaster. Would using Kerberos or some other external auth mechanism work around this? -- Bill Moran Collaborative Fusion Inc.
Bill Moran <wmoran@collaborativefusion.com> writes: > In response to Tom Lane <tgl@sss.pgh.pa.us>: >> Yeah, but the postmaster can't read pg_authid, nor any other table, >> because it's not logically connected to the database. So any change >> to pg_authid gets copied to a "flat" ASCII-text file for the postmaster. > Would using kerberos or some other external auth mechanism work around this? Kerberos can't read the database directly either, so I'm not sure I see your point. regards, tom lane
In response to Tom Lane <tgl@sss.pgh.pa.us>: > Bill Moran <wmoran@collaborativefusion.com> writes: > > In response to Tom Lane <tgl@sss.pgh.pa.us>: > >> Yeah, but the postmaster can't read pg_authid, nor any other table, > >> because it's not logically connected to the database. So any change > >> to pg_authid gets copied to a "flat" ASCII-text file for the postmaster. > > > Would using Kerberos or some other external auth mechanism work around this? > > Kerberos can't read the database directly either, so I'm not sure I see > your point. It's possible that I'm misunderstanding. If there's a problem with having large numbers of users in Postgres because the postmaster has to use a flat file to store them, can one circumvent the issue by configuring Postgres to use Kerberos for auth instead of its internal mechanisms? Will this eliminate the need for the flat file? -- Bill Moran Collaborative Fusion Inc.
Bill Moran wrote: > In response to Tom Lane <tgl@sss.pgh.pa.us>: > > > Bill Moran <wmoran@collaborativefusion.com> writes: > > > In response to Tom Lane <tgl@sss.pgh.pa.us>: > > >> Yeah, but the postmaster can't read pg_authid, nor any other table, > > >> because it's not logically connected to the database. So any change > > >> to pg_authid gets copied to a "flat" ASCII-text file for the postmaster. > > > > > Would using Kerberos or some other external auth mechanism work around this? > > > > Kerberos can't read the database directly either, so I'm not sure I see > > your point. > > It's possible that I'm misunderstanding. > > If there's a problem with having large numbers of users in Postgres because > the postmaster has to use a flat file to store them, can one circumvent the > issue by configuring Postgres to use Kerberos for auth instead of its > internal mechanisms? Will this eliminate the need for the flat file? No, because Postgres needs to check that the user is present in the internal catalogs anyway. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
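In other words, an external mechanism like Kerberos only moves the password check out of the database; a matching role still has to exist, so the flat file stays. A sketch (hypothetical names; the pg_hba.conf line uses the 8.2-era krb5 method):

CREATE ROLE alice LOGIN;  -- no PASSWORD clause; Kerberos handles authentication

# pg_hba.conf: let Kerberos-authenticated users on the local network connect
host  all  all  192.168.0.0/24  krb5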
Tom Lane wrote: >> What you describe, Tom (the flat file), sounds a bit strange to me. >> Aren't users >> stored in a table? (pg_catalog.pg_authid) > > Yeah, but the postmaster can't read pg_authid, nor any other table, > because it's not logically connected to the database. So any change > to pg_authid gets copied to a "flat" ASCII-text file for the > postmaster. Why doesn't the postmaster read the db files directly, presumably using some of the same code the backends do, or is it too hard to bypass the shared memory layer? Another thing you folks must have considered would be to keep the out-of-memory copies of this kind of data in something faster than a flat file - say Berkeley DB. Do either of these things make sense? - John D. Burger MITRE
"John D. Burger" <john@mitre.org> writes: > Why doesn't the postmaster read the db files directly, presumably > using some of the same code the backends do, or is too hard to bypass > the shared memory layer? It's not "too hard", it's simply wrong. The copy on disk may be out of date due to not having been flushed from shared buffers yet. Moreover, without any locking you can't ensure you get a consistent view of the data. > Another thing you folks must have > considered would be to keep the out-of-memory copies of this kind of > data in something faster than a flat file - say Berkeley DB. Do > either of these things make sense? If I were going to do anything about this, I'd think about teaching the postmaster about some kind of incremental-update protocol instead of rereading the whole flat file every time. The issue with any such idea is that it pushes complexity, and therefore risk of bugs, into the postmaster which is exactly where we can't afford bugs. Given the lack of actual performance complaints from the field so far, I'm not inclined to do anything for now ... regards, tom lane
> The more I think about it, the more I think a proxy app is necessary. > It seems like a lot of work just for security issues, but basically most > web-based database apps use this model, with the web application acting > as a proxy between the database and the client. This is how I've seen it done on almost every application I've worked on. If you have multiple apps hitting a single DB, usually each application has its own role. But user-level security is controlled at the application level. Although I don't think there's anything *wrong* with having a role-per-user (it could provide an "extra" layer of security), I think it's much more flexible to define security in the application/business logic layer. That being said, we shouldn't get too wound up over this "limitation" of PostgreSQL until someone finds that there really is some real-world performance issue. AFAIK, everything in this thread is theoretical. Cheers, Jeremy Haile
John D. Burger wrote: > Tom Lane wrote: > > >>What you describe, Tom (the flat file), sounds a bit strange to me. > >>Aren't users > >>stored in a table? (pg_catalog.pg_authid) > > > >Yeah, but the postmaster can't read pg_authid, nor any other table, > >because it's not logically connected to the database. So any change > >to pg_authid gets copied to a "flat" ASCII-text file for the > >postmaster. > > Why doesn't the postmaster read the db files directly, presumably > using some of the same code the backends do, or is it too hard to bypass > the shared memory layer? We used to do that, and we're lucky that it was replaced by the current plaintext file mechanism, because it was ugly and very limited. And probably slow too. > Another thing you folks must have > considered would be to keep the out-of-memory copies of this kind of > data in something faster than a flat file - say Berkeley DB. Do > either of these things make sense? I don't know a lot about BerkeleyDB, but I'm sure that if plain files were proven to be a bottleneck we could hack some sort of indexed storage. Making BDB or something similar a hard requirement is certainly out of the question. On the other hand, writing such a thing would probably be more costly than writing a plaintext file ... -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
I had some ideas about this slow flat file issue, but it's apparently not yet much of an issue, in fact ... Someone talked about the postmaster having to be "at arm's length" from the actual tables. But I was looking at the postmaster code, and it seems to fork a new backend as soon as select indicates there's a new request, without checking authorization or anything first. So why can't the new backend query the user table as normal? Even if shared buffers are trashed, that'll only take down the new backend, not the original postmaster, right? Enlightenment welcome, or redirection to -developers if appropriate. - John Burger MITRE
"John D. Burger" <john@mitre.org> writes: > Someone talked about the postmaster having to be "at arms' length" > from the actual tables. But I was looking at the postmaster code, > and it seems to fork a new backend as soon as select indicates > there's a new request, without checking authorization or anything > first. So why can't the new backend query the user table as normal? > Even if shared buffers are trashed, that'll only take down the new > backend, not the original postmaster, right? Well, the current design is that the postmaster has a copy of the flat file in memory, which is inherited by backends across the fork(), and so they can query it at small cost. For the normal situation where the list of users isn't changing very rapidly, this is probably faster than your suggestion would be. The issue to make it work like you suggest is that there's not presently any way to query tables until *after* you have selected and connected to a particular database. So you'd still need a flat file representing pg_database, plus it's unclear how you'd implement some of pg_hba.conf's options like "samegroup". Since the tables you need to touch are all shared, it's conceivable that this could be hacked around, but it seems awfully messy. Another consideration is that this'd significantly increase the amount of work done before validating that the connection request is authorized, thus perhaps making it easier to mount a DOS attack on the postmaster (though admittedly a blackhat with access to the postmaster port can probably cause trouble anyway just from the "fork bomb" effect). regards, tom lane
Tom Lane wrote: > Since the tables you need to touch are all shared, it's conceivable > that > this could be hacked around, but it seems awfully messy. Another > consideration is that this'd significantly increase the amount of work > done before validating that the connection request is authorized, Yah, I thought the chicken-and-egg issue might be the nub of it - it sure feels more secure to avoid accessing any tables at all 'til you know what kind of access the user has. - John Burger MITRE