Re: 8.4 release planning - Mailing list pgsql-hackers

From Andrew Sullivan
Subject Re: 8.4 release planning
Msg-id 20090128142729.GA36064@shinkuro.com
In response to Re: 8.4 release planning  (KaiGai Kohei <kaigai@ak.jp.nec.com>)
List pgsql-hackers
On Wed, Jan 28, 2009 at 11:31:25AM +0900, KaiGai Kohei wrote:

> As I noted before, there is a symmetrical structure between
> OS and DBMS.

Well, you said that before.  I think your analogy is awful.  I don't
think the similarities are nearly as great as you think, and I also
think there are significant differences that _make_ the difference in
these cases.  In particular,

> In operating system, a process accesses objects (like file,
> socket, ...) managed by operating system via system calls.
> Its security system (filesystem permission, SELinux, ...)
> acquires the request and applies its access control rules.
>
> In DBMS, a client accesses database objects managed by DBMS
> via SQL queries. Its security system (Database ACL,
> SE-PostgreSQL, ...) acquires the request and applies its access
> control rules.

the difference here is that in the OS, a process accessing the object
has few to no guarantees about concurrency.  In an RDBMS, a very
significant reason even to use the DBMS is the ACID guarantees, which
make a number of claims about concurrency that simply aren't there in
most filesystems.  It's at exactly this architectural point that most
of the in-principle design questions have been aimed.  My personal
view is that those questions haven't really been answered very well,
but as I've already said I mostly stopped paying attention to this
work several months ago; so maybe I overlooked something.  I note that
Peter and Bruce seem to have been satisfied, so maybe they understood
something I don't (that's quite likely).
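
To make the contrast concrete, here is a minimal sketch of one such
guarantee (atomicity), using Python's bundled sqlite3 module rather
than PostgreSQL.  The table, the account names and the transfer()
helper are all hypothetical, invented only for illustration.  A
multi-step update either happens completely or not at all; two
separate write() calls against plain files would promise nothing of
the kind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic unit (hypothetical helper)."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Both updates have already run; raising undoes them together.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Hypothetical in-memory schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()
```

A failed transfer leaves both rows exactly as they were, with no
half-applied state for any other reader to stumble over.  An access
control model borrowed from the filesystem world has no vocabulary
for that kind of guarantee, which is the gap I am pointing at.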

> The most significant feature is centralized access control policy
> between OS and DBMS. 

Right, I get that; but all the discussion I've seen on this suggests
that, to get the benefit of the centralized access control, I trade
away certain well-understood assumptions of a relational environment,
but without much indication that I've done so.

> I talked here we should consider the value of information asset
> is independent from the way to store them.

Yes, I know that was your premise.  I am not entirely sure I agree
with it, is the thing.

> Needless to say, the value of information asset is decided by its
> contents. 

Nonsense.  The value of an information asset is determined only partly
by its contents.  I'd argue that the value of an information asset is
a function of its use-value.  If the information asset is completely
unusable, then it isn't worth anything at all.  

> If your credit card number is recorded on a paper,
> do you think it has lesser value than recorded on database?

Yes.  The database makes the credit card number available to other
applications, which can then use that data to charge the credit card
with other purchases.  For me, therefore, the piece of paper,
correctly handled, imposes less risk than the database; in addition,
the piece of paper offers a smaller advantage, because it cannot be
leveraged to make other interactions more convenient.  Finally, the
piece of paper offers a different kind of risk, because if it is
mishandled and then becomes the basis on which the number ends up in a
database, I have a new problem for which I was not prepared.

I believe my fundamental objection was that, as far as I was able to
tell, SELinux simply didn't have anything useful to say about
concurrent actions on data under SE controls; that's because it was
aimed at a fairly primitive database (a filesystem) without the rich
concurrency support of RDBMS.  I still don't see anywhere in your
discussions an extension of the SELinux model to account for that
concurrency richness, so I think there's something wrong with the
principles from which you're starting.  I'm totally prepared to admit
I've missed something, however.  Also, since this isn't really my
problem any more, I'm unlikely to spend much time reading more design
notes or anything of the sort.  

Finally,

> It finally enables to apply centralized access control policy on
> whole of application stack.
> Please note that 95% of attacks in 2008 targeted to web system,
> so it gives a nightmare for security folks.

this argument gets to the heart of what you seem to want, which is a
centralized system that guarantees the controls you want.  I'm
dubious that such centralization is actually the benefit that
its proponents seem to think it is; but if it is, then the centralized
system needs to be exactly as rich as the richest system under
control.  By starting with SELinux, I argue that the approach starts
with a too-poor model.  (See above.)  

More fundamentally, the premise that the database is just a part of an
"application stack" is, in my view, exactly _why_ these systems are so
vulnerable to attack.  Database management systems are not designed to
be dumb storage for a single application, and they're actually very
poorly adapted to such a role.  My impression is that SE-PostgreSQL is an
attempt to finally force the database system under such controls, as
though it were a glorified filesystem.  I have no idea whether it will
work; but to my way of thinking, it's a mindset foreign to the
principles of RDBMS design.  That could be why some of us react
to the proposal with perplexed looks.

A

-- 
Andrew Sullivan
ajs@crankycanuck.ca

