Re: Updates of SE-PostgreSQL 8.4devel patches (r1197) - Mailing list pgsql-hackers

On Fri, 2008-11-07 at 18:20 +0900, KaiGai Kohei wrote:
> I updated the patch set of SE-PostgreSQL (revision 1197).
> 
> [1/6] http://sepgsql.googlecode.com/files/sepostgresql-sepgsql-8.4devel-3-r1197.patch
> [2/6] http://sepgsql.googlecode.com/files/sepostgresql-pg_dump-8.4devel-3-r1197.patch
> [3/6] http://sepgsql.googlecode.com/files/sepostgresql-policy-8.4devel-3-r1197.patch
> [4/6] http://sepgsql.googlecode.com/files/sepostgresql-docs-8.4devel-3-r1197.patch
> [5/6] http://sepgsql.googlecode.com/files/sepostgresql-tests-8.4devel-3-r1197.patch
> [6/6] http://sepgsql.googlecode.com/files/sepostgresql-row_acl-8.4devel-3-r1197.patch
> 
> List of updates:
> - Patches are rebased to the latest CVS HEAD.
> - The "NSA SELinux" comments are replaced by the simple "SELinux".
> - bugfix: system attributes are preserved prior to invocation of
>   before-row-update-triggers.
> - Add a documentation of row-level database acl, within 6th patch.
>   Most of patch size increments come from the new documentation.
> 
> Draft of the SE-PostgreSQL documentation is here:
>   http://wiki.postgresql.org/wiki/SEPostgreSQL
> 
> By the way, now a week has gone since beginning of the final CommitFest,
> however, the reviewers field on the wiki is empty.
> Is there anyone who can help to review the patches?


Some initial thoughts based upon reading the Wiki. I've not been
involved in things up to now, so if this dredges up old discussions,
well, these are my thoughts.

I want SEPostgreSQL, but I'd like it to work without needing to be a
compile time option so people can just use it as/when needed. Plus we
don't want to support what would be/is essentially a fork of Postgres.
Most CPUs will optimise away a simple if-test in various places.

Some users will be able to take advantage of the facilities without
implementing full MLS. Yet we want the full facilities for Government.
Many people currently run multiple customers in different schemas,
though this would let them just run a single set of tables so is much
better for running many small customers.

I'm very unhappy about putting a nonoptional value on tuple headers,
especially because it is updatable. Do we expect MVCC will work with
updatable security contexts? i.e. when you update the security context
of a tuple that won't effect its visibility to existing system users. I
can't imagine you'd want that would you? It's kind of difficult to *not*
get it though.

Looks to me that this feature is useless without things working at row
level. So we can't leave that part out.

The security context on each row could be an optional column present
only if HEAP_HASSECURITYCONTEXT is set (0x0010 see htup.h), just like
OIDs. Use a specific datatype rather than TEXT. That datatype could be
an identifier to pg_security. Security people have big databases too, so
we need to compress the security context more and take out parse time of
string handling. Don't think we should use Oids, they're too big. Might
be easier to use a 2byte field and restrict access to 32,000 contexts,
which is easily enough. TEXT also makes me nervous, just in case there
is some collation/encoding weirdness that allows contexts to be
subverted. Fixed integers are hard to compromise in that respect.

How will unique indexes work? Do you implicitly add security context as
last column on every unique index, or does the uniqueness violation only
occurs within security contexts, or does the uniqueness violation tested
against all contextx that the inserter can currently see? Is there a
change to system catalogs? 

Foreign Key deletions could be handled correctly if you treat them as
updates. If we have the following example

TableA
security_context=y value=2 fk=1

TableB
security_context=x value=1

TableA refers to TableB. Context x cannot see context y.

So if somebody with context x tries to delete value1 from TableB, they
will be refused because of a row they cannot see. In this case the
correct action is to update the tuple in TableB so it now has a
security_context = y. The user with x cannot see it and can be persuaded
he deleted it, while the user with y can still see it.

The section on LOAD doesn't sound very convincing. Loading a module
could immediately subvert security. We could probably tighten up on that
for general use as well. ISTM we need something like a new catalog table
for loadable modules, so we can give them specific security contexts and
potentially store some kind of verification information about them.
Having a single "can load modules" isn't enough with Postgres, since it
would effectively prevent us from loading *any* modules in a secure
database. Which kinda removes much of the benefit of using Postgres.

Is there an issue with relation cache and catalog caches? ISTM that you
should put a security context onto each shared invalidation message, so
that backends know not to maintain caches for data they aren't allowed
to see. Probably overkill, just thinking.

The interaction with SELinux should not be hardcoded. I think we need
some form of plugin/wrapper to allow other systems to work. Not sure
what those are, but this isn't a Linux only project. We want to give
everybody the ability to work with PostgreSQL.

How does Discretionary Access control work with regard to server logs?
You said there was no superuser access, but I don't see any controls for
the logs. Do we need log_max_security_context?

Trusted procedures seem very similar to SECURITY DEFINER functions. Can
you explain what the differences are? I'm sure we don't want to similar
features.

I don't see any reason for the "=" in most of the new DDL syntax. Just
seems superfluous and out of character with normal SQL DDL. 

With DDL we already have table options, so why not include
security_context as an option? If we are adding this to databases as
well, it seems a good time to include a generic database-level options
facility and make "security_context" the first option.

The parameter to enable this facility should be something likeenhanced_security = on

My feeling is that if you want to include these features in core
PostgreSQL you won't be able to maintain the "separate branding" of
SEPostgreSQL, logo etc.. Maybe I'll get used to it. But if we are going
to use it, we should say SEPostgresSQL, not SE-PostgreSQL if we are
saying SELinux.

It would make sense for you to look at the work on replication also. Not
much use having SEPostgreSQL if we can't replicate it securely. And
possibly in-place update so that respects label security.

I've not looked at the code and probably won't have time to do that. But
if I did, I'd say "diff -c". 

Good, thorough work.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



pgsql-hackers by date:

Previous
From: Ron Mayer
Date:
Subject: Re: Patch for ISO-8601-Interval Input and output.
Next
From: Alvaro Herrera
Date:
Subject: restore PD_PAGE_FULL on WAL update replay