Thread: function body actors (was: [PERFORM] viewing source code)

function body actors (was: [PERFORM] viewing source code)

From

"Merlin Moncure"

Date:

21 December 2007, 01:59:32

On Dec 20, 2007 6:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Merlin Moncure" <mmoncure@gmail.com> writes:
> > I don't really agree that wrapping pl/pgsql with encryptor/decryptor
> > is a bad idea.
>
> So if you want something other than endless arguments to happen,
> come up with a nice key-management design for encrypted function
> bodies.

Maybe a key management solution isn't required.  If, instead of
strictly wrapping a language with an encryption layer, we provide
hooks (actors) that have the ability to operate on the function body
when it arrives and leaves pg_proc, we may sidestep the key problem
(leaving it to the user) and open up the doors to new functionality at
the same time.

The actor is basically a callback taking the function source code (as
text) and returning text for storage in pg_proc.  Perhaps some other
housekeeping variables such as function name, etc. are passed to the
actor as parameters as well.  The actor operates on the function body
going into pg_proc (input actors) and going out (output actors).  In
either case, the function 'body' is modified if necessary, and may
raise an error.

The validator can be considered an actor that doesn't modify the body.
Ideally, the actors can be written in any pl language.  Naturally,
dealing with actors is for the superuser.  So, I'm suggesting to
extend the validator concept, opening it up to the user, giving it
more power, and the ability to operate in both directions.  The actor
will feel a lot like a trigger function.
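
A minimal sketch of the actor idea, modelled in Python rather than in any real PostgreSQL API. The `Language` class, its `pg_proc` dictionary, and the two example actors are all hypothetical stand-ins; the only things taken from the proposal are that actors map text to text, run on the way in and on the way out, and may raise an error.

```python
from typing import Callable, Dict, List

# An actor takes (function name, body text) and returns body text.
Actor = Callable[[str, str], str]


class Language:
    """Toy stand-in for a procedural language with actors attached."""

    def __init__(self) -> None:
        self.input_actors: List[Actor] = []
        self.output_actors: List[Actor] = []
        self.pg_proc: Dict[str, str] = {}  # stand-in for the catalog

    def create_function(self, name: str, body: str) -> None:
        # Input actors run in registration order before storage.
        for actor in self.input_actors:
            body = actor(name, body)
        self.pg_proc[name] = body

    def fetch_body(self, name: str) -> str:
        # Output actors run on the way out of the catalog.
        body = self.pg_proc[name]
        for actor in self.output_actors:
            body = actor(name, body)
        return body


def reject_empty(name: str, body: str) -> str:
    # A validator is just an actor that returns the body unchanged.
    if not body.strip():
        raise ValueError(f"function {name} has an empty body")
    return body


def reverse(name: str, body: str) -> str:
    # Trivial reversible transform, standing in for encrypt/decrypt.
    return body[::-1]


lang = Language()
lang.input_actors += [reject_empty, reverse]
lang.output_actors += [reverse]
lang.create_function("f", "begin return 1; end;")
```

Here the stored text differs from what was submitted, while `fetch_body` hands back the original; an actor that raises aborts the create.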

Now, everything is left to the user...by adding an 'encryption' actor
to the language (trivial with pgcrypto), the user can broadly encrypt
in a manner of their choosing.  A clever user might write an actor to
encrypt a subset of functions in a language, or register the same
language twice with different actors.  Since the actor can call out to
other functions, we don't limit to a particular key management
strategy.

Another nice thing is we may solve a problem that's been bothering me
for years, namely that 'CREATE FUNCTION' takes a string literal and
not a string returning expression.  This is pretty limiting...there
are a broad range of reasons why I might want to modify the code
before it hits pg_proc.  For example, with an actor I can now feed the
data into the C preprocessor without giving up the ability of pasting
the function body directly into psql.
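
As a rough illustration of a preprocessing input actor, here is a Python sketch that uses `string.Template` as a stand-in for the C preprocessor. The macro table and the `preprocess_actor` name are assumptions made up for the example.

```python
from string import Template

# Hypothetical macro table; in the example above the C preprocessor
# would play this role.
MACROS = {"SCHEMA": "app", "MAX_ROWS": "100"}


def preprocess_actor(name: str, body: str) -> str:
    # Expand ${NAME} placeholders before the body is stored.
    # substitute() raises KeyError on an unknown macro, which would
    # surface as an error from CREATE FUNCTION.
    return Template(body).substitute(MACROS)
```

One wrinkle with this particular stand-in: `Template` treats `$` specially, so a body that uses `$1`-style parameters would have to write them as `$$1`.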

This isn't a fully developed idea, and I'm glossing over several areas
(for example, syntax to modify actors), and I'm not sure if it's a
good idea in principle...I might be missing an obvious reason why this
won't work.  OTOH, it seems like a really neat way to introduce
encryption.

comments? is it worth going down this road?

merlin

Re: function body actors (was: [PERFORM] viewing source code)

From

Tom Lane

Date:

21 December 2007, 01:59:52

"Merlin Moncure" <mmoncure@gmail.com> writes:
> On Dec 20, 2007 6:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So if you want something other than endless arguments to happen,
>> come up with a nice key-management design for encrypted function
>> bodies.

> Maybe a key management solution isn't required.  If, instead of
> strictly wrapping a language with an encryption layer, we provide
> hooks (actors) that have the ability to operate on the function body
> when it arrives and leaves pg_proc, we may sidestep the key problem
> (leaving it to the user) and open up the doors to new functionality at
> the same time.

I think you're focusing on mechanism and ignoring the question of
whether there is a useful policy for it to implement.  Andrew Sullivan
argued upthread that we cannot get anywhere with both keys and encrypted
function bodies stored in the same database (I hope that's an adequate
summary of his point).  I'm not convinced that he's right, but that has
to be the first issue we think about.  The whole thing is a dead end if
there's no way to do meaningful encryption --- punting an insoluble
problem to the user doesn't make it better.

(This is not to say that you don't have a cute idea there, only that
it's not a license to take our eyes off the ball.)

            regards, tom lane

Re: function body actors (was: [PERFORM] viewing source code)

From

"Merlin Moncure"

Date:

21 December 2007, 03:06:26

On Dec 21, 2007 12:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Merlin Moncure" <mmoncure@gmail.com> writes:
> > On Dec 20, 2007 6:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> So if you want something other than endless arguments to happen,
> >> come up with a nice key-management design for encrypted function
> >> bodies.
>
> > Maybe a key management solution isn't required.  If, instead of
> > strictly wrapping a language with an encryption layer, we provide
> > hooks (actors) that have the ability to operate on the function body
> > when it arrives and leaves pg_proc, we may sidestep the key problem
> > (leaving it to the user) and open up the doors to new functionality at
> > the same time.
>
> I think you're focusing on mechanism and ignoring the question of
> whether there is a useful policy for it to implement.  Andrew Sullivan
> argued upthread that we cannot get anywhere with both keys and encrypted
> function bodies stored in the same database (I hope that's an adequate
> summary of his point).  I'm not convinced that he's right, but that has
> to be the first issue we think about.  The whole thing is a dead end if
> there's no way to do meaningful encryption --- punting an insoluble
> problem to the user doesn't make it better.

Well, there is no 'one size fits all' policy. I'm still holding out
that we don't need any specific designs for this...simply offering the
example in the docs might get people started (just thinking out loud
here):

create function encrypt_proc(proname text, prosrc_in text,
                             out prosrc_out text) returns text as
$$
  declare
    key bytea;
  begin
    -- key could be a literal, a field from a private table or temp
    -- table, or come from a 3rd party.  A literal is dangerous, since
    -- it's visible until 'create or replace'd, but that's maybe ok,
    -- depending.
    -- (get_key() and magic_string are placeholders, not defined here)
    key := get_key();
    -- magic string prevents attempting to decrypt non-encrypted
    -- functions
    prosrc_out := magic_string
      || encode(encrypt(convert_to(prosrc_in, 'UTF8'), key, 'bf'), 'hex');
  end;
$$ language plpgsql;

-- ordering of actors is significant...need to think about that
alter language plpgsql add actor 'encrypt_proc' on input;
alter language plpgsql add actor 'decrypt_proc' on output;
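
The magic-string round trip can be sketched in Python as follows. The `MAGIC` value and the toy keystream are assumptions for illustration only; the XOR construction is a placeholder for a real cipher such as pgcrypto's and is not secure as written.

```python
import hashlib

MAGIC = "ENC1:"  # hypothetical marker, plays the role of magic_string


def _keystream_xor(data: bytes, key: bytes) -> bytes:
    # Toy SHA-256 counter-mode XOR, a placeholder for a real cipher.
    # NOT secure as written (no nonce, no authentication).
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + (i // 32).to_bytes(8, "big")).digest()
        chunk = data[i:i + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def encrypt_proc(prosrc: str, key: bytes) -> str:
    # Input actor: prefix the hex-encoded ciphertext with the marker.
    return MAGIC + _keystream_xor(prosrc.encode("utf-8"), key).hex()


def decrypt_proc(stored: str, key: bytes) -> str:
    # Output actor: bodies without the marker pass through untouched.
    if not stored.startswith(MAGIC):
        return stored
    raw = bytes.fromhex(stored[len(MAGIC):])
    return _keystream_xor(raw, key).decode("utf-8")
```

The marker check is what lets encrypted and plain functions coexist in the same language without a catalog change.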

If that's not enough, then you have to build something more structured,
thinking about who provides the key and how the database asks for it.
The user would have to seed the session somehow (maybe, stored in a
temp table?) with a secret value which would be translated into the
key directly on the database or by a 3rd party over a secure channel.
The structured approach doesn't appeal to me much though...

The temp table idea might not be so hot, since it's trivial for the
database admin to see data from other users' temp tables, and maybe we
don't want that in some cases.  Need to think about this some more...

merlin

Re: function body actors (was: [PERFORM] viewing source code)

From

"Pavel Stehule"

Date:

21 December 2007, 04:18:27

I have a similar patch and it works. There are two issues:

* we are missing a column in pg_proc about state (not all procedures are
obfuscated); I solved it for plpgsql by using probin.
* decrypt is expensive at the language handler level. Every session has
to do it again and again; better to decrypt in the system cache or
somewhere there.
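
To make the caching point concrete, here is a small Python sketch of memoizing decrypted bodies. `BodyCache` and its `misses` counter are hypothetical; nothing here reflects how PostgreSQL's system caches actually work, and the plaintext stays in memory for the life of the cache, which is a trade-off in its own right.

```python
import hashlib
from typing import Callable, Dict, Tuple


class BodyCache:
    """Per-process cache of decrypted bodies, keyed by the stored text."""

    def __init__(self, decrypt: Callable[[str], str]) -> None:
        self._decrypt = decrypt
        self._cache: Dict[Tuple[str, str], str] = {}
        self.misses = 0

    def get(self, name: str, stored: str) -> str:
        # Key on a digest of the stored text so CREATE OR REPLACE
        # (which changes the stored text) misses the old entry.
        key = (name, hashlib.sha256(stored.encode("utf-8")).hexdigest())
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._decrypt(stored)
        return self._cache[key]
```

Stale entries are never evicted in this sketch; a real cache would need an invalidation story.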

Regards
Pavel Stehule

Re: function body actors (was: [PERFORM] viewing source code)

From

"Merlin Moncure"

Date:

21 December 2007, 10:13:46

On Dec 21, 2007 3:18 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I have a similar patch and it works. There are two issues:
>
> * we are missing a column in pg_proc about state (not all procedures are
> obfuscated); I solved it for plpgsql by using probin.

I was hoping to avoid making any catalog or other changes to support
encryption specifically.  Maybe your patch stands on its own
merits...I missed the original discussion.  Do you think the code you
wrote can be adapted to do other things besides encryption?

> * decrypt is expensive at the language handler level. Every session has
> to do it again and again; better to decrypt in the system cache or
> somewhere there.

Doesn't bother me in the least...and caching unencrypted data is
scary.  Also, aes256 is pretty fast for what it gives you and function
bodies are normally short.  The real issue as I see it is where to
keep the key.  How did you handle that?

merlin

Re: function body actors (was: [PERFORM] viewing source code)

From

"Pavel Stehule"

Date:

21 December 2007, 10:40:08

On 21/12/2007, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Dec 21, 2007 3:18 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> > I have a similar patch and it works. There are two issues:
> >
> > * we are missing a column in pg_proc about state (not all procedures are
> > obfuscated); I solved it for plpgsql by using probin.
>
> I was hoping to avoid making any catalog or other changes to support
> encryption specifically.  Maybe your patch stands on its own
> merits...I missed the original discussion.  Do you think the code you
> wrote can be adapted to do other things besides encryption?
>

I don't know. It was a fast hack that just works. It had to do
obfuscation, and it does it well.

> > * decrypt is expensive at the language handler level. Every session has
> > to do it again and again; better to decrypt in the system cache or
> > somewhere there.
>
> Doesn't bother me in the least...and caching unencrypted data is
> scary.  Also, aes256 is pretty fast for what it gives you and function
> bodies are normally short.  The real issue as I see it is where to
> keep the key.  How did you handle that?
>
> merlin
>

Simply. I use some random plpgsql message text as the password and
compile it in. I thought about a GUC, and about storing the password in
postgresql.conf. The protection level is the same. We cannot protect
code 100%. If you have an admin or superuser account and you know
some internals, you can simply get the code.

http://blog.pgsql.cz/index.php?/archives/10-Obfuscator-PLpgSQL-procedur.html#extended

Sorry, the description is in Czech.

Pavel

Re: function body actors (was: [PERFORM] viewing source code)

From

Tom Lane

Date:

21 December 2007, 12:19:24

"Pavel Stehule" <pavel.stehule@gmail.com> writes:
> On 21/12/2007, Merlin Moncure <mmoncure@gmail.com> wrote:
>> ... The real issue as I see it is where to
>> keep the key.  How did you handle that?

> Simply. I use some random plpgsql message text as the password and
> compile it in. I thought about a GUC, and about storing the password in
> postgresql.conf. The protection level is the same. We cannot protect
> code 100%. If you have an admin or superuser account and you know
> some internals, you can simply get the code.

Yeah.  There is no defense against someone who is prepared to go in
there with a debugger and pull the post-decryption code out of memory.
So what we need to think about is what sorts of threats we *can* or
should defend against.  A couple of goals that seem like they might
be reasonable are:

* Even a superuser can't get the code at the SQL level, ie, it's
secure if you rule out debugger-level attacks.  (For example, this
might prevent someone who had remotely breached the superuser account
from getting the code.)

* Code not available if you just look at what's on-disk, ie, you can't
get it by stealing a backup tape.

Any other threats we could consider defending against?

BTW, this thread definitely doesn't belong on -performance anymore.
        regards, tom lane

Re: function body actors (was: [PERFORM] viewing source code)

From

Andrew Sullivan

Date:

21 December 2007, 12:24:53

On Fri, Dec 21, 2007 at 12:09:28AM -0500, Merlin Moncure wrote:
> Maybe a key management solution isn't required.  If, instead of
> strictly wrapping a language with an encryption layer, we provide
> hooks (actors) that have the ability to operate on the function body
> when it arrives and leaves pg_proc, we may sidestep the key problem
> (leaving it to the user) and open up the doors to new functionality at
> the same time.

I like this idea much better, because the same basic mechanism can be used
for more than one thing, and it doesn't build in a system that is
fundamentally weak.  Of course, you _can_ build a weak system this way, but
there's an important difference between building a fundamentally weak system
and making weak systems possible.

A

Re: function body actors (was: [PERFORM] viewing source code)

From

Tom Lane

Date:

21 December 2007, 12:48:07

Andrew Sullivan <ajs@crankycanuck.ca> writes:
> On Fri, Dec 21, 2007 at 12:09:28AM -0500, Merlin Moncure wrote:
>> Maybe a key management solution isn't required.

> I like this idea much better, because the same basic mechanism can be used
> for more than one thing, and it doesn't build in a system that is
> fundamentally weak.  Of course, you _can_ build a weak system this way, but
> there's an important difference between building a fundamentally weak system
> and making weak systems possible.

I find myself unconvinced by this argument.  The main problem is: how
do we know that it's possible to build a strong system atop this
mechanism?  Just leaving it to non-security-savvy users seems to me
to be a great way to guarantee a lot of weak systems in the field.
ISTM our minimum responsibility would be to design and document how
to build a strong protection system using the feature ... and at that
point why not build it in?

I've certainly got no objection to making a mechanism that can be used
for more than one purpose; but not offering a complete security solution
is abdicating our responsibility.
        regards, tom lane

Re: function body actors (was: [PERFORM] viewing source code)

From

Andrew Sullivan

Date:

21 December 2007, 12:48:43

On Fri, Dec 21, 2007 at 12:40:05AM -0500, Tom Lane wrote:

> whether there is a useful policy for it to implement.  Andrew Sullivan
> argued upthread that we cannot get anywhere with both keys and encrypted
> function bodies stored in the same database (I hope that's an adequate
> summary of his point).

It is.  I'm not a security expert, but I've been spending some time
listening to some of them lately.  The fundamental problem with a system
that stores the keys online in the same repository is not just its potential
for compromise, but its brittle failure mode: once the key is recovered,
you're hosed.  And there's no outside check of key validity, which means
attackers have a nicely-contained target to hit.

> I'm not convinced that he's right, but that has to be the first issue we
> think about.  The whole thing is a dead end if there's no way to do
> meaningful encryption --- punting an insoluble problem to the user doesn't
> make it better.

Well, one thing you could do with the proposal is build a PKCS#11 actor,
that could talk to an HSM.  Not everyone needs HSMs, of course, but they do
make online key storage much less risky (because correctly designed ones
make key recovery practically impossible).  So the mechanism can be made
effectively secure even for very strong cryptographic uses.

Weaker cases might use a two-level key approach, with a "data-signing key"
online all the time to do the basic encryption and validation, but a
key-signing key that is always offline or otherwise unavailable from within
the system.  The key signing key only authenticates (and doesn't encrypt)
the data signing key.  You could use a different actor for this, to provide
an interface to one-way functions or something.  This gives you a way to
revoke a data-signing key.  You couldn't protect already compromised data
this way, but at least you could prevent new disclosures.
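
A hand-wavy Python sketch of the two-level scheme, with loud caveats: HMAC is used only as a stand-in for a signature. Because HMAC is symmetric, this check needs the key-signing key itself; a real design would presumably use an asymmetric signature so that the online side holds only a public key. The function names and the revocation set are hypothetical.

```python
import hashlib
import hmac
from typing import Set


def sign_data_key(ksk: bytes, data_key: bytes) -> bytes:
    # Done offline: the key-signing key only authenticates the
    # data key, it never encrypts anything.
    return hmac.new(ksk, data_key, hashlib.sha256).digest()


def data_key_is_valid(ksk: bytes, data_key: bytes, tag: bytes,
                      revoked: Set[bytes]) -> bool:
    # A revoked data key is refused even if its tag is genuine.
    if hashlib.sha256(data_key).digest() in revoked:
        return False
    return hmac.compare_digest(sign_data_key(ksk, data_key), tag)
```

Revoking a data key here means adding its digest to `revoked`; bodies already disclosed under that key stay disclosed, but nothing new is accepted under it.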

Yes, I'm being hand-wavy now, but I can at least see how these different
approaches are possible under the suggestion, so it seems like a possibly
fruitful avenue to explore.  The more I think about it, actually, the more I
like it.

A

Re: function body actors (was: [PERFORM] viewing source code)

From

"Merlin Moncure"

Date:

21 December 2007, 13:49:04

On Dec 21, 2007 11:48 AM, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
> On Fri, Dec 21, 2007 at 12:40:05AM -0500, Tom Lane wrote:
>
> > whether there is a useful policy for it to implement.  Andrew Sullivan
> > argued upthread that we cannot get anywhere with both keys and encrypted
> > function bodies stored in the same database (I hope that's an adequate
> > summary of his point).
>
> It is.  I'm not a security expert, but I've been spending some time
> listening to some of them lately.  The fundamental problem with a system
> that stores the keys online in the same repository is not just its potential
> for compromise, but its brittle failure mode: once the key is recovered,
> you're hosed.  And there's no outside check of key validity, which means
> attackers have a nicely-contained target to hit.
>
> > I'm not convinced that he's right, but that has to be the first issue we
> > think about.  The whole thing is a dead end if there's no way to do
> > meaningful encryption --- punting an insoluble problem to the user doesn't
> > make it better.
>
> Well, one thing you could do with the proposal is build a PKCS#11 actor,
> that could talk to an HSM.  Not everyone needs HSMs, of course, but they do
> make online key storage much less risky (because correctly designed ones
> make key recovery practically impossible).  So the mechanism can be made
> effectively secure even for very strong cryptographic uses.

ISTM the main issue is how exactly the authenticated user interacts
with the actor to give it the information it needs to get the real
key.  This is significant because we don't want to be boxed into an
actor implementation that doesn't allow that interaction.  If simply
calling out via a function is enough (which, to be perfectly honest, I
don't know), then we can implement the actor system and let actor
implementations spring to life in contrib, pgfoundry, etc. as the
community presents them.

merlin

Re: function body actors (was: [PERFORM] viewing source code)

From

Tom Lane

Date:

21 December 2007, 14:57:51

"Merlin Moncure" <mmoncure@gmail.com> writes:
> ISTM the main issue is how exactly the authenticated user interacts
> with the actor to give it the information it needs to get the real
> key.  This is significant because we don't want to be boxed into an
> actor implementation that doesn't allow that interaction.

We don't?  What purpose would such a setup serve?  I would think
that for the applications we have in mind, the *last* thing you
want is for the end user to hold the key.  The whole point of this
is to keep him from seeing the function source code, remember?

Andrew's suggestion of an outside-the-database key server is
apropos, but I think it would end up being a situation where
the key server is under the control of whoever wrote the function
and wants to guard it against the end user.  The key server would
want some kind of authentication token but I think that could
perfectly well be an ID for the database server, not the individual
end user.  There's no need for anything as awkward as an interactive
sign-on, AFAICS.
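
A sketch of that arrangement in Python, under the assumption that the database server presents a pre-shared token. The `KeyServer` class, the token scheme, and all names are hypothetical; no transport or protocol is implied.

```python
import hashlib
import hmac
from typing import Dict


class KeyServer:
    """Toy outside-the-database key server run by the function's author."""

    def __init__(self) -> None:
        self._token_digests: Dict[str, bytes] = {}  # server id -> sha256(token)
        self._keys: Dict[str, bytes] = {}           # function name -> key

    def register_server(self, server_id: str, token: bytes) -> None:
        self._token_digests[server_id] = hashlib.sha256(token).digest()

    def add_key(self, function_name: str, key: bytes) -> None:
        self._keys[function_name] = key

    def get_key(self, server_id: str, token: bytes,
                function_name: str) -> bytes:
        expected = self._token_digests.get(server_id)
        presented = hashlib.sha256(token).digest()
        # Constant-time comparison of the presented token's digest.
        if expected is None or not hmac.compare_digest(expected, presented):
            raise PermissionError("unknown or unauthenticated database server")
        return self._keys[function_name]
```

The end user never appears in the exchange: the identity being checked is the database server's.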
        regards, tom lane

Re: function body actors (was: [PERFORM] viewing source code)

From

Andrew Sullivan

Date:

21 December 2007, 16:56:53

On Fri, Dec 21, 2007 at 01:57:44PM -0500, Tom Lane wrote:
> "Merlin Moncure" <mmoncure@gmail.com> writes:
> > ISTM the main issue is how exactly the authenticated user interacts
> > with the actor to give it the information it needs to get the real
> > key.  This is significant because we don't want to be boxed into an
> > actor implementation that doesn't allow that interaction.
> 
> We don't?  What purpose would such a setup serve?  I would think
> that for the applications we have in mind, the *last* thing you
> want is for the end user to hold the key.  The whole point of this
> is to keep him from seeing the function source code, remember?

Hmm; this may be exactly part of the problem, though.  It seems there are
two possible cases in play:

1.    Protect the content in the database (in this case, function bodies)
from _all_ users on a given server.  This is a case where you want to
protect (say) your function body from your users, because you have a
closed-source application.  

2.    Protect the content of a field from _some_ users on a given system,
based on the permissions they hold.  This is roughly analogous to others not
being able to look in the table I created, because I haven't GRANTed them
permission.

(2) is really a case for column-level access controls, I guess.  But if
we're trying to solve this problem too, then user passwords or something
make sense.
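
Case (2) could be modelled as an output actor that consults a grant table, roughly like this Python sketch. The `BODY_READERS` table, the role names, and `visible_body` are invented for illustration.

```python
from typing import Dict, Set

# Hypothetical grants: function name -> roles allowed to read the body.
BODY_READERS: Dict[str, Set[str]] = {"payroll_calc": {"hr_admin"}}


def visible_body(function_name: str, body: str, role: str) -> str:
    # Functions with no entry are readable by everyone, as today.
    readers = BODY_READERS.get(function_name)
    if readers is not None and role not in readers:
        raise PermissionError(f"{role} may not view {function_name}")
    return body
```

No encryption is involved here, which is why this case looks more like access control than like key management.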

A

Re: function body actors (was: [PERFORM] viewing source code)

From

Tom Lane

Date:

21 December 2007, 17:19:58

Andrew Sullivan <ajs@crankycanuck.ca> writes:
> Hmm; this may be exactly part of the problem, though.  It seems there are
> two possible cases in play:

> 1.    Protect the content in the database (in this case, function bodies)
> from _all_ users on a given server.  This is a case where you want to
> protect (say) your function body from your users, because you have a
> closed-source application.  

> 2.    Protect the content of a field from _some_ users on a given system,
> based on the permissions they hold.  This is roughly analogous to others not
> being able to look in the table I created, because I haven't GRANTed them
> permission.

I would argue that (2) is reasonably well served today by setting up
separate databases for separate users.  The people who are complaining
seem to want to send out a set of functions into a hostile environment,
which is surely case (1).
        regards, tom lane

Re: function body actors (was: [PERFORM] viewing source code)

From

Andrew Sullivan

Date:

21 December 2007, 17:47:55

On Fri, Dec 21, 2007 at 04:19:51PM -0500, Tom Lane wrote:
> > 2.    Protect the content of a field from _some_ users on a given system,
> 
> I would argue that (2) is reasonably well served today by setting up
> separate databases for separate users. 

I thought actually this was one of the use-cases we were hearing.  Different
people using the same database (because the same data), with rules about the
different staff being able to see this or that function body.  I can easily
imagine such a case, for instance, in a large organization with different
departments and different responsibilities.  It seems a shame that the only
answer we have there is, "Give them different databases."  

I actually think organizations that consider keeping function bodies secret
like this to be a good idea are organizations that will eventually make
really stupid mistakes.  But that doesn't mean they're not under the legal
requirement to do this.  For instance, my current employer has
(externally-mandated) organizational conflict of interest rules that require
all disclosure to be done exclusively as "need to know".  Under the right
(!) legal guidance, such a requirement could easily lead to rules about
function-body disclosure.  From my point of view, such a use case is way
more compelling than function-body encryption (although I understand that
one too).

A

Re: function body actors (was: [PERFORM] viewing source code)

From

"Joshua D. Drake"

Date:

21 December 2007, 18:05:24

On Fri, 21 Dec 2007 16:47:46 -0500
Andrew Sullivan <ajs@crankycanuck.ca> wrote:

> On Fri, Dec 21, 2007 at 04:19:51PM -0500, Tom Lane wrote:
> > > 2.    Protect the content of a field from _some_ users on a
> > > given system,
> > 
> > I would argue that (2) is reasonably well served today by setting up
> > separate databases for separate users. 
> 
> I thought actually this was one of the use-cases we were hearing.
> Different people using the same database (because the same data),
> with rules about the different staff being able to see this or that
> function body.  I can easily imagine such a case, for instance, in a
> large organization with different departments and different
> responsibilities.  It seems a shame that the only answer we have
> there is, "Give them different databases."  

I think there is a fundamental disconnect here. The "Give them
different databases" argument is essentially useless. Consider an
organization that has Sales and HR.

You don't give both a separate database. They need access to "some" of
each others information. Just not all.

Sincerely,

Joshua D. Drake

-- 
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/ 
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'



Re: function body actors (was: [PERFORM] viewing source code)

From

Marc Munro

Date:

21 December 2007, 20:28:43

On Fri, 2007-21-12 at 18:05 -0400, Andrew Sullivan <ajs@crankycanuck.ca>
wrote:

> > > 2.  Protect the content of a field from _some_ users on a given
> > > system,
> >
> > I would argue that (2) is reasonably well served today by setting up
> > separate databases for separate users.
>
> I thought actually this was one of the use-cases we were hearing.
> Different people using the same database (because the same data), with
> rules about the different staff being able to see this or that
> function body.  I can easily imagine such a case, for instance, in a
> large organization with different departments and different
> responsibilities.  It seems a shame that the only answer we have there
> is, "Give them different databases."

There is also Veil:
http://veil.projects.postgresql.org/curdocs/index.html


__
Marc