Thread: Looking for advice on database encryption
What are folks doing to protect sensitive data in their databases?

We're running on the assumption that the _really_ sensitive data is too
sensitive for us to just trust the front-end programs that connect to it.

The decision coming down from on high is that we need to encrypt certain
fields. That's fine; we looked at pgcrypto, but found the requirement to
use pgp on the command line for key management to be a problem.

So we're trying to implement the encryption in the front-end, but the
problem we're having is searching on the encrypted fields. Since we have
to decrypt each field to search on it, queries that previously took
seconds now take minutes (or worse).

We've tested a number of cryptographic accelerator products. In case
nobody else has tried this, let me give away the ending: none that we've
found are any faster than a typical server CPU.

So, it's a pretty open-ended question, since we're still pretty open to
different approaches, but how are others approaching this problem?

The goal here is that if we're going to encrypt the data, it should be
encrypted in such a way that if an attacker gets ahold of a dump of the
database, they still can't access the data without the passphrases of
the individuals who entered the data.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
Bill Moran wrote on 16.04.2009 21:40:
> The goal here is that if we're going to encrypt the data, it should
> be encrypted in such a way that if an attacker gets ahold of a dump
> of the database, they still can't access the data without the
> passphrases of the individuals who entered the data.

I'm by far not an expert, but my naive attempt would be to store the
database files in an encrypted filesystem.

Thomas
Bill Moran wrote:
> What are folks doing to protect sensitive data in their databases?

I would probably do my encryption in the application layer, and only
encrypt the sensitive fields.

Fields used as indexes probably should not be encrypted, unless the only
index operation is EQ/NE; in that case you could use the encrypted value
itself as the search key. This would even work for foreign key
relations. Of course, if part of your cryptography regimen involves key
expiration and rotation, there'd be the hellacious problem of
decrypting/re-encrypting.

It really all depends on what the security requirements are. -Somewhere-
there's a weak spot. In the above model, it's the application server
that's doing the cryptography: if it gets compromised, then the keys can
be extracted, and all bets are off.
In response to Thomas Kellerer <spam_eater@gmx.net>:

> Bill Moran wrote on 16.04.2009 21:40:
> > The goal here is that if we're going to encrypt the data, it should
> > be encrypted in such a way that if an attacker gets ahold of a dump
> > of the database, they still can't access the data without the
> > passphrases of the individuals who entered the data.
>
> I'm by far not an expert, but my naive attempt would be to store the
> database files in an encrypted filesystem.

That was the first suggestion when we started brainstorming ideas.
Unfortunately, it fails to protect us from the most likely attack
vector: SQL injection/application layer bugs. In an SQL injection
(for example), the fact that the filesystem is encrypted does zero
to protect the sensitive data.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
On Apr 16, 2009, at 12:40 PM, Bill Moran wrote:

(This is the traditional "you're asking the wrong question" response.)

> What are folks doing to protect sensitive data in their databases?

I don't think that's a useful way to look at it. Protecting sensitive
data in the entire system, where the database is just one part of that
system, is likely to lead to a much better answer.

> We're running on the assumption that the _really_ sensitive data
> is too sensitive for us to just trust the front-end programs that
> connect to it.
>
> The decision coming down from on high is that we need to encrypt
> certain fields.

If that's the mandate, then that's what you have to do. It's unlikely to
make the system overall much more secure, though, and likely no more
secure than some much less intrusive approaches.

> That's fine; we looked at pgcrypto, but found
> the requirement to use pgp on the command line for key management
> to be a problem.
>
> So we're trying to implement the encryption in the front-end, but
> the problem we're having is searching on the encrypted fields. Since
> we have to decrypt each field to search on it, queries that previously
> took seconds now take minutes (or worse).
>
> We've tested a number of cryptographic accelerator products. In
> case nobody else has tried this, let me give away the ending: none
> that we've found are any faster than a typical server CPU.
>
> So, it's a pretty open-ended question, since we're still pretty open
> to different approaches, but how are others approaching this problem?
>
> The goal here is that if we're going to encrypt the data, it should
> be encrypted in such a way that if an attacker gets ahold of a dump
> of the database, they still can't access the data without the
> passphrases of the individuals who entered the data.

If the concern is database dumps, then encrypting the output of pg_dump
will pretty much solve the problem. But if the attack vector is the
common one of compromising the front end, then encrypting data in the
database while allowing the front end to decrypt it is likely useless.

If the concern is "what if an attacker got access to the server?" then
physical security is likely to have much better ROI than some random
encryption regime.

Can you go back and ask your management what their actual security or
compliance needs are? If it's a real business need, you probably want to
find a decent security guy, have him draft the questions that management
need to answer, and start from there, rather than trying to clean up
after someone has already made 95% of the decisions, in an uninformed
way, for you.

Cheers,
  Steve
On Thu, April 16, 2009 13:20, Bill Moran wrote:
> In response to Thomas Kellerer <spam_eater@gmx.net>:
>
>> Bill Moran wrote on 16.04.2009 21:40:
>> > The goal here is that if we're going to encrypt the data, it should
>> > be encrypted in such a way that if an attacker gets ahold of a dump
>> > of the database, they still can't access the data without the
>> > passphrases of the individuals who entered the data.
>>
>> I'm by far not an expert, but my naive attempt would be to store the
>> database files in an encrypted filesystem.
>
> That was the first suggestion when we started brainstorming ideas.
> Unfortunately, it fails to protect us from the most likely attack
> vector: SQL injection/application layer bugs. In an SQL injection
> (for example), the fact that the filesystem is encrypted does zero
> to protect the sensitive data.

I'll chime in here, even though I probably shouldn't.

A lot depends on which standard you're trying to meet: general security
(and common sense) vs. PCI DSS vs. NSA/DoD vs. some other standard.

Do you need to decrypt the values once they're in the system? Do you
need the items in an index? Do the values need to be part of a
constraint / foreign key relationship (because a hashed value may cause
you a lot of headaches!)?

Look at these different scenarios and think about the data (in both
encrypted and unencrypted format) before you decide HOW you want to do
it.

Tim
--
Timothy J. Bruce
Bill Moran wrote on 16.04.2009 22:20:
>> I'm by far not an expert, but my naive attempt would be to store the
>> database files in an encrypted filesystem.
>
> That was the first suggestion when we started brainstorming ideas.
> Unfortunately, it fails to protect us from the most likely attack
> vector: SQL injection/application layer bugs. In an SQL injection
> (for example), the fact that the filesystem is encrypted does zero
> to protect the sensitive data.

Which is something different from your statement

>> The goal here is that if we're going to encrypt the data, it should
>> be encrypted in such a way that if an attacker gets ahold of a dump
>> of the database, they still can't access the data without the
>> passphrases of the individuals who entered the data.

which only talks about someone getting hold of the contents of the
server's harddisk.

As you ultimately have to decrypt the data to display it to the user, he
can always take a screenshot (or copy & paste the text from the web
front end) and walk away. He doesn't even need to use SQL injection.

As for SQL injection itself, there are pretty robust defenses against it
(prepared statements, sanitizing and cleaning any user input, maybe even
controlling access to the data through stored procedures, which can add
an additional layer of security).

I agree with Kenneth: you need to be more precise about which scenario
you have to deal with.

Thomas
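(For illustration, the prepared-statement defense Thomas mentions looks
like this in PostgreSQL; the employees table and its columns are
hypothetical.)

    -- The parameter is passed as data, never spliced into the SQL
    -- string, so quotes or SQL fragments in the value cannot change
    -- the statement's structure.
    PREPARE find_employee (text) AS
        SELECT emp_id, emp_name
          FROM employees
         WHERE emp_name = $1;

    EXECUTE find_employee('O''Brien');  -- embedded quote is harmless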
In response to Steve Atkins <steve@blighty.com>:
>
> On Apr 16, 2009, at 12:40 PM, Bill Moran wrote:
>
> (This is the traditional "you're asking the wrong question" response.)
>
> > What are folks doing to protect sensitive data in their databases?
>
> I don't think that's a useful way to look at it. Protecting sensitive
> data in the entire system, where the database is just one part of that
> system, is likely to lead to a much better answer.

<snip>

I disagree. We're already addressing the issues of security on the
application level through extensive testing and data validation out the
wazoo (to prevent SQL injection and other application breaches). All our
servers are in highly secure data centers. We have VPNs and access
restrictions at the IP and the user level to the 9s.

It's still not enough. My task here is to develop a system to protect
the data in the event that all of those fail. As a result, I'm looking
for general advice. I already have a system in place.

This is apparently another part that I should have described in more
detail. So, here goes.

To draw a parallel example in the application: imagine that you're an
employee in a business. When you're hired, you enter your SSN into the
company database. Now, your department manager needs to have access to
your SSN for various reasons, so the system grants access to your
encryption key to the department manager. Based on system policy, the
division manager has access to all the data in the department, and the
company head has access to all divisions. As a result, the company head
can get your SSN out of the database using the passphrase for his key.

However, Joe over in IT can not access your SSN. Even though he's the
DBA and can pull a full text dump of the database at will, he can not
decrypt your SSN unless he has a passphrase to one of the keys that can
decrypt it. All that is pretty standard PKI stuff, and I've created the
tables and the functions that implement it.

The problem comes when the company head wants to search through the
database to find out which employee has a specific SSN. He should be
able to do so, since he has access to everything, but the logistics of
doing so in a reasonable amount of time are rather complex and very time
consuming. On a million rows with the SSN unencrypted, such a query
would take less than a second with an appropriate index, but pulling
those million rows into the application in order to decrypt each one and
see if it matches can easily take a half hour or longer.

That's where we're having difficulty. Our requirements are that the data
must be strongly protected, but the appropriate people must be able to
do (often complex) searches on it that complete in record time.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
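(A minimal sketch of the kind of key-granting tables Bill describes,
assuming PGP-style keypairs per principal; every table, column, and key
name here is hypothetical.)

    -- Each sensitive value is encrypted with a per-record data key;
    -- that data key is stored once per principal allowed to read it,
    -- wrapped with that principal's public key.
    CREATE TABLE principals (
        principal_id serial PRIMARY KEY,
        name         text  NOT NULL,
        public_key   bytea NOT NULL   -- PGP public key block
    );

    CREATE TABLE employees (
        emp_id        serial PRIMARY KEY,
        ssn_encrypted bytea NOT NULL  -- ciphertext of the SSN
    );

    CREATE TABLE key_grants (
        emp_id       int REFERENCES employees,
        principal_id int REFERENCES principals,
        wrapped_key  bytea NOT NULL,  -- data key, encrypted to principal
        PRIMARY KEY (emp_id, principal_id)
    );

    -- To read an SSN, a manager unwraps wrapped_key with his private
    -- key and passphrase, then decrypts ssn_encrypted with the result.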
In response to Thomas Kellerer <spam_eater@gmx.net>:

> Bill Moran wrote on 16.04.2009 22:20:
> >> I'm by far not an expert, but my naive attempt would be to store the
> >> database files in an encrypted filesystem.
> >
> > That was the first suggestion when we started brainstorming ideas.
> > Unfortunately, it fails to protect us from the most likely attack
> > vector: SQL injection/application layer bugs.
>
> Which is something different from your statement
>
> >> The goal here is that if we're going to encrypt the data, it should
> >> be encrypted in such a way that if an attacker gets ahold of a dump
> >> of the database, they still can't access the data without the
> >> passphrases of the individuals who entered the data.
>
> which only talks about someone getting hold of the contents of the
> server's harddisk.

Not really. You're making an assumption that a pg_dump can only be run
on the server itself. Let's chalk this up to miscommunication and allow
me to rephrase: the data needs to be encrypted in such a way that if an
attacker can get an offline copy of the data by any means, they have no
greater access to the data than they would have if they used the
application to access it. I already have that using PKI.

Again, it seems that I left too many details out of my description of
the problem. See my post in response to Steve Atkins for a more detailed
description, and I apologize for being too vague the first go-round.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
Bill Moran wrote on 16.04.2009 23:06:
>> which only talks about someone getting hold of the contents of the
>> server's harddisk.
>
> Not really. You're making an assumption that a pg_dump can only be
> run on the server itself.

Right, I forgot that. But then it's similar to the situation where the
user displays the data and walks away with the screenshot...

If you have an application server sitting in the middle, you can limit
connections to the database to the app server itself. Or even put the
appserver on the same box as the database server and limit connections
to localhost only. In that case the attacker needs to be able to log in
to the server directly.

> and I apologize for being too vague the first go-round.

No problem. This happens to me all the time. Once a discussion starts
about a topic, I find myself wondering how I could forget all the
details that I'm being asked about ;)

Thomas
Couldn't you just add a PGP-based column (or similar encryption
protocol) for authentication? This would protect you against injection
attacks, would it not?

You could also use PGP or similar for key management, if I'm not
mistaken.

-Will

-----Original Message-----
In response to Thomas Kellerer <spam_eater@gmx.net>:

That was the first suggestion when we started brainstorming ideas.
Unfortunately, it fails to protect us from the most likely attack
vector: SQL injection/application layer bugs. In an SQL injection
(for example), the fact that the filesystem is encrypted does zero
to protect the sensitive data.
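(For illustration, a PGP-based column along the lines Will suggests can
be built with pgcrypto's public-key functions; the table name is
hypothetical and the elided key blocks are placeholders.)

    -- Encrypt with the recipient's public key; only the matching
    -- private key plus its passphrase can decrypt.
    INSERT INTO employees (ssn_encrypted)
    VALUES (pgp_pub_encrypt('123-45-6789',
                dearmor('-----BEGIN PGP PUBLIC KEY BLOCK----- ...')));

    SELECT pgp_pub_decrypt(ssn_encrypted,
               dearmor('-----BEGIN PGP PRIVATE KEY BLOCK----- ...'),
               'user passphrase')
      FROM employees;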
Bill Moran wrote:
> The problem comes when the company head wants to search through the
> database to find out which employee has a specific SSN. He should
> be able to do so, since he has access to everything, but the logistics of
> doing so in a reasonable amount of time are rather complex and very
> time consuming. On a million rows with the SSN unencrypted, such a
> query would take less than a second with an appropriate index, but
> pulling those million rows into the application in order to decrypt
> each one and see if it matches can easily take a half hour or longer.
>
> That's where we're having difficulty. Our requirements are that the
> data must be strongly protected, but the appropriate people must be
> able to do (often complex) searches on it that complete in record
> time.

An index on the encrypted SSN field would do this just fine. If an
authorized person needs to find the record with a specific SSN, they
encrypt that SSN and then look up the ciphertext in the database...
done.
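(A sketch of that lookup, assuming pgcrypto and a deterministic cipher
mode (ECB here) so that the same plaintext always produces the same
ciphertext; see the caveats raised in the replies below. Table, column,
and key names are hypothetical.)

    CREATE TABLE employees (
        emp_id        serial PRIMARY KEY,
        ssn_encrypted bytea NOT NULL
    );
    CREATE INDEX employees_ssn_idx ON employees (ssn_encrypted);

    INSERT INTO employees (ssn_encrypted)
    VALUES (encrypt('123-45-6789'::bytea, 'secret-key'::bytea,
                    'aes-ecb/pad:pkcs'));

    -- The search value is encrypted and compared ciphertext to
    -- ciphertext, so the btree index is usable and nothing has to be
    -- decrypted during the scan.
    SELECT emp_id
      FROM employees
     WHERE ssn_encrypted = encrypt('123-45-6789'::bytea,
                                   'secret-key'::bytea,
                                   'aes-ecb/pad:pkcs');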
If the purpose of encryption is for financial or medical data transmission security, or something of a higher order, you may want to implement a stronger type of security such as SSL or PGP or some other type of public/private key process.
You could create a schema that contains views of the data without the sensitive data, and have the users use that schema for their needs; this assumes it is basically used to view or report on the data.
Just some thoughts.
Michael Black
> Date: Thu, 16 Apr 2009 15:40:12 -0400
> From: wmoran@potentialtech.com
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Looking for advice on database encryption
>
> <snip>
On Thu Apr 16 05:06 PM, Bill Moran wrote:
>
> The problem comes when the company head wants to search through the
> database to find out which employee has a specific SSN. He should be
> able to do so, since he has access to everything, but the logistics of
> doing so in a reasonable amount of time are rather complex and very
> time consuming. On a million rows with the SSN unencrypted, such a
> query would take less than a second with an appropriate index, but
> pulling those million rows into the application in order to decrypt
> each one and see if it matches can easily take a half hour or longer.
>
> That's where we're having difficulty. Our requirements are that the
> data must be strongly protected, but the appropriate people must be
> able to do (often complex) searches on it that complete in record time.

Would storing a one-way hash of the SSN work for you? I.e. combine sha1
and/or md5, use a salt...

SELECT ssn_encrypted FROM employees WHERE ssn_hash =
yourhashmethod(SSN_PLAINTEXT)

So you have both an encrypted version of the SSN and a one-way hash of
it. That's how we store credit card numbers.
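(Concretely, a salted-hash column along those lines might use
pgcrypto's digest(); the table name and the fixed salt are
illustrative.)

    ALTER TABLE employees ADD COLUMN ssn_hash bytea;

    -- Store a salted one-way hash next to the ciphertext.
    UPDATE employees
       SET ssn_hash = digest('per-deployment-salt' || '123-45-6789',
                             'sha1');

    -- Exact-match searches touch only the hash column; the ciphertext
    -- is decrypted for display after the row has been found.
    SELECT ssn_encrypted
      FROM employees
     WHERE ssn_hash = digest('per-deployment-salt' || '123-45-6789',
                             'sha1');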
Eric Soroos wrote:
>> An index on the encrypted SSN field would do this just fine. If an
>> authorized person needs to find the record with a specific SSN, they
>> encrypt that SSN and then look up the ciphertext in the database...
>> done.
>
> This will only work for electronic code book (ECB) ciphers, and not
> chained block ciphers, since the initialization vector will randomize
> the output of the encryption so that E(foo) != E(foo), just to prevent
> this sort of attack.

Can those sorts of chained block ciphers decode blocks in a different
order than they were originally encoded? For this sort of application,
wouldn't each field or record pretty much have to be encrypted
discretely, so that they can be decrypted in any order, or any single
record be decrypted on its own?
Thomas Kellerer <spam_eater@gmx.net> wrote:
>
> Bill Moran wrote on 16.04.2009 23:06:
> >> which only talks about someone getting hold of the contents of the
> >> server's harddisk.
> >
> > Not really. You're making an assumption that a pg_dump can only be
> > run on the server itself.
>
> Right, I forgot that.
>
> But then it's similar to the situation where the user displays the
> data and walks away with the screenshot...

Actually, it's completely different. If a user walks away with a
screenshot of data that they had access to anyway, then the application
developer is not culpable. However, if a flaw is found in the
application and a user can use it to gain escalated privs and access
data that would normally not be available, the application developer is
going out of business. If a user finds a flaw but it simply results in
an error, because the layer of security behind it prevents an
information leak, then the application developer doesn't look very bad
at all. Layered security saves the day!

> If you have an application server sitting in the middle, you can limit
> connections to the database to the app server itself. Or even put the
> appserver on the same box as the database server and limit connections
> to localhost only. In that case the attacker needs to be able to log
> in to the server directly.

You're assuming that the application is perfect. With the data we're
protecting, we don't have that luxury. This isn't a particularly new
view of security. CERT has hundreds of pages documenting how this is
correct security practice. If it weren't, there wouldn't need to be
firewalls between Windows servers and the Internet.

The part that's unique (from my experience) is the demand that the data
be so readily accessible. Usually, highly secure data is understood to
be difficult to access, but that understanding doesn't exist in this
market. It's an unreasonable expectation on the part of our clients, to
be honest, but if we can find a way to meet it, we leave the competition
in the dust.

Thanks for the feedback so far.

--
Bill Moran
http://www.potentialtech.com
>> That's where we're having difficulty. Our requirements are that the
>> data must be strongly protected, but the appropriate people must be
>> able to do (often complex) searches on it that complete in record
>> time.
>
> An index on the encrypted SSN field would do this just fine. If an
> authorized person needs to find the record with a specific SSN, they
> encrypt that SSN and then look up the ciphertext in the database...
> done.

This will only work for electronic code book (ECB) ciphers, and not
chained block ciphers, since the initialization vector will randomize
the output of the encryption so that E(foo) != E(foo), just to prevent
this sort of attack.

You're looking for a hash function, since that's a one-way, stable
function meant for comparing.

eric
"Will Rutherdale (rutherw)" <rutherw@cisco.com> wrote: > > Couldn't you just add a PGP based column (or similar encryption > protocol) for authentication? This would protect you against injection > attacks, would it not? > > You could also use PGP or similar for key management if I'm not > mistaken. Thanks for the input, Will. We're already doing this, the problem we've had is that the time to decrypt the data is making access too slow. Basically, people administrators need to be able to say, "show me all the registrants whose personal medical information is x" and get results in a reasonable amount of time. Decrypting the data to do the matching is about 100x slower than a typical seq scan. To give you an idea of what we've tried, I've tried pgcrypto, openssl with rc4, des and 3des, using envelope encryption, and raw aes-128 symmetrical encryption. In addition, we've purchased two different hardware accelerators for crypto to find that both of them are slower than the CPU itself, and they're both the high-end "enterprise" class cards. -- Bill Moran http://www.potentialtech.com
Michael Black <michaelblack75052@hotmail.com> wrote:
>
> If the purpose of encryption is for financial or medical data
> transmission security, or something of a higher order, you may want to
> implement a stronger type of security such as SSL or PGP or some other
> type of public/private key process.
>
> You could create a schema that contains views of the data without the
> sensitive data, and have the users use that schema for their needs;
> this assumes it is basically used to view or report on the data.

Thanks for the input, Michael. We're already working on using PKI; the
big problem we're having is the speed of access when an administrator
needs to search through the encrypted data.

--
Bill Moran
http://www.potentialtech.com
John R Pierce <pierce@hogranch.com> wrote:
>
> Eric Soroos wrote:
> >> An index on the encrypted SSN field would do this just fine. If an
> >> authorized person needs to find the record with a specific SSN, they
> >> encrypt that SSN and then look up the ciphertext in the database...
> >> done.
> >
> > This will only work for electronic code book (ECB) ciphers, and not
> > chained block ciphers, since the initialization vector will randomize
> > the output of the encryption so that E(foo) != E(foo), just to prevent
> > this sort of attack.
>
> Can those sorts of chained block ciphers decode blocks in a different
> order than they were originally encoded? For this sort of application,
> wouldn't each field or record pretty much have to be encrypted
> discretely, so that they can be decrypted in any order, or any single
> record be decrypted on its own?

Eric is right about CBC ciphers. The problem is that any function that
will produce the same output for the same input (such as md5 or sha)
leaves us open to brute force attacks if the number of choices is small,
or pattern discovery attacks in other cases. And anything that protects
us against such attacks (such as aes-cbc) will generate data that I
can't pre-encrypt and search against.

I haven't tried it, but I don't believe CBC ciphers can decrypt data out
of order.

In the implementation I've built, the IV is stored with the ciphertext,
much the same way that crypt() stores the salt with the password hash.
As a result, if you have the key, you then have all the data required to
decrypt the field, but you can't easily brute force it or do any pattern
analysis.

--
Bill Moran
http://www.potentialtech.com
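(A sketch of that store-the-IV-with-the-ciphertext scheme, expressed in
pgcrypto terms for illustration; Bill's actual implementation is in the
front end, and these function names are hypothetical.)

    -- Encrypt: a fresh 16-byte IV is generated per value and prepended
    -- to the ciphertext, so identical plaintexts encrypt differently.
    CREATE FUNCTION encrypt_field(text, bytea) RETURNS bytea
    LANGUAGE sql AS $$
        SELECT s.iv || encrypt_iv(convert_to($1, 'utf8'), $2, s.iv,
                                  'aes-cbc/pad:pkcs')
          FROM (SELECT gen_random_bytes(16) AS iv) s;
    $$;

    -- Decrypt: split the stored value back into its IV (first 16
    -- bytes) and ciphertext (the rest).
    CREATE FUNCTION decrypt_field(bytea, bytea) RETURNS text
    LANGUAGE sql AS $$
        SELECT convert_from(
                 decrypt_iv(substring($1 FROM 17), $2,
                            substring($1 FROM 1 FOR 16),
                            'aes-cbc/pad:pkcs'),
                 'utf8');
    $$;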
"Jonathan Bond-Caron" <jbondc@openmv.com> wrote: > > On Thu Apr 16 05:06 PM, Bill Moran wrote: > > > > The problem comes when the company head wants to search through the > > database to find out which employee has a specific SSN. He should be > > able to do so, since he has access to everything, but the logistics of > > doing so in a reasonable amount of time are rather complex and very > > time consuming. On a million rows with the SSN unencrypted, such a > > query would take less than a second with an appropriate index, but > > pulling those million rows into the application in order to decrypt > > each one and see if it matches can easily take a half hour or longer. > > > > That's where we're having difficulty. Our requirements are that the > > data must be strongly protected, but the appropriate people must be > > able to do (often complex) searches on it that complete in record time. > > Would storing a one-way hash of the SSN work for you? i.e. combine sha1 > and/or md5, use a salt... > > SELECT ssn_encrypted FROM employees WHERE ssn_hash = > yourhashmethod(SSN_PLAINTEXT) > > So you have both an encrypted version of the SSN and a one-way hash of it. > > That's how we store credit card numbers. We're considering that for some fields. It does limit a lot ... we can't do partial matching for example. Other fields don't work so well. If I try to use that trick on a field that has too few choices (true/false is the worst, but anything with less than a few thousand possibilities, i.e. "county of residence" is a problem) then I've left it open to an easy dictionary attack. Thanks for the input. -- Bill Moran http://www.potentialtech.com
> What are folks doing to protect sensitive data in their databases?
>
> <snip>
Take the performance hit. If people on high want the data encrypted, then they have to suffer the performance penalty, however bad.
Could you not write some server extensions to encrypt / decrypt the data server side, coupled with a custom index implementation?
Can you use a global server side key or do you need fine grained encryption?
Is a database the correct tool for the job if you want this level of encryption and granularity?
Also, how secure are your communication channels? What stops me snooping the data in transit (ARP poisoning and other techniques, etc.)?
Chris Ellis
In response to Chris.Ellis@shropshire.gov.uk:

> > What are folks doing to protect sensitive data in their databases?
> >
> > <snip>
>
> Take the performance hit. If people on high want the data encrypted,
> then they have to suffer the performance penalty, however bad.

As reasonable as that sounds, I don't think it's true. We've already
brainstormed a dozen ways to work around the performance issue (creative
hashing, backgrounding the decryption and using ajax to display the
results as they're decrypted ...)

Problem is that all of these methods complicate things in the
application. I was hoping there were better approaches to the solution,
but I'm starting to think that we're already on the right path.

> Could you not write some server extensions to encrypt / decrypt the
> data server side, coupled with a custom index implementation?

Not sure how the index implementation would work. The server-side
encryption doesn't really help much ... it's difficult to add more DB
servers in order to improve throughput, but adding more web servers fits
easily into our load balanced setup. In any event, the addition of
processing cores (no matter where) doesn't speed up the decryption of
individual items, it only allows us to do more in parallel.

> Can you use a global server side key or do you need fine grained
> encryption?
>
> Is a database the correct tool for the job if you want this level of
> encryption and granularity?

The global server side key puts us in pretty much the same situation
that filesystem encryption does, which is not quite as strong as we're
looking for.

I've considered the possibility of using something other than the DB,
but I can't think of any storage method that gains us anything over the
DB. Also, if we use something different than the DB, we then have to
come up with a way to replicate it to the backup datacenter. If we put
the data in the DB, slony is already set up to take care of that.

> Also, how secure are your communication channels? What stops me
> snooping the data in transit (ARP poisoning and other techniques,
> etc.)?

We do what we can. Everything is transferred over HTTPS, and we log and
monitor activity. We're constantly looking for ways to improve that side
of things as well, but that's a discussion for a different forum.
--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
> > Take the performance hit. If people on high want the data encrypted, then
> > they have to suffer the performance penalty, however bad.
>
> As reasonable as that sounds, I don't think it's true. We've already
> brainstormed a dozen ways to work around the performance issue (creative
> hashing, backgrounding the decryption and using ajax to display the
> results as they're decrypted ...)
>
> Problem is that all of these methods complicate things in the
> application. I was hoping there were better approaches to the
> solution, but I'm starting to think that we're already on the
> right path.
>
> > Could you not write some server extensions to encrypt / decrypt the data
> > server side, coupled with a custom index implementation?
>
> Not sure how the index implementation would work. The server-side
> encryption doesn't really help much ... it's difficult to add more
> DB servers in order to improve throughput, but adding more web
> servers fits easily into our load balanced setup. In any event,
> the addition of processing cores (no matter where) doesn't speed
> up the decryption of individual items, it only allows us to do more
> in parallel.
Move all DB calls to stored procedures and let the stored procedures
handle the encryption / decryption with a given key. If your
communication channels are secure, then this is just as secure as
decrypting the data in the application. This also allows DBs to be
clustered, with the likes of PL/Proxy.

You could create a custom datatype to hold the encrypted data, then
functions to access it.
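(For illustration, the stored-procedure approach might look like this
with pgcrypto; the key is supplied per call rather than stored on the
server, and the table and function names are hypothetical.)

    CREATE FUNCTION get_employee_ssn(int, text) RETURNS text
    LANGUAGE sql STABLE AS $$
        -- Decryption happens inside the function, server side; the
        -- application never handles the raw ciphertext.
        SELECT pgp_sym_decrypt(ssn_encrypted, $2)
          FROM employees
         WHERE emp_id = $1;
    $$;

    SELECT get_employee_ssn(42, 'user passphrase');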
> > Can you use a global server side key or do you need fine grained
> > encryption?
> >
> > Is a database the correct tool for the job if you want this level of
> > encryption and granularity?
> The global server side key puts us in pretty much the same situation that
> filesystem encryption does, which is not quite as strong as we're
> looking for.
> I've considered the possibility of using something other than the
> DB, but I can't think of any storage method that gains us anything over
> the DB. Also, if we use something different than the DB, we then have
> to come up with a way to replicated it to the backup datacenter. If
> we put the data in the DB, slony is already set up to take care of that.
File system: leave the replication up to the SAN. Store your data in flat files which are encrypted with each key, an index per user, etc.
>
> > Also, how secure are your communication channels? What stops me
> > snooping the data in transit (ARP poisoning and other techniques,
> > etc.)?
>
> We do what we can. Everything is transferred over HTTPS, and we log and
> monitor activity. We're constantly looking for ways to improve that
> side of things as well, but that's a discussion for a different forum.
On Thu, Apr 16, 2009 at 05:06:13PM -0400, Bill Moran wrote:
> I disagree. We're already addressing the issues of security on the
> application level through extensive testing and data validation out
> the wazoo (to prevent SQL injection and other application breaches).
> All our servers are in highly secure data centers. We have VPNs and
> access restrictions at the IP and the user level to the 9s.
>
> It's still not enough.
>
> My task here is to develop a system to protect the data in the event
> that all of those fail. As a result, I'm looking for general advice.

Mine would be to define what you do trust and not what you don't trust.
I think you need to do that before you can get much further. At the
moment the problem seems somewhat ill defined.

For example, you say that you don't trust the application, yet the user
must trust the application as they're entering their secret into it. How
does the user ascertain that the application they're talking to is the
"real" one and that it hasn't been replaced with a pretend one that
sends their secret off to an attacker who has access to a real version
of the program?

Protecting against this in general is, as far as I know, impossible. The
get-out clause is that you're not trying to solve the general case:
you've got a specific set of use cases that you need to solve.

--
  Sam  http://samason.me.uk/
In response to Sam Mason <sam@samason.me.uk>:

> Mine would be to define what you do trust and not what you don't
> trust. I think you need to do that before you can get much further. At
> the moment the problem seems somewhat ill defined.
>
> For example, you say that you don't trust the application, yet the
> user must trust the application as they're entering their secret into
> it. How does the user ascertain that the application they're talking
> to is the "real" one and that it hasn't been replaced with a pretend
> one that sends their secret off to an attacker who has access to a
> real version of the program?
>
> Protecting against this in general is, as far as I know, impossible.
> The get-out clause is that you're not trying to solve the general
> case: you've got a specific set of use cases that you need to solve.

The primary portal into the application right now is a web site. As a
result, this part of it is handled by typical SSL certs and the like.

As far as the trust factor, you've blurred the lines a bit. My job is to
ensure that the user doesn't know or care about the lines between
application and database, but trusts the system as a whole. However, I
need to clearly define those lines and ensure that each part of the
whole has enough security measures to withstand a flaw in one of the
other parts. Think of the design of postfix, where each program (smtpd,
qmgr, etc.) doesn't trust the input of the other programs and runs in
its own sandbox.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
On Fri, Apr 17, 2009 at 09:52:30AM -0400, Bill Moran wrote:
> In response to Sam Mason <sam@samason.me.uk>:
> > For example, you say that you don't trust the application, yet the
> > user must trust the application as they're entering their secret
> > into it. How does the user ascertain that the application they're
> > talking to is the "real" one and that it hasn't been replaced with a
> > pretend one that sends their secret off to an attacker who has
> > access to a real version of the program?
>
> The primary portal into the application right now is a web site. As
> a result, this part of it is handled by typical SSL certs and the
> like.

OK, that defers the problem nicely.

> As far as the trust factor, you've blurred the lines a bit. My job
> is to ensure that the user doesn't know or care about the lines
> between application and database, but trusts the system as a whole.
> However, I need to clearly define those lines and ensure that each
> part of the whole has enough security measures to withstand a flaw in
> one of the other parts. Think of the design of postfix, where each
> program (smtpd, qmgr, etc.) doesn't trust the input of the other
> programs and runs in its own sandbox.

Sorry, my example of where to place trust was a bad one; let's try some
other ones.

The Postgres process: do you trust that the database engine is secure?
This implies that the frontend program can send the user's secret to the
database engine and the decryption will be done "inside" the database. I
believe this to be the case; otherwise, for the user to query on SSN, to
pick an example you were using before, you would need to send *every*
encrypted SSN to the client, where they would decrypt it with their
secret to find the one they wanted.

Backups: you mentioned that if someone stole the backups they shouldn't
be able to get any more information than if they were using the client
interface. If every sensitive field is encrypted then you're protected
against some attacks, but you'd be better off encrypting the backup.
Where is it OK to place the trust here?

--
  Sam  http://samason.me.uk/
In response to Sam Mason <sam@samason.me.uk>:
>
> The Postgres process: do you trust that the database engine is secure?
> This implies that the frontend program can send the user's secret to
> the database engine and the decryption will be done "inside" the
> database.
>
> Backups: you mentioned that if someone stole the backups they
> shouldn't be able to get any more information than if they were using
> the client interface. If every sensitive field is encrypted then
> you're protected against some attacks, but you'd be better off
> encrypting the backup. Where is it OK to place the trust here?

Nowhere, really. The goal is not to trust any one part of the system. As
a result, we can protect the data across multiple security failures. For
example, backups will actually be encrypted twice. In order to recover
data from our backups, an attacker would have to physically acquire a
tape, then steal or brute-force 2 different encryption keys before they
could access the data.

In the end, the only people that should be trusted with the data are the
users who own the data. Each user will explicitly grant access to their
data to program administrators; the software will simply facilitate the
process.

We put as many layers as we can everywhere. Usually this is limited when
we start hitting performance issues and have to remove a layer to keep
performance where it needs to be. The goal is to have as many layers as
possible while keeping the system as performant as the client expects.

We only get one shot at this. If there's a data leak, a lot of people
are going to be very upset and we're going to be out of business, so
we're implementing the tightest security possible at every layer. This
thread is only one part of the overall process, as it specifically
relates to the database layer.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
Bill Moran wrote:
> Eric is right about CBC ciphers. The problem is that any function that
> will produce the same output for the same input (such as md5 or sha)
> leaves us open to brute force attacks if the number of choices is
> small, or pattern discovery attacks in other cases. And anything that
> protects us against such attacks (such as aes-cbc) will generate data
> that I can't pre-encrypt and search against.
>
> In the implementation I've built, the IV is stored with the
> ciphertext, much the same way that crypt() stores the salt with the
> password hash. As a result, if you have the key, you then have all the
> data required to decrypt the field, but you can't easily brute force
> it or do any pattern analysis.

Searching encrypted data is difficult in a situation like this. There is
research (e.g. [1,2]) into encrypting relatively large _text_ fields so
that the ciphertext is amenable to search, but in general all the
schemes sacrifice functionality for security: partial matches, for
example, are relatively difficult to achieve, and regular expressions
are virtually impossible without manual expansion. Plus you'd have to
roll your own, which would be prone to error. My current university
project is in this area, developing a system for PostgreSQL that allows
secure search on encrypted _text_ fields (e.g. full documents) using
[2], but for a field as small as an SSN, there isn't really anything
obvious.

Having said that, using a block cipher in ECB mode on the SSN should be
enough to perform fast exact matches (based on my limited knowledge of
SSNs). Assuming the 'user' table is normalised so all the SSNs in it
will be unique, the possibility of frequency analysis on the ciphertext
is slim, especially since a 9-digit SSN encoded as ASCII will easily fit
into a single block of most recent ciphers (AES has a 16-byte block, for
example). Each SSN's ciphertext will be as unique as the original.
Similarly for a one-way hash function. An SSN will only have 1 billion
possible combinations, so a brute force attack would be possible, but I
don't see a way of avoiding this.

You could look into stream ciphers for left-most matches, but these are
almost always susceptible to statistical attacks when used incorrectly.
Generally speaking, searching requires a pattern, which leads to
possible attacks. I think you'll have to either put up with the
inefficiency or sacrifice some amount of security.

Cheers,
Will.

[1] http://www.cs.berkeley.edu/~dawnsong/papers/se.pdf
[2] http://gnunet.org/papers/secureindex.pdf
On Fri, Apr 17, 2009 at 10:33:15AM -0400, Bill Moran wrote:
> The goal is not to trust any one part of the system.
> As a result, we can protect the data across multiple security failures.

As far as I know this isn't a good way to go about designing secure
systems. You're better off defining a set of plausible attack vectors
and designing the system so it's not vulnerable to them.

> For example, backups will actually be encrypted twice. In order to
> recover data from our backups, an attacker would have to physically
> acquire a tape, then steal or brute-force 2 different encryption
> keys before they could access the data.

OK, so there would be two vectors here: each entity would have their own
secret they use to encrypt the backup. Breaking the backup requires the
collusion (remember, security is normally much easier to break from the
inside) or compromise of both entities. The question then becomes what
happens with the backup before it's been encrypted by both entities, and
how can you ensure that one can't compromise the entire system? Your
trust would therefore be that at most one entity will be compromised.
Once you've said that, you can ask yourself whether this is appropriate.

Saying that you're encrypting data twice is, from a security
perspective, vacuous, because you're leaving open the possibility of
encrypting it twice in series on the same server, and hence an attacker
just needs to compromise that one server to recover both keys and break
the system. If you're defining this single server as trusted, then
you're assuming you're capable of protecting this server from attacks.
Other definitions would partition the server into parts (say, different
user accounts, and then you need to worry about privilege escalation)
and say that some parts are trusted. The "root" user is by definition a
trusted entity in any Unix system; Microsoft tried to break this
assumption in Windows, but I think they gave up in the end. The "root"
user will normally carry more authority[1] than normal users, but it
should not need absolute authority as it does in conventional operating
systems.

> In the end, the only people that should be trusted with the data are
> the users who own the data. Each user will explicitly grant access
> to their data to program administrators; the software will simply
> facilitate the process.

OK, but more definitions are required to understand what you mean by
this statement. You don't need to send them all here, but if I was
designing something similar to what you are, I'd want to know what I was
up against.

> We put as many layers as we can everywhere. Usually this is limited
> when we start hitting performance issues and have to remove a layer
> to keep performance where it needs to be. The goal is to have as
> many layers as possible while keeping the system as performant as
> the client expects.

I, personally, wouldn't recommend doing things this way. I'd define
where you put your trust and design around that. You'll get a much
stronger system, and you'll know where to invest time in protecting it,
because you know what you trust and that if those items are compromised
everything falls down. If you don't know what you're trusting, you don't
know where to expend the effort.

> We only get one shot at this. If there's a data leak, a lot of
> people are going to be very upset and we're going to be out of
> business, so we're implementing the tightest security possible
> at every layer. This thread is only one part of the overall
> process as it specifically relates to the database layer.
Yes, that sounds reasonable and to be expected.

Hope that's all somewhat helpful!

--
  Sam  http://samason.me.uk/

[1] I'm using "authority" in a technical sense, meaning the set of
actions that an entity can cause to occur, directly or indirectly by
talking to other actors in the system. Strictly speaking, from a
security perspective there is no difference between a normal user and
root in a UNIX system, because of the presence of various privilege
escalating mechanisms (i.e. the set-uid bit on executables). In practice
it's OK, but not safe, to assume that the system is configured in a way
that this doesn't occur and that any occurrences are "bugs".