Re: Fast Search on Encrypted Feild - Mailing list pgsql-general

From Bill Moran
Subject Re: Fast Search on Encrypted Feild
Date
Msg-id 20091114184034.acadeed8.wmoran@potentialtech.com
In response to Re: Fast Search on Encrypted Feild  ("Naoko Reeves" <naoko@lawlogix.com>)
List pgsql-general
"Naoko Reeves" <naoko@lawlogix.com> wrote:
>
> Merlin,
> Thank you for your quick response. I see... our security requirements are:
> We are encrypting PII within our DB and, because of the sensitive nature of our data, we must balance both
> performance and security to meet our client requirements.
> Our clients are mainly lawyers who handle client cases (government, healthcare, education).
> If you could provide any advice, that would be great; otherwise I understand that I have to go without wildcard
> search.

For the most part, you have either performance or security, but not both.
As others have pointed out, anything you do to speed up searches basically
results in storing an unencrypted version of the data in an index.

Storing the data redundantly is often worthwhile.  For example, in the US,
it's generally considered proper to store the SSN encrypted, because it's
a target for identity theft, but the last 4 digits of the SSN aren't
considered enough information to steal someone's identity, so they are
usually stored unencrypted, and thus provide fast searching.  Since this
convention is so common in the US, most people search on the last 4 digits
anyway, so it works out well.
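
To make the last-4 idea concrete, here is a minimal sketch in Python using
an in-memory SQLite database.  Everything named here is an assumption made
for illustration: the table and column names, the helper functions, and
especially toy_encrypt/toy_decrypt, which are a stand-in for a real cipher
(for example whatever pgcrypto or your application library provides) and
are not secure.  The point is only the shape: the full SSN is stored
encrypted, the last four digits are stored in the clear with an index, and
a full-SSN lookup decrypts only the handful of rows that share those digits.

```python
import hashlib
import os
import sqlite3

KEY = b"demo-key-not-for-production"  # hypothetical key, illustration only


def _keystream(key, nonce, n):
    # Toy keystream built from SHA-256 in counter mode. NOT a vetted cipher.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def toy_encrypt(key, plaintext):
    # Stand-in for real encryption: random nonce followed by XOR-ed bytes.
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def build_db(people):
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE person ("
        "id INTEGER PRIMARY KEY, name TEXT, ssn_enc BLOB, ssn_last4 TEXT)"
    )
    # Only the last four digits are stored in the clear and indexed.
    db.execute("CREATE INDEX person_last4_idx ON person (ssn_last4)")
    for name, ssn in people:
        db.execute(
            "INSERT INTO person (name, ssn_enc, ssn_last4) VALUES (?, ?, ?)",
            (name, toy_encrypt(KEY, ssn.encode()), ssn[-4:]),
        )
    return db


def find_by_last4(db, last4):
    # Plain equality lookup on the unencrypted column; nothing is decrypted.
    rows = db.execute(
        "SELECT name FROM person WHERE ssn_last4 = ? ORDER BY id", (last4,)
    )
    return [name for (name,) in rows]


def find_by_full_ssn(db, ssn):
    # Narrow by last four digits first, then decrypt only those candidates.
    rows = db.execute(
        "SELECT name, ssn_enc FROM person WHERE ssn_last4 = ? ORDER BY id",
        (ssn[-4:],),
    )
    return [name for name, enc in rows if toy_decrypt(KEY, enc).decode() == ssn]


if __name__ == "__main__":
    demo = build_db([("Alice", "123-45-6789"), ("Bob", "987-65-6789")])
    print(find_by_last4(demo, "6789"))
    print(find_by_full_ssn(demo, "987-65-6789"))
```

The trade-off is the one described above: the last-4 column leaks a little
information in exchange for an ordinary indexed lookup.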

Another option that might be worthwhile is deidentifying the data, which is
a practice often done for HIPAA-protected information.  I was hoping to
point you toward a good reference on how to do this on the web, but Google
is failing me.

The basic technique is to store the protected data in a different table, then
encrypt the key that links those two tables together.  Of course, the database
can no longer enforce referential integrity at that point, and it's totally
on your application to manage that.  By doing so, you get speedy, indexed
searches, and the decryption overhead is only felt when it's time to
decrypt the foreign key and link the data.  For a query that's selecting
50 rows out of 50,000, this is a huge win, and in the worst case, it's still
no worse than encrypting the data itself.
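
A rough sketch of that layout, again in Python with an in-memory SQLite
database.  This is one reading of the technique, not a reference
implementation: the schema, the function names, and the choice to keep the
encrypted link in the person table are all assumptions for illustration,
and toy_encrypt/toy_decrypt are an insecure placeholder for a real cipher.
The searchable columns stay in plaintext, the link to the protected table
is stored encrypted, and only the rows that match the search have their
link decrypted.

```python
import hashlib
import os
import sqlite3

KEY = b"demo-key-not-for-production"  # hypothetical key, illustration only


def _keystream(key, nonce, n):
    # Toy keystream built from SHA-256 in counter mode. NOT a vetted cipher.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def toy_encrypt(key, plaintext):
    # Stand-in for real encryption: random nonce followed by XOR-ed bytes.
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def build_db():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE case_record (case_id INTEGER PRIMARY KEY, details TEXT)"
    )
    # No FOREIGN KEY constraint is possible: the link is an opaque blob,
    # so the application has to keep the two tables consistent.
    db.execute(
        "CREATE TABLE person ("
        "person_id INTEGER PRIMARY KEY, name TEXT, case_ref_enc BLOB)"
    )
    db.execute("CREATE INDEX person_name_idx ON person (name)")
    return db


def add_person(db, name, details):
    cur = db.execute("INSERT INTO case_record (details) VALUES (?)", (details,))
    link = toy_encrypt(KEY, str(cur.lastrowid).encode())
    db.execute(
        "INSERT INTO person (name, case_ref_enc) VALUES (?, ?)", (name, link)
    )


def search(db, name_prefix):
    # Step 1: ordinary pattern search on the plaintext column.
    matches = db.execute(
        "SELECT name, case_ref_enc FROM person WHERE name LIKE ? "
        "ORDER BY person_id",
        (name_prefix + "%",),
    ).fetchall()
    results = []
    for name, link in matches:
        # Step 2: decrypt the link only for the rows that matched.
        case_id = int(toy_decrypt(KEY, link).decode())
        row = db.execute(
            "SELECT details FROM case_record WHERE case_id = ?", (case_id,)
        ).fetchone()
        if row is not None:  # dangling links are the application's problem
            results.append((name, row[0]))
    return results


if __name__ == "__main__":
    demo = build_db()
    add_person(demo, "Smith", "immigration case")
    add_person(demo, "Jones", "healthcare case")
    print(search(demo, "Sm"))
```

With 50 matches out of 50,000 rows, this does 50 decryptions rather than
50,000, which is where the win comes from.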

--
Bill Moran
http://www.potentialtech.com
