Re: Fast Search on Encrypted Feild - Mailing list pgsql-general

From Merlin Moncure
Subject Re: Fast Search on Encrypted Feild
Date
Msg-id b42b73150911150629u77d4f36agfc77dce74658a04c@mail.gmail.com
Whole thread Raw
In response to Re: Fast Search on Encrypted Feild  (John R Pierce <pierce@hogranch.com>)
List pgsql-general
On Sat, Nov 14, 2009 at 5:08 PM, John R Pierce <pierce@hogranch.com> wrote:
> Naoko Reeves wrote:
>>
>> I have a encrypted column use encrypt function.
>>
>> Querying against this column is almost not acceptable – returning 12 rows
>> took 25,908 ms.
>>
>> The query was simply Select decrypt(phn_phone_enc) FROM phn WHERE
>> decrypt(phn_phone_enc,’xxx’,’xxx’) LIKE ‘123%’
>>
>> So I built index like: CREATE INDEX idx_phn_phone_dec ON phn
>> (decrypt(phn_phone_enc, ‘xxx’, ‘xxx’))
>>
>> This returns 12 rows in 68 ms.
>>
>> Would this be the solution for the fast encrypted field search or does
>> this raise the security issue?
>>
>
> very ugly for all the reasons given by the others.
>
> if all your LIKE searches are specifically for the first 3 chars of the
> phn_phone field, I dunno, but maybe you could have a seperate field (and
> index) which consists of just the first three characters encrypted.
>
> this would, however, allow people with access to the database but without
> the cipher key to group your data by this prefixm, and potentially the
> knowlege of that grouping could be used as an aid in cracking the keys.

I'm not good enough with encryption to know for sure, but I suspect
that doing this would give a _determined_ attacker enough information
to break into your data (just for starters, you reduce the pool of
numbers that would have to be guessed by a few orders of magnitude,
and you give away geographical location).  By determined I mean
someone with a complete copy of your database, time, cpu power, and a
bad attitude :-).  I'm not saying not to do it, but the risks should
be known.  If you are just looking to block casual access to the data,
it's probably ok.  Even a single digit would reduce the brute force
search by 90% if the numbers distribute well.

That said, I'm betting you can't really budget 25 seconds of cpu time
for a common lookup search.  So you either have to compromise security
or re-think the way you are doing lookups...

merlin

merlin

pgsql-general by date:

Previous
From: Johan Nel
Date:
Subject: Re: Experience with many schemas vs many databases
Next
From: "Naoko Reeves"
Date:
Subject: Re: Fast Search on Encrypted Feild