Re: Anonymized database dumps - Mailing list pgsql-general

From Kiriakos Georgiou
Subject Re: Anonymized database dumps
Date
Msg-id B573707D-4223-4CFB-9172-935E1322475D@olympiakos.com
In response to Re: Anonymized database dumps  (Bill Moran <wmoran@potentialtech.com>)
List pgsql-general
On Mar 19, 2012, at 5:55 PM, Bill Moran wrote:

>
>> Sensitive data should be stored encrypted to begin with.  For test
>> databases, you or your developers can invoke a process that replaces
>> the real encrypted data with fake encrypted data (for which everybody
>> has the key/password).  Or, if the overhead is too much (i.e. billions
>> of rows), you can have different decrypt() routines on your test
>> databases that return fake data without touching the real encrypted
>> columns.
>
> The thing is, this process has the same potential data spillage
> issues as sanitizing the data.


Not really.  In the approach I describe, the sensitive data is always
encrypted in the database and is "useless" on its own, because nobody
has the private key or knows the password that protects it other than
the ops subsystems that require access.
So even if you take an ops dump, load it onto a test box, and walk away,
you are fine.  If your developers/testers want to play with the data,
they are forced either to overwrite it and "stage" test encrypted data
they can decrypt, or to call a "fake" decrypt() that gives them test
data (e.g. one that joins to a test data table).
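
To make the "fake" decrypt() idea concrete, here is a minimal sketch in
Python.  Everything in it is an assumption for illustration: an in-memory
SQLite database stands in for the test PostgreSQL box, and the table
customer, the column ssn_enc, the value pool FAKE_SSNS, and the function
fake_decrypt are hypothetical names.  Instead of joining to a test data
table, this variant hashes the ciphertext to pick a stable fake value,
which has the same effect of never needing the real key.

```python
import hashlib
import sqlite3

# Hypothetical pool of obviously fake values that testers are allowed to see.
FAKE_SSNS = ["000-00-0001", "000-00-0002", "000-00-0003"]


def fake_decrypt(ciphertext: bytes) -> str:
    """Test-only stand-in for decrypt(): uses no key and recovers no plaintext.

    The ciphertext is only hashed to choose a value from FAKE_SSNS, so the
    same encrypted column value always maps to the same fake datum.
    """
    digest = hashlib.sha256(ciphertext).digest()
    index = int.from_bytes(digest[:4], "big") % len(FAKE_SSNS)
    return FAKE_SSNS[index]


def load_test_db() -> sqlite3.Connection:
    """Build a toy test database whose decrypt() is the fake routine."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, ssn_enc BLOB)")
    # Opaque bytes standing in for real ciphertext loaded from an ops dump.
    conn.executemany(
        "INSERT INTO customer (id, ssn_enc) VALUES (?, ?)",
        [(1, b"\x8f\x02\xaa"), (2, b"\x13\x37\x00")],
    )
    # Register the fake routine under the name the application already calls.
    conn.create_function("decrypt", 1, fake_decrypt)
    return conn


if __name__ == "__main__":
    conn = load_test_db()
    for row in conn.execute("SELECT id, decrypt(ssn_enc) FROM customer ORDER BY id"):
        print(row)
```

Application queries keep calling decrypt() unchanged; only the routine
installed on the test box differs, and the encrypted column itself is
never rewritten.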

Kiriakos
