Re: PRIVATE columns - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: PRIVATE columns
Date
Msg-id 50C95388.7000107@Yahoo.com
In response to Re: PRIVATE columns  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 12/12/2012 3:12 PM, Simon Riggs wrote:
> On 12 December 2012 19:13, Jan Wieck <JanWieck@yahoo.com> wrote:
>> On 12/12/2012 1:12 PM, Simon Riggs wrote:
>>>
>>> Currently, ANALYZE collects data on all columns and stores these
>>> samples in pg_statistic where they can be seen via the view pg_stats.
>>>
>>> In some cases we have data that is private and we do not wish others
>>> to see it, such as patient names. This becomes more important when we
>>> have row security.
>>>
>>> Perhaps that data can be protected, but it would be even better if we
>>> simply didn't store value-revealing statistic data at all. Such
>>> private data is seldom the target of searches, or if it is, it is
>>> mostly evenly distributed anyway.
>>
>>
>> Would protecting it the same way we protect the passwords in pg_authid be
>> sufficient?
>
> The user backend does need to be able to access the stats data during
> optimization. It's hard to have data accessible and yet impose limits
> on the uses to which that can be put. If we have row security on the
> table but no equivalent capability on the stats, then we'll have
> leakage. e.g. set statistics 10000, ANALYZE, then leak 10000 credit
> card numbers.

As with the encrypted password column of pg_authid, I don't see any
reason why the values in the stats columns need to be readable by
anyone but a superuser. Do you?


Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


