Thread: postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.

Hi,

 

One of my co-workers came out of a NIST cyber-security type meeting today and asked me to delve into postgres and zeroization.

 

I am casually aware of MVCC issues and vacuuming.

 

I believe the concern, based on my current understanding of postgres inner workings, is this: when a dead tuple is reclaimed by vacuuming, is that reclaimed space initialized in some fashion that would shred any sensitive data that was formerly there, so that it cannot be inspected by the subsequent owner of that disk page? (zeroization)

 

Not sure that is the exact question to ask, but hopefully you get a feel for the requirement: do not leave any sensitive data lying about for recovery by a hacker, or at least minimize the places it could be obtained without actually being able to log into postgres or having raw disk access privileges.

 

Thanks for any comments/instruction/links on the matter.

 

 

Regards

 

 

Dave Day


Re: postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.

From: "David G. Johnston"
On Wed, Nov 18, 2015 at 12:45 PM, Day, David <dday@redcom.com> wrote:

Hi,

 

One of my co-workers came out of a NIST cyber-security type meeting today and asked me to delve into postgres and zeroization.

 

I am casually aware of mvcc issues and vacuuming

 

I believe the concern, based on my current understanding of postgres inner workings, is this: when a dead tuple is reclaimed by vacuuming, is that reclaimed space initialized in some fashion that would shred any sensitive data that was formerly there, so that it cannot be inspected by the subsequent owner of that disk page? (zeroization)

 

Not sure that is the exact question to ask, but hopefully you get a feel for the requirement: do not leave any sensitive data lying about for recovery by a hacker, or at least minimize the places it could be obtained without actually being able to log into postgres or having raw disk access privileges.

 

Thanks for any comments/instruction/links on the matter.



"""
Plain VACUUM (without FULL) simply reclaims space and makes it available for re-use. This form of the command can operate in parallel with normal reading and writing of the table, as an exclusive lock is not obtained. However, extra space is not returned to the operating system (in most cases); it's just kept available for re-use within the same table. VACUUM FULL rewrites the entire contents of the table into a new disk file with no extra space, allowing unused space to be returned to the operating system. This form is much slower and requires an exclusive lock on each table while it is being processed.
"""

So:

1) Does VACUUM FULL perform any post-rewrite action to obliterate previous disk file?
2) Does the ready to be reused space get initialized to "zeros" during a normal VACUUM or do the previous tuple contents exist there until they are next overwritten?

Unfortunately I do not know the answers and don't wish to hazard a guess.

I'm not certain what mechanics you envision that would allow one to access this dead space without having raw disk access privileges.  In the case of VACUUM FULL PostgreSQL gives back control of the relevant file to the O/S and supposedly cannot regain access to it in any reliable (i.e., interpretable as PostgreSQL data) sense.

David J.



David G. Johnston wrote:
> On Wed, Nov 18, 2015 at 12:45 PM, Day, David <dday@redcom.com> wrote:

> > I believe the   concern,  based on my current understanding  of postgres
> > inner workings,  is  that when a dead tuple is reclaimed by vacuuming:  Is
> > that reclaimed space initialized in some fashion that would  shred any
> > sensitive data that was formerly there to any  inspection by  the
> > subsequent owner of  that disk page ? ( zeroization )

No.  Ultimately, space occupied by dead tuples is "freed" in
PageRepairFragmentation(), src/backend/storage/page/bufpage.c;
the contents of the tuples are shuffled to "defragment" the free space,
but the free space is not zeroed.  You could certainly try to read the
unused page and extract some data from there.
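
To illustrate the point about free space not being zeroed, here is a
minimal Python sketch of a toy slotted page. It is not PostgreSQL code:
ToyPage, its 64-byte page size, and the zero_free option are all
assumptions invented for the illustration, and the real
PageRepairFragmentation() is considerably more involved. The sketch only
shows the general effect: live tuples are packed together, and whatever
bytes were in the now-free region stay there unless something explicitly
overwrites them.

```python
PAGE_SIZE = 64  # hypothetical; real PostgreSQL pages are typically 8 kB


class ToyPage:
    """Toy slotted page: tuple data grows downward from the end of the page."""

    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.upper = PAGE_SIZE   # lowest offset currently used by tuple data
        self.slots = []          # one [offset, length, live] entry per tuple

    def insert(self, payload: bytes) -> int:
        if len(payload) > self.upper:
            raise ValueError("page full")
        self.upper -= len(payload)
        self.data[self.upper:self.upper + len(payload)] = payload
        self.slots.append([self.upper, len(payload), True])
        return len(self.slots) - 1

    def delete(self, slot: int) -> None:
        # Deleting only marks the tuple dead; its bytes are left in place.
        self.slots[slot][2] = False

    def compact(self, zero_free: bool = False) -> None:
        """Pack live tuples against the end of the page ("defragment")."""
        # Copy the live payloads out first so overlapping moves are safe.
        live = [(s, bytes(self.data[s[0]:s[0] + s[1]]))
                for s in self.slots if s[2]]
        upper = PAGE_SIZE
        for slot, payload in live:
            upper -= len(payload)
            self.data[upper:upper + len(payload)] = payload
            slot[0] = upper
        self.upper = upper
        if zero_free:
            # The optional scrubbing step that the thread is asking about.
            self.data[:upper] = bytes(upper)

    def free_space(self) -> bytes:
        return bytes(self.data[:self.upper])


page = ToyPage()
page.insert(b"alice")
page.insert(b"bob")
secret_slot = page.insert(b"SSN=123-45-6789")

page.delete(secret_slot)
page.compact()                       # space is reclaimed but not zeroed
leaked = b"SSN=123-45-6789" in page.free_space()

page.compact(zero_free=True)         # same compaction, plus scrubbing
scrubbed = page.free_space()
```

In this toy layout the deleted value is still readable in the free region
after the first compact() call, and is gone only after the zeroing
variant runs.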

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Which begs the question, what is more important, the old/vacuumed data, or the current valid data?
If someone can hack into the freed data, then they certainly have the ability to hack into the current valid data.
So ultimately, the best thing to do is to secure the system from being hacked, not zero out old data.
AFAIK, the only time you need to zero out the bytes is when you are decommissioning the disk, in which case ALL data on the disk needs to be wiped.

On Wed, Nov 18, 2015 at 3:13 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
David G. Johnston wrote:
> On Wed, Nov 18, 2015 at 12:45 PM, Day, David <dday@redcom.com> wrote:

> > I believe the   concern,  based on my current understanding  of postgres
> > inner workings,  is  that when a dead tuple is reclaimed by vacuuming:  Is
> > that reclaimed space initialized in some fashion that would  shred any
> > sensitive data that was formerly there to any  inspection by  the
> > subsequent owner of  that disk page ? ( zeroization )

No.  Ultimately, space occupied by dead tuples is "freed" in
PageRepairFragmentation(), src/backend/storage/page/bufpage.c;
the contents of the tuples are shuffled to "defragment" the free space,
but the free space is not zeroed.  You could certainly try to read the
unused page and extract some data from there.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> David G. Johnston wrote:
>> On Wed, Nov 18, 2015 at 12:45 PM, Day, David <dday@redcom.com> wrote:
>>> I believe the   concern,  based on my current understanding  of postgres
>>> inner workings,  is  that when a dead tuple is reclaimed by vacuuming:  Is
>>> that reclaimed space initialized in some fashion that would  shred any
>>> sensitive data that was formerly there to any  inspection by  the
>>> subsequent owner of  that disk page ? ( zeroization )

> No.  Ultimately, space occupied by dead tuples is "freed" in
> PageRepairFragmentation(), src/backend/storage/page/bufpage.c;
> the contents of the tuples are shuffled to "defragment" the free space,
> but the free space is not zeroed.  You could certainly try to read the
> unused page and extract some data from there.

It's quite unclear to me what threat model such a behavior would add
useful protection against.

            regards, tom lane


On 11/18/2015 11:45 AM, Day, David wrote:
> I believe the   concern,  based on my current understanding  of
> postgres inner workings,  is  that when a dead tuple is reclaimed by
> vacuuming:  Is that reclaimed space initialized in some fashion that
> would  shred any sensitive data that was formerly there to any
>  inspection by  the subsequent owner of  that disk page ? ( zeroization )

the postgres server owns the pages.   AFAIK, the only way to read raw
pages is if you can impersonate the server and directly access the raw
files, or if you have postgres superuser privileges and use the
pg_read_binary_file() functions.     no 'normal' client app will be able
to see raw pages, or data that's not a valid part of a table that client
has permissions to read.


--
john r pierce, recycling bits in santa cruz



On 11/18/2015 11:45 AM, Day, David wrote:
> Hi,
>
> One of my co-workers came out of a NIST cyber-security type meeting
> today and asked me to delve into postgres and zeroization.
>
> I am casually aware of mvcc issues and vacuuming
>
> I believe the   concern,  based on my current understanding  of postgres
> inner workings,  is  that when a dead tuple is reclaimed by vacuuming:
>   Is that reclaimed space initialized in some fashion that would  shred
> any sensitive data that was formerly there to any  inspection by  the
> subsequent owner of  that disk page ? ( zeroization )
>
> Not sure that is the exact question to ask but hopefully you get a feel
> for the requirement is  not to  leave any sensitive data laying about for
>
> recovery by a hacker,  or at least minimize the places it could be
> obtained without actually being able to log into postgres or having raw
> disk access privileges.

Per Melvin's post, what makes the old pages any more valuable for hacking
than the current pages?

>
> Thanks for any comments/instruction/links on the matter.
>
> Regards
>
> Dave Day
>


--
Adrian Klaver
adrian.klaver@aklaver.com


On 11/18/2015 11:45 AM, Day, David wrote:
> Hi,
>
> One of my co-workers came out of a NIST cyber-security type meeting
> today and asked me to delve into postgres and zeroization.
>
> I am casually aware of mvcc issues and vacuuming
>
> I believe the   concern,  based on my current understanding  of postgres
> inner workings,  is  that when a dead tuple is reclaimed by vacuuming:
>   Is that reclaimed space initialized in some fashion that would  shred
> any sensitive data that was formerly there to any  inspection by  the
> subsequent owner of  that disk page ? ( zeroization )

Got to thinking, are you talking about a physical machine or a
VM/container on shared hosting? If the latter then it is a more generic
problem of detritus left behind between creations of virtual instances
or cross talk on shared storage.

>
> Not sure that is the exact question to ask but hopefully you get a feel
> for the requirement is  not to  leave any sensitive data laying about for
>
> recovery by a hacker,  or at least minimize the places it could be
> obtained without actually being able to log into postgres or having raw
> disk access privileges.
>
> Thanks for any comments/instruction/links on the matter.
>
> Regards
>
> Dave Day
>


--
Adrian Klaver
adrian.klaver@aklaver.com


-----Original Message-----
From: Adrian Klaver [mailto:adrian.klaver@aklaver.com]
Sent: Wednesday, November 18, 2015 3:47 PM
To: Day, David; pgsql-general@postgresql.org
Subject: Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.

On 11/18/2015 11:45 AM, Day, David wrote:
> Hi,
>
> One of my co-workers came out of a NIST cyber-security type meeting
> today and asked me to delve into postgres and zeroization.
>
> I am casually aware of mvcc issues and vacuuming
>
> I believe the   concern,  based on my current understanding  of postgres
> inner workings,  is  that when a dead tuple is reclaimed by vacuuming:
>   Is that reclaimed space initialized in some fashion that would
> shred any sensitive data that was formerly there to any  inspection by
> the subsequent owner of  that disk page ? ( zeroization )

Got to thinking, are you talking about a physical machine or a VM/container on shared hosting? If the latter then it is
a more generic problem of detritus left behind between creations of virtual instances or cross talk on shared storage.

>
> Not sure that is the exact question to ask but hopefully you get a
> feel for the requirement is  not to  leave any sensitive data laying
> about for
>
> recovery by a hacker,  or at least minimize the places it could be
> obtained without actually being able to log into postgres or having
> raw disk access privileges.
>
> Thanks for any comments/instruction/links on the matter.
>
> Regards
>
> Dave Day
>


--
Adrian Klaver
adrian.klaver@aklaver.com

In some instances this would be a VM instance on a hosted machine; in other cases, an actual physical machine.

Thank you all for the feedback.


All good points.  I am not sure what the manner of attack/hack is until I get some further feedback out of the meeting
participants. I suspect it would target the blocks/pages released by postgres following a VACUUM FULL.
How you would determine which pages/blocks those were I am not sure, but I suspect there is probably a way.
When I get some more detail on the standard and the exact requirement, I will repost with that info.


Again thanks



Dave Day


On 11/18/2015 12:57 PM, Day, David wrote:
>
> -----Original Message-----
> From: Adrian Klaver [mailto:adrian.klaver@aklaver.com]
> Sent: Wednesday, November 18, 2015 3:47 PM
> To: Day, David; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.
>
> On 11/18/2015 11:45 AM, Day, David wrote:
>> Hi,
>>
>> One of my co-workers came out of a NIST cyber-security type meeting
>> today and asked me to delve into postgres and zeroization.
>>
>> I am casually aware of mvcc issues and vacuuming
>>
>> I believe the   concern,  based on my current understanding  of postgres
>> inner workings,  is  that when a dead tuple is reclaimed by vacuuming:
>>    Is that reclaimed space initialized in some fashion that would
>> shred any sensitive data that was formerly there to any  inspection by
>> the subsequent owner of  that disk page ? ( zeroization )
>
> Got to thinking, are you talking about a physical machine or a VM/container on shared hosting? If the latter then it
> is a more generic problem of detritus left behind between creations of virtual instances or cross talk on shared
> storage.
>
>>
>> Not sure that is the exact question to ask but hopefully you get a
>> feel for the requirement is  not to  leave any sensitive data laying
>> about for
>>
>> recovery by a hacker,  or at least minimize the places it could be
>> obtained without actually being able to log into postgres or having
>> raw disk access privileges.
>>
>> Thanks for any comments/instruction/links on the matter.
>>
>> Regards
>>
>> Dave Day
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
> In some instances this would be a vm instance on a hosted machine in other cases a actual physical machine.
>
> Thank you all for the feedback.
>
>
> All good points.  I am not sure what the manner of attack/hack is until I get some further feedback out of the
> meeting participants.  I suspect it would be to the blocks pages released by postgres following a vacuum full.
> How you determine what those pages blocks were I am not sure but suspect there is probably a way.
> When I get some more detail on the standard and exact requirement I will repost with that info.

Yes, a detailed problem description would be helpful.

>
>
> Again thanks
>
>
>
> Dave Day
>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


On Wed, Nov 18, 2015 at 03:22:44PM -0500, Tom Lane wrote:
> It's quite unclear to me what threat model such a behavior would add
> useful protection against.

If you had some sort of high-security database and deleted some data
from it, it's important for the threat modeller to know whether the
data is gone-as-in-overwritten or gone-as-in-marked-free.  This is the
same reason they want to know whether a deleted file is actually just
unlinked on the disk.

This doesn't mean one thing is better than another; just that, if
you're trying to understand what data could possibly be exfiltrated,
you need to know the state of all of it.

For realistic cases, I expect that deleted data is usually more
important than updated data.  But a threat modeller needs to
understand all these variables anyway.

A

--
Andrew Sullivan
ajs@crankycanuck.ca


On 11/18/2015 01:34 PM, Andrew Sullivan wrote:
> On Wed, Nov 18, 2015 at 03:22:44PM -0500, Tom Lane wrote:
>> It's quite unclear to me what threat model such a behavior would add
>> useful protection against.
>
> If you had some sort of high-security database and deleted some data
> from it, it's important for the threat modeller to know whether the
> data is gone-as-in-overwritten or gone-as-in-marked-free.  This is the
> same reason they want to know whether a deleted file is actually just
> unlinked on the disk.
>
> This doesn't mean one thing is better than another; just that, if
> you're trying to understand what data could possibly be exfiltrated,
> you need to know the state of all of it.
>
> For realistic cases, I expect that deleted data is usually more
> important than updated data.  But a threat modeller needs to
> understand all these variables anyway.

Alright, I was following you up to this. Seems to me deleted data would
represent stale/old data and would be less valuable.
>
> A
>


--
Adrian Klaver
adrian.klaver@aklaver.com




On Wed, Nov 18, 2015 at 4:38 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

Alright, I was following you up to this. Seems to me deleted data would represent stale/old data and would be less valuable.


It may depend on WHY the data was deleted. If it represented, say, Hillary Clinton's deleted email, recovering that data might be more valuable to some people than the data that was not deleted.
--
Mike Nolan


I'm still trying to understand why you think someone can access old data but not current/live data.
If you encrypt the live data, wouldn't that solve both concerns?

On Wed, Nov 18, 2015 at 4:38 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 11/18/2015 01:34 PM, Andrew Sullivan wrote:
On Wed, Nov 18, 2015 at 03:22:44PM -0500, Tom Lane wrote:
It's quite unclear to me what threat model such a behavior would add
useful protection against.

If you had some sort of high-security database and deleted some data
from it, it's important for the threat modeller to know whether the
data is gone-as-in-overwritten or gone-as-in-marked-free.  This is the
same reason they want to know whether a deleted file is actually just
unlinked on the disk.

This doesn't mean one thing is better than another; just that, if
you're trying to understand what data could possibly be exfiltrated,
you need to know the state of all of it.

For realistic cases, I expect that deleted data is usually more
important than updated data.  But a threat modeller needs to
understand all these variables anyway.

Alright, I was following you up to this. Seems to me deleted data would represent stale/old data and would be less valuable.

A



--
Adrian Klaver
adrian.klaver@aklaver.com



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

On Wed, Nov 18, 2015 at 3:38 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 11/18/2015 01:34 PM, Andrew Sullivan wrote:
On Wed, Nov 18, 2015 at 03:22:44PM -0500, Tom Lane wrote:
It's quite unclear to me what threat model such a behavior would add
useful protection against.

If you had some sort of high-security database and deleted some data
from it, it's important for the threat modeller to know whether the
data is gone-as-in-overwritten or gone-as-in-marked-free.  This is the
same reason they want to know whether a deleted file is actually just
unlinked on the disk.

This doesn't mean one thing is better than another; just that, if
you're trying to understand what data could possibly be exfiltrated,
you need to know the state of all of it.

For realistic cases, I expect that deleted data is usually more
important than updated data.  But a threat modeller needs to
understand all these variables anyway.

Alright, I was following you up to this. Seems to me deleted data would represent stale/old data and would be less valuable.

Not necessarily. Think PHI or HIPAA information which was "erased" because you lost a customer. Or just something as "simple" as a name, address, and credit card number for someone. It's still important and useful to thieves if it is "erased". I can see a smaller company using PG for accounting and billing information. But it really should be encrypted. I often wonder how many "small" businesses actually do that. I am truly ignorant on that point.

That's not even getting into government information that might be of interest to others such as the FSB or even Wikileaks (regardless of one's opinion of them). Of course, I don't really know if any government or other "high security" industry is actually using PG for secure information.


--
Adrian Klaver
adrian.klaver@aklaver.com


--

Schrodinger's backup: The condition of any backup is unknown until a restore is attempted.

Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown


On Wed, Nov 18, 2015 at 01:38:47PM -0800, Adrian Klaver wrote:
> Alright, I was following you up to this. Seems to me deleted data would
> represent stale/old data and would be less valuable.

If the data that was deleted is sensitive, then the fact that you
deleted it but that it didn't actually go away means you can be lulled
into complacency about your vulnerability with respect to that data in
a way that you're unlikely to be in respect of data you still have
(only with new values).  Lots of people forget about deleted data once
it's deleted.

Keep in mind that sometimes people delete data from a system because
it's been archived somewhere else or something like that -- not all
databases have the totality of all the relevant data in them, but can
often represent just "current" data.

Best regards,

A

--
Andrew Sullivan
ajs@crankycanuck.ca


On Wed, Nov 18, 2015 at 04:46:11PM -0500, Melvin Davidson wrote:
> I'm still trying to understand why you think someone can access old data but
> not current/live data.

I don't.  It's just another risk.  When you're making a list of risks,
you need to list them all.  It turns out that in Postgres, you have to
worry about (1) data that's currently in the database and (2) some
data that used to be there but isn't now.

> If you encrypt the live data, wouldn't that solve both concerns?

I have no idea, because I don't know what the theoretical risk to be
mitigated is.  It might, sure.  The security profiler would still need
to make a list of this fact and then ask how countermeasures mitigate
it.

Best regards,

A

--
Andrew Sullivan
ajs@crankycanuck.ca


On 11/18/2015 01:46 PM, Michael Nolan wrote:
>
>
> On Wed, Nov 18, 2015 at 4:38 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>
>     Alright, I was following you up to this. Seems to me deleted data
>     would represent stale/old data and would be less valuable.
>
>
>
> It may depend on WHY the data was deleted. If it represented, say,
> Hillary Clinton's deleted email, recovering that data might be more
> valuable to some people than the data that was not deleted.

Aah yes, did not have my devious mode turned on previously. I can see
old data being more important now.

> --
> Mike Nolan


--
Adrian Klaver
adrian.klaver@aklaver.com


On 11/18/2015 01:49 PM, John McKown wrote:
> On Wed, Nov 18, 2015 at 3:38 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>>wrote:
>
>     On 11/18/2015 01:34 PM, Andrew Sullivan wrote:
>
>         On Wed, Nov 18, 2015 at 03:22:44PM -0500, Tom Lane wrote:
>
>             It's quite unclear to me what threat model such a behavior
>             would add
>             useful protection against.
>
>
>         If you had some sort of high-security database and deleted some data
>         from it, it's important for the threat modeller to know whether the
>         data is gone-as-in-overwritten or gone-as-in-marked-free.  This
>         is the
>         same reason they want to know whether a deleted file is actually
>         just
>         unlinked on the disk.
>
>         This doesn't mean one thing is better than another; just that, if
>         you're trying to understand what data could possibly be exfiltrated,
>         you need to know the state of all of it.
>
>         For realistic cases, I expect that deleted data is usually more
>         important than updated data.  But a threat modeller needs to
>         understand all these variables anyway.
>
>
>     Alright, I was following you up to this. Seems to me deleted data
>     would represent stale/old data and would be less valuable.
>
>
> ​Not necessarily. Think PHI or HIPAA information which was "erased"
> because you lost a customer. ​Or just something as "simple" as a name,
> address, and credit card number for someone. It's still important and
> useful to thieves if it is "erase". I can see a smaller company using PG
> for accounting and billing information. But it really should be
> encrypted. I often wonder how many "small" businesses actually do that.
> I a truly ignorant on that point.

Well, from the large-scale leaks that have been reported, large
companies/organizations are not doing it either. I have credit watch on
my accounts courtesy of my health insurer (Premera), as they did not
protect my information.

>
> That's not even getting into government information that might be of
> interest to others such as the FSB or even Wikileaks (regardless of
> one's opinion them). Of course, I don't really know if any government or
> other "high security" industry is actually using PG for secure information.
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>
> --
>
> Schrodinger's backup: The condition of any backup is unknown until a
> restore is attempted.
>
> Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.
>
> He's about as useful as a wax frying pan.
>
> 10 to the 12th power microphones = 1 Megaphone
>
> Maranatha! <><
> John McKown


--
Adrian Klaver
adrian.klaver@aklaver.com


On 11/18/2015 01:51 PM, Andrew Sullivan wrote:
> On Wed, Nov 18, 2015 at 01:38:47PM -0800, Adrian Klaver wrote:
>> Alright, I was following you up to this. Seems to me deleted data would
>> represent stale/old data and would be less valuable.
>
> If the data that was deleted is sensitive, then the fact that you
> deleted it but that it didn't actually go away means you can be lulled
> into complacency about your vulnerability with respect to that data in
> a way that you're unlikely to be in respect of data you still have
> (only with new values).  Lots of people forget about deleted data once
> it's deleted.

Yet another avenue I failed to see. Interesting discussion.

>
> Keep in mind that sometimes people delete data from a system because
> it's been archived somewhere else or something like that -- not all
> databases have the totality of all the relevant data in them, but can
> often represent just "current" data.
>
> Best regards,
>
> A
>


--
Adrian Klaver
adrian.klaver@aklaver.com



-----Original Message-----
From: Adrian Klaver [mailto:adrian.klaver@aklaver.com]
Sent: Wednesday, November 18, 2015 4:05 PM
To: Day, David; pgsql-general@postgresql.org
Subject: Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.

On 11/18/2015 12:57 PM, Day, David wrote:
>
> -----Original Message-----
> From: Adrian Klaver [mailto:adrian.klaver@aklaver.com]
> Sent: Wednesday, November 18, 2015 3:47 PM
> To: Day, David; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.
>
> On 11/18/2015 11:45 AM, Day, David wrote:
>> Hi,
>>
>> One of my co-workers came out of a NIST cyber-security type meeting
>> today and asked me to delve into postgres and zeroization.
>>
>> I am casually aware of mvcc issues and vacuuming
>>
>> I believe the   concern,  based on my current understanding  of postgres
>> inner workings,  is  that when a dead tuple is reclaimed by vacuuming:
>>    Is that reclaimed space initialized in some fashion that would
>> shred any sensitive data that was formerly there to any  inspection
>> by the subsequent owner of  that disk page ? ( zeroization )
>
> Got to thinking, are you talking about a physical machine or a VM/container on shared hosting? If the latter then it
> is a more generic problem of detritus left behind between creations of virtual instances or cross talk on shared
> storage.
>
>>
>> Not sure that is the exact question to ask but hopefully you get a
>> feel for the requirement is  not to  leave any sensitive data laying
>> about for
>>
>> recovery by a hacker,  or at least minimize the places it could be
>> obtained without actually being able to log into postgres or having
>> raw disk access privileges.
>>
>> Thanks for any comments/instruction/links on the matter.
>>
>> Regards
>>
>> Dave Day
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
> In some instances this would be a vm instance on a hosted machine in other cases a actual physical machine.
>
> Thank you all for the feedback.
>
>
> All good points.  I am not sure what the manner of attack/hack is until I get some further feedback out of the
> meeting participants.  I suspect it would be to the blocks pages released by postgres following a vacuum full.
> How you determine what those pages blocks were I am not sure but suspect there is probably a way.
> When I get some more detail on the standard and exact requirement I will repost with that info.

Yes, a detailed problem description would be helpful.

>
>
> Again thanks
>
>
>
> Dave Day
>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Good Day,

The specification that was under discussion was the "collaborative Protection Profile for Network Devices":
https://www.niap-ccevs.org/pp/cpp_nd_v1.0.pdf   Section 5.4.1.3, FCS_CKM.4 Cryptographic Key Destruction.


The interpreted concern with respect to postgres includes its released non-volatile and volatile memory.

In the non-volatile case I believe this would amount to the VACUUM FULL scenario as commented on by David Johnston and
Melvin Davidson.  I have not looked at Alvaro Herrera's source reference, bufpage.c, to see if that could be customized in the
short term to sanitize the page being returned.

In the volatile memory case it would seem to include any memory used by postgres that might have included sensitive
data and is at some point freed, and is therefore available in the general pool for allocation and subject to inspection
by the next user.
Perhaps there is a common routine in the source that could perform such a job?
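
To make the volatile-memory case concrete, here is a minimal sketch in
Python of what "zeroize before release" means for a single buffer. It is
not PostgreSQL code, and sensitive_buffer is a hypothetical helper named
only for this illustration. It also covers only the one mutable buffer it
manages: copies made elsewhere (including the literal used to fill the
buffer below) are not touched. In C the same idea is usually expressed
with functions such as explicit_bzero or memset_s where they are
available, since a plain memset before free may be optimized away.

```python
from contextlib import contextmanager


@contextmanager
def sensitive_buffer(size: int):
    """Yield a mutable buffer and overwrite it with zeros on exit."""
    buf = bytearray(size)
    try:
        yield buf
    finally:
        # Overwrite each byte in place before the buffer is released,
        # even if the body of the with-block raised an exception.
        for i in range(len(buf)):
            buf[i] = 0


with sensitive_buffer(16) as key:
    key[:6] = b"hunter"
    alias = key   # keep a reference so the buffer can be inspected afterwards

wiped = all(b == 0 for b in alias)
```

After the with-block ends, the buffer that held the value contains only
zero bytes, so a later reader of that same memory would find nothing.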

We are still digesting the specification and may find allowances given other security measures.

Appreciate everyone's feedback.  This is perhaps a matter that can feed into future OS (FreeBSD) and/or Postgres
development.


Regards


Dave Day


On Wed, Nov 18, 2015 at 3:49 PM, John McKown
<john.archie.mckown@gmail.com> wrote:
> Not necessarily. Think PHI or HIPAA information which was "erased" because
> you lost a customer. Or just something as "simple" as a name, address, and
> credit card number for someone. It's still important and useful to thieves
> if it is "erase". I can see a smaller company using PG for accounting and
> billing information. But it really should be encrypted. I often wonder how
> many "small" businesses actually do that. I a truly ignorant on that point.
>
> That's not even getting into government information that might be of
> interest to others such as the FSB or even Wikileaks (regardless of one's
> opinion them). Of course, I don't really know if any government or other
> "high security" industry is actually using PG for secure information.

It's quite a stretch to assume that HIPAA applies to internal garbage
collection minutiae.  If you believe that, then you'd have to apply it to
the filesystem physical media as well, including swap.   Meaning, each
time you delete a customer record, you'd have to back up and restore
the database after zeroing out the file system.  So, basically, uh,
no.

A much better way to look at compliance is to encrypt all sensitive
details and, when the customer relationship is gone, delete the key.
This puts the responsibility for information security (if taken to
that extreme) back into the application which is where it belongs.

merlin


On 11/19/2015 07:01 AM, Day, David wrote:
>
>
> -----Original Message-----
> From: Adrian Klaver [mailto:adrian.klaver@aklaver.com]
> Sent: Wednesday, November 18, 2015 4:05 PM
> To: Day, David; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.
>
> On 11/18/2015 12:57 PM, Day, David wrote:
>>
>> -----Original Message-----
>> From: Adrian Klaver [mailto:adrian.klaver@aklaver.com]
>> Sent: Wednesday, November 18, 2015 3:47 PM
>> To: Day, David; pgsql-general@postgresql.org
>> Subject: Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.
>>
>> On 11/18/2015 11:45 AM, Day, David wrote:
>>> Hi,
>>>
>>> One of my co-workers came out of a NIST cyber-security type meeting
>>> today and asked me to delve into postgres and zeroization.
>>>
>>> I am casually aware of mvcc issues and vacuuming
>>>
>>> I believe the   concern,  based on my current understanding  of postgres
>>> inner workings,  is  that when a dead tuple is reclaimed by vacuuming:
>>>     Is that reclaimed space initialized in some fashion that would
>>> shred any sensitive data that was formerly there to any  inspection
>>> by the subsequent owner of  that disk page ? ( zeroization )
>>
>> Got to thinking, are you talking about a physical machine or a VM/container on shared hosting? If the latter then it
>> is a more generic problem of detritus left behind between creations of virtual instances or cross talk on shared
>> storage.
>>
>>>
>>> Not sure that is the exact question to ask but hopefully you get a
>>> feel for the requirement is  not to  leave any sensitive data laying
>>> about for
>>>
>>> recovery by a hacker,  or at least minimize the places it could be
>>> obtained without actually being able to log into postgres or having
>>> raw disk access privileges.
>>>
>>> Thanks for any comments/instruction/links on the matter.
>>>
>>> Regards
>>>
>>> Dave Day
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>> In some instances this would be a VM instance on a hosted machine; in other cases an actual physical machine.
>>
>> Thank you all for the feedback.
>>
>>
>> All good points.  I am not sure what the manner of attack/hack is until I get some further feedback out of the
>> meeting participants.  I suspect it would be to the blocks/pages released by postgres following a vacuum full.
>> How you determine what those pages/blocks were I am not sure but suspect there is probably a way.
>> When I get some more detail on the standard and exact requirement I will repost with that info.
>
> Yes, a detailed problem description would be helpful.
>
>>
>>
>> Again thanks
>>
>>
>>
>> Dave Day
>>
>>
>>
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
> Good Day,
>
> The specification that was under discussion was "collaborative Protection Profile for Network Devices"
> https://www.niap-ccevs.org/pp/cpp_nd_v1.0.pdf   Section 5.4.1.3 FCS_CKM.4 Cryptographic Key Destruction.
>
>
> The interpreted concern with respect to Postgres includes its released non-volatile and volatile memory.
>
> In the non-volatile case I believe this would amount to the VACUUM FULL scenario as commented on by David Johnston
> and Melvin Davidson.  I have not looked at Tom Lane's source reference, bufpage.c, to see if that could be customized in
> the short term to sanitize the page being returned.
>
> In the volatile memory case it would seem to include any memory used by postgres that might have included sensitive
> data and is at some point freed and therefore available in the general pool for allocation and subject to inspection
> by the next user.
> Perhaps there is a common routine in the source that could perform such a job ?
>
> We are still digesting the specification and may find allowances given other security measures.

So what are you working on?

The document you link to starts with this:
"
Examples of network devices that are covered by requirements in this cPP
include routers, firewalls, VPN gateways, IDSs, and switches. ..."

So embedded devices. Not sure how prevalent Postgres is in that area.

Also the subsection you refer to seems to be talking only about memory,
not storage which is where VACUUM FULL works. That may be an overly fine
distinction, but one that can be made.


>
> Appreciate everyone's feedback.  This is perhaps a matter that can feed into future OS ( FreeBSD ) and/or Postgres
> development.
>
>
> Regards
>
>
> Dave Day
>
>
>
>
>
>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com



-----Original Message-----
From: Adrian Klaver [mailto:adrian.klaver@aklaver.com]
Sent: Thursday, November 19, 2015 10:32 AM
To: Day, David; pgsql-general@postgresql.org
Subject: Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.


Adrian

Our app/development is a softswitch, (VoIP), and is considered a network appliance.
Postgres in general has been a joy to learn and aside from a hiccup with plperl and FreeBSD (9.8)
that the discussion board helped me resolve some time ago, dependable and problem free.

Dave






On 11/19/2015 07:47 AM, Day, David wrote:
>

>
> So what are you working on?
>
> The document you link to starts with this:
> "
> Examples of network devices that are covered by requirements in this cPP include routers, firewalls, VPN gateways,
> IDSs, and switches. ..."
>
> So embedded devices. Not sure how prevalent Postgres is in that area.
>
> Also the subsection you refer to seems to be talking only about memory, not storage which is where VACUUM FULL works.
> That may be an overly fine distinction, but one that can be made.
>
>
>>
>> Appreciate everyone's feedback.  This is perhaps a matter that can feed into future OS ( FreeBSD ) and/or Postgres
>> development.
>>
>>
>> Regards
>>
>>
>> Dave Day
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com
>
> Adrian
>
> Our app/development is a softswitch, (VoIP), and is considered a network appliance.
> Postgres in general has been a joy to learn and aside from a hiccup with plperl and FreeBSD (9.8)
> that the discussion board helped me resolve some time ago, dependable and problem free.

I scanned the subsection you referred to, and before acronym fatigue set
in, it seems to refer to in memory key handling during device
authentication. Is your Postgres instance doing that?

>
> Dave
>
>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com



-----Original Message-----
From: Adrian Klaver [mailto:adrian.klaver@aklaver.com]
Sent: Thursday, November 19, 2015 11:06 AM
To: Day, David; pgsql-general@postgresql.org
Subject: Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.



Our app is doing the authentication based on the  sensitive information retrieved from postgres tables.
Our app zeros out the memory associated with the process when it is done with it. The developer was concerned about the
breadcrumbs left in postgres volatile memory in satisfying the query.
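
For readers unfamiliar with the application-side zeroing mentioned above, here is a minimal C sketch of the idea. It is not taken from the poster's application, and `secure_zero` is a hypothetical name. It writes through a volatile pointer because an ordinary memset() on a buffer that is about to be released can be dropped by the optimizer as a dead store.

```c
#include <stddef.h>

/*
 * Hypothetical helper: overwrite a buffer with zeros in a way the
 * compiler is not allowed to optimize away, because every store goes
 * through a volatile-qualified pointer.
 */
static void secure_zero(void *buf, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *) buf;

    while (len--)
        *p++ = 0;
}
```

A caller would invoke `secure_zero(key, sizeof key)` immediately before freeing or leaving the scope of the buffer. Where the platform provides it, explicit_bzero() or C11's optional memset_s() serve the same purpose. Note that this only covers the application's own copies; it does nothing for copies held in the server's buffers.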




On 11/19/2015 08:50 AM, Day, David wrote:
>
>
> -----Original Message-----
> From: Adrian Klaver [mailto:adrian.klaver@aklaver.com]
> Sent: Thursday, November 19, 2015 11:06 AM
> To: Day, David; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] postgres zeroization of dead tuples ? i.e scrubbing dead tuples with sensitive data.
>
> On 11/19/2015 07:47 AM, Day, David wrote:
>>
>
>>
>> So what are you working on?
>>
>> The document you link to starts with this:
>> "
>> Examples of network devices that are covered by requirements in this cPP include routers, firewalls, VPN gateways,
>> IDSs, and switches. ..."
>>
>> So embedded devices. Not sure how prevalent Postgres is in that area.
>>
>> Also the subsection you refer to seems to be talking only about memory, not storage which is where VACUUM FULL
>> works. That may be an overly fine distinction, but one that can be made.
>>
>>
>>>
>>> Appreciate everyone's feedback.  This is perhaps a matter that can feed into future OS ( FreeBSD ) and/or Postgres
>>> development.
>>>
>>>
>>> Regards
>>>
>>>
>>> Dave Day
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>>
>> Adrian
>>
>> Our app/development is a softswitch, (VoIP), and is considered a network appliance.
>> Postgres in general has been a joy to learn and aside from a hiccup
>> with plperl and FreeBSD (9.8) that the discussion board helped me resolve some time ago, dependable and problem
>> free.
>
> I scanned the subsection you referred to, and before acronym fatigue set in, it seems to refer to in-memory key
> handling during device authentication. Is your Postgres instance doing that?
>
>>
>> Dave

FYI, I appreciate the bottom posts, just a heads up though that you
probably want to put your reply above my signature line. I had to pull
it up to get my email client to see it on reply.

 >Our app is doing the authentication based on the  sensitive
 >information retrieved from postgres tables.
 >Our app zeros out its associated memory to the process when it is done
 >with it. The developer was concerned about the
 >breadcrumbs left in postgres volatile memory in satisfying the query.


Well VACUUM is not going to help there, it works on the data stored on disk.

Might want to take a look at this page:

http://www.postgresql.org/docs/9.4/static/wal-configuration.html


--
Adrian Klaver
adrian.klaver@aklaver.com


On Thu, Nov 19, 2015 at 09:01:47AM -0600, Merlin Moncure wrote:

> It's quite a stretch to assume that HIPAA applies to internal garbage
> collection minutiae.

It, of course, does.

Which is why applying your suggestion ...

> A much better way to look at compliance is to encrypt all sensitive
> details and, when the customer relationship is gone, delete the key.

... is necessary.

Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


On 11/19/15 1:12 PM, Adrian Klaver wrote:
>  >Our app is doing the authentication based on the  sensitive
>  >information retrieved from postgres tables.
>  >Our app zeros out its associated memory to the process when it is done
>  >with it. The developer was concerned about the
>  >breadcrumbs left in postgress volatile memory in satisfying the query.
>
>
> Well VACUUM is not going to help there, it works on the data stored on
> disk.

Which would help from the standpoint of shared_buffers... for whatever
that's worth.

To answer an earlier comment about zeroing out the free space on the
page, it would be trivial to add that, at least for heap pages. Index
pages not so much, because you'd have to mess with every index type.
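
To make the heap-page idea concrete, here is a rough C sketch. It assumes only what the PostgreSQL page layout documents: the free space on a page is the hole between the header's lower and upper offsets (pd_lower and pd_upper in bufpage.h). The sketch does not use PostgreSQL's headers; `DemoPageHeader`, `DEMO_PAGE_SIZE` and `demo_zero_free_space` are hypothetical stand-ins, and a real patch would have to sit in the server's page-compaction path.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 8192     /* assumed page size for the demo */

/* Simplified stand-in for a page header: only the two offsets that
 * bound the free space are modelled. */
typedef struct DemoPageHeader
{
    uint16_t lower;             /* offset of the start of free space */
    uint16_t upper;             /* offset of the end of free space */
} DemoPageHeader;

/*
 * Zero the hole between lower and upper. Returns the number of bytes
 * cleared, or 0 if the header offsets are inconsistent.
 */
static size_t demo_zero_free_space(unsigned char *page)
{
    DemoPageHeader hdr;

    memcpy(&hdr, page, sizeof hdr);
    if (hdr.lower < sizeof hdr ||
        hdr.upper > DEMO_PAGE_SIZE ||
        hdr.lower > hdr.upper)
        return 0;

    memset(page + hdr.lower, 0, (size_t) (hdr.upper - hdr.lower));
    return (size_t) (hdr.upper - hdr.lower);
}
```

The hole only contains all reclaimed space after the page has been compacted, which is why this would belong right after the step that defragments a page rather than in a standalone utility.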

Also, if you're cranking the paranoia level to maximum, you'd want to
compile Postgres with the option that over-writes freed memory with 0x7f
too. That's meant to help find overruns and other memory access errors,
but would have the side effect of nuking the contents of freed memory.
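
As a rough illustration of the clobber-on-free technique (I believe the build option in question is the CLOBBER_FREED_MEMORY define, which assert-enabled builds turn on), here is a small C sketch. It is not PostgreSQL's allocator code; `demo_wipe` and `demo_free` are hypothetical names, and the caller is assumed to know the allocation size.

```c
#include <stddef.h>
#include <stdlib.h>

#define DEMO_CLOBBER_BYTE 0x7F  /* same fill value mentioned above */

/* Fill a buffer with the clobber byte through a volatile pointer so
 * the stores are not removed as dead writes ahead of free(). */
static void demo_wipe(void *ptr, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *) ptr;

    while (len--)
        *p++ = DEMO_CLOBBER_BYTE;
}

/* Overwrite the allocation, then hand it back to the allocator. */
static void demo_free(void *ptr, size_t len)
{
    if (ptr == NULL)
        return;
    demo_wipe(ptr, len);
    free(ptr);
}
```

Since the option exists as a debugging aid, it should not be read as a guarantee that every copy of a value is scrubbed.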
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com