Thread: how to securely delete the storage freed when a table is dropped?
For a system with information stored in a PostgreSQL 9.5 database, in which data stored in a table that is deleted must be securely deleted (like shred does to files), and where the system is persistent even though any particular table likely won't be (so can't just shred the disks at "completion"), I'm trying to figure out my options for securely deleting the underlying data files when a table is dropped.
As background, I'm not a DBA, but I am an experienced implementor in many languages, contexts, and databases. I've looked online and haven't been able to find a way to ask PostgreSQL to do the equivalent of shredding its underlying files before releasing them to the OS when a table is DROPped. Is there a built-in way to ask PostgreSQL to do this? (I might just not have searched for the right thing - my apologies if I missed something)
A partial answer we're looking at is shredding the underlying data files for a given relation and its indexes manually before dropping the tables, but this isn't so elegant, and I'm not sure it is getting all the information from the tables that we need to delete.
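One way to check whether a manual shred covers every file of a relation is sketched below. It assumes the standard on-disk layout: `pg_relation_filepath()` returns a path relative to the data directory, relations over 1 GB are split into segments named `.1`, `.2`, and so on, and the free space map, visibility map and init forks use the suffixes `_fsm`, `_vm` and `_init`. The function name `relation_files` and the example paths are hypothetical. Each index and any TOAST table has its own file node, so the lookup would have to be repeated for each of them.

```python
import os
import re


def relation_files(data_dir, rel_path):
    """List every on-disk file belonging to one relation file node.

    data_dir: the PostgreSQL data directory (assumed readable by the caller).
    rel_path: the value returned by SELECT pg_relation_filepath('some_table'),
              for example 'base/16384/16385'.
    """
    directory, node = os.path.split(rel_path)
    # Main fork, optional fork suffix, optional segment number.
    pattern = re.compile(r"^%s(_(fsm|vm|init))?(\.\d+)?$" % re.escape(node))
    full_dir = os.path.join(data_dir, directory)
    return sorted(
        os.path.join(full_dir, name)
        for name in os.listdir(full_dir)
        if pattern.match(name)
    )
```

The returned list could then be handed to shred before the DROP. This only finds the relation's own files; it says nothing about copies of the data held elsewhere.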
We also are looking at strategies for shredding free space on our data disk - either running a utility to do that, or periodically replicating the data volume, swapping in the results of the copy, then shredding the entire volume that was the source so its "free" space is securely overwritten in the process.
Are we missing something? Are there other options we haven't found? If we have to clean up manually, are there other places we need to go to shred data than the relation files for a given table, and all its related indexes, in the database's folder? Any help or advice will be greatly appreciated.
Thanks,
Jonathan Morgan
"The man with the new idea is a Crank until the idea succeeds."
- Mark Twain, from 'Following the Equator: A Journey Around the World'
On 04/13/2018 12:48 PM, Jonathan Morgan wrote:
> We also are looking at strategies for shredding free space on our data
> disk - either running a utility to do that, or periodically replicating
> the data volume, swapping in the results of the copy, then shredding the
> entire volume that was the source so its "free" space is securely
> overwritten in the process.

I'd write a program that fills all free space on disk with a specific pattern. You're probably using a logging filesystem, so that'll be far from perfect, though.
-- Angular momentum makes the world go 'round.
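A minimal sketch of such a free-space filler follows, assuming a POSIX-like system. The function name `fill_free_space`, the scratch file name and the `max_bytes` safety limit are illustrative assumptions, not part of any existing tool. It writes the pattern into a scratch file until the filesystem reports it is full (or the limit is reached), syncs, and deletes the file. As noted above, a journaling filesystem may keep copies this does not reach. Filling the volume that holds a running PostgreSQL cluster can also make the server fail when it cannot write, so this is better run on a volume that is not in active use.

```python
import errno
import os


def fill_free_space(directory, pattern=b"\x00", chunk_size=1024 * 1024, max_bytes=None):
    """Overwrite free space by filling a scratch file with `pattern`, then delete it.

    Stops when the filesystem is full, or after `max_bytes` bytes if given.
    Returns the number of bytes written.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    # Build one chunk of exactly chunk_size bytes from the repeating pattern.
    repeats = -(-chunk_size // len(pattern))  # ceiling division
    chunk = (pattern * repeats)[:chunk_size]
    path = os.path.join(directory, "wipe-fill.tmp")  # hypothetical scratch name
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        while max_bytes is None or written < max_bytes:
            block = chunk if max_bytes is None else chunk[: max_bytes - written]
            try:
                n = os.write(fd, block)
            except OSError as exc:
                if exc.errno == errno.ENOSPC:
                    break  # no free blocks left to overwrite
                raise
            if n == 0:
                break
            written += n
        os.fsync(fd)  # push the pattern to the device before deleting
    finally:
        os.close(fd)
        os.remove(path)
    return written
```

Passing `max_bytes` makes it possible to try the routine on a small amount of space first; leaving it as `None` runs until the disk is full.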
> On Apr 13, 2018, at 10:48 AM, Jonathan Morgan <jonathan.morgan.007@gmail.com> wrote:
>
> A partial answer we're looking at is shredding the underlying data files for a given relation and its indexes manually before dropping the tables, but this isn't so elegant, and I'm not sure it is getting all the information from the tables that we need to delete.

Just "securely" deleting the files won't help much, as you'll leave data in spare space on the filesystem, in filesystem journals and so on.
Maybe put the transient tables and indexes in their own tablespace on their own filesystem, periodically move them to another tablespace and wipe the first one's filesystem (either physically or by forgetting the key for an encrypted FS)? That'd leave you with just the WAL data to deal with.

Seems like a slightly odd requirement, though. What's your threat model?

Cheers,
Steve
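The periodic move between tablespaces could be scripted roughly as below. This is a sketch that only builds the SQL text; it assumes the `ALTER TABLE ALL IN TABLESPACE ... SET TABLESPACE` and `ALTER INDEX ALL IN TABLESPACE ... SET TABLESPACE` forms available in PostgreSQL 9.4 and later. The function names and the tablespace names `transient_a` and `transient_b` are hypothetical. Each relation is locked while it is copied, so the move needs a quiet window.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def move_tablespace_sql(old_tablespace, new_tablespace):
    """Return the statements that move all tables and indexes of the
    current database from one tablespace to another."""
    old = quote_ident(old_tablespace)
    new = quote_ident(new_tablespace)
    return [
        "ALTER TABLE ALL IN TABLESPACE %s SET TABLESPACE %s;" % (old, new),
        "ALTER INDEX ALL IN TABLESPACE %s SET TABLESPACE %s;" % (old, new),
    ]


# Example: statements to run (for instance through psql) before wiping the
# filesystem that backs the hypothetical tablespace "transient_a".
# for stmt in move_tablespace_sql("transient_a", "transient_b"):
#     print(stmt)
```

Once the statements have completed and the old tablespace's directory is empty, its filesystem can be wiped or its encryption key discarded.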
There are free utilities that do government-level wipes. The process would be: drop the table, shrink the old tablespace, then (if Linux based) dd-fill the drive and use wipe with 5x or 8x deletion to make sure the drive does not have readable imprints on the platters.

Now what Jonathan mentions: it sounds like he wants to do the same to the physical table. Never having dabbled in PSQL’s storage and optimization algorithms, I would first assume a script that does a row-by-row update table set field1…fieldx with different data patterns, matching the existing field value length and the field max length. Run the script at least 5 to 8 times, then drop the table. The problem will be: does PSQL use a new page as you do this? If so, you are just fooling yourself. Let alone, how does PSQL handle indexes: new pages, or overwrite the existing page? And is any NPI (Non-Public-Info) data in the index itself?

* So any PSQL core-engine guys reading?

O.

> On Apr 13, 2018, at 3:03 PM, Ron <ronljohnsonjr@gmail.com> wrote:
>
> I'd write a program that fills all free space on disk with a specific pattern. You're probably using a logging filesystem, so that'll be far from perfect, though.
After you drop a table, aren't the associated files dropped?

On 04/13/2018 02:29 PM, Ozz Nixon wrote:
> There are free utilities that do government leave wipes. The process would be, drop the table, shrink the old table space then (if linux based), dd fill the drive, and use wipe, 5x or 8x deletion to make sure the drive does not have readable imprints on the platers.

-- Angular momentum makes the world go 'round.
On 13 April 2018 at 18:48, Jonathan Morgan <jonathan.morgan.007@gmail.com> wrote:
> Are we missing something? Are there other options we haven't found? If we
> have to clean up manually, are there other places we need to go to shred
> data than the relation files for a given table, and all its related indexes,
> in the database's folder? Any help or advice will be greatly appreciated.

Can you encrypt the data in the application, above the DB level? That would be cleaner if you can.
If not, you'll have to worry about both the DB's data files themselves and the WAL files in pg_xlog/ which hold copies of the recently written data. Even if you securely scrub the deleted parts of the filesystems after dropping the table, there could still be copies of secret table data in WAL files that haven't yet been overwritten.

One way to scrub deleted files would be to use ZFS and have an extra disk. When it's time to scrub, "zpool attach" the extra disk to your zpool, which will cause ZFS to copy over only the files that haven't been deleted, in the background. When that's finished you can detach the original disk from the zpool and then do a low-level overwrite of that entire disk.

For extra security points use encrypted block devices underneath ZFS, and instead of scrubbing the disk just destroy the encryption key that you were using for it.
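The attach, detach and overwrite sequence might look roughly like the sketch below. It only builds the command lines and does not run them. The pool name `tank` and the device paths are hypothetical, and using `shred` for the low-level overwrite is an assumption (any whole-device overwrite tool would do). The resilver has to finish before the detach, which `zpool status` reports.

```python
import shlex


def zfs_swap_and_wipe_commands(pool, old_device, new_device):
    """Return the commands, in order, for swapping disks and wiping the old one.

    The caller must wait for the resilver to complete (see the zpool status
    output) between the attach and the detach.
    """
    return [
        ["zpool", "attach", pool, old_device, new_device],  # mirror onto the spare
        ["zpool", "status", pool],                          # check resilver progress
        ["zpool", "detach", pool, old_device],              # drop the original disk
        ["shred", "-v", "-n", "1", old_device],             # overwrite the whole device
    ]


def as_shell(commands):
    """Render the commands as shell lines for review before running them."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]


# Example with hypothetical names:
# for line in as_shell(zfs_swap_and_wipe_commands("tank", "/dev/sdb", "/dev/sdc")):
#     print(line)
```

Printing the commands for review rather than executing them is deliberate, since a wrong device path in the last step destroys live data.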