Greetings,
* asaba.takanori@fujitsu.com (asaba.takanori@fujitsu.com) wrote:
> This feature erases data area just before it is returned to the OS (“erase” means that overwrite data area to hide
> its contents here)
> because there is a risk that the data will be restored by attackers if it is returned to the OS without being
> overwritten.
> The erase timing is when DROP, VACUUM, TRUNCATE, etc. are executed.
Looking at this fresh, I wanted to point out that I think Tom's right:
we aren't going to be able to reasonably support this kind of data
erasure on a simple DROP TABLE or TRUNCATE.
> I want users to be able to customize the erasure method for their security policies.
There's also this- but I think what it means is that we'd probably have
a top-level command that basically is "ERASE TABLE blah;" or similar
which doesn't operate during transaction commit but instead marks the
table as "to be erased" and then perhaps "erasure in progress" and then
"fully erased" (or maybe just back to 'normal' at that point). Making
those updates will require the command to perform its own transaction
management, which is why it can't be run inside a transaction block
itself, but it also means that the data erasure process doesn't need to
be done during commit.
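To make the state progression above a bit more concrete, here is a
rough sketch in Python. It is only an illustration, not a proposed
implementation: Catalog, EraseState, erase_table and the overwrite
callback are all hypothetical names, not anything that exists in
PostgreSQL. Each set_state() call stands in for one separately
committed transaction, which is the reason such a command would have to
do its own transaction management.

```python
from enum import Enum


class EraseState(Enum):
    NORMAL = "normal"
    TO_BE_ERASED = "to be erased"
    IN_PROGRESS = "erasure in progress"
    ERASED = "fully erased"


# The only forward transitions this sketch allows.
NEXT_STATE = {
    EraseState.NORMAL: EraseState.TO_BE_ERASED,
    EraseState.TO_BE_ERASED: EraseState.IN_PROGRESS,
    EraseState.IN_PROGRESS: EraseState.ERASED,
}


class Catalog:
    """Hypothetical stand-in for a catalog that tracks erase state per table."""

    def __init__(self):
        self.state = {}
        self.commits = 0

    def set_state(self, table, new_state):
        # Models one small transaction: validate, update, commit.
        current = self.state.get(table, EraseState.NORMAL)
        if NEXT_STATE.get(current) is not new_state:
            raise ValueError(f"cannot go from {current.value} to {new_state.value}")
        self.state[table] = new_state
        self.commits += 1


def erase_table(catalog, table, overwrite):
    """Hypothetical top-level 'ERASE TABLE' flow; overwrite is an assumed callback."""
    catalog.set_state(table, EraseState.TO_BE_ERASED)  # committed on its own
    catalog.set_state(table, EraseState.IN_PROGRESS)   # committed on its own
    overwrite(table)  # the slow part runs outside any commit path
    catalog.set_state(table, EraseState.ERASED)        # committed on its own
```

Because each state is recorded before the next step starts, an
interrupted erasure would leave the table marked as "erasure in
progress" rather than silently looking normal.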
> My idea is adding a new parameter erase_command to postgresql.conf.
Yeah, I don't think that's really a sensible option or even approach.
> When erase_command is set, VACUUM does not truncate a file size to non-zero
> because it's safer for users to return the entire file to the OS than to return part of it.
There was discussion elsewhere about preventing VACUUM from doing a
truncate on a file because of the lock it requires and problems with
replicas. I'm not sure where that ended up, but, in general, I don't
think this feature and VACUUM should really have anything to do with
each other except for the possible case that a user might be told to
configure their system to not allow VACUUM to truncate tables if they
care about this case.
As mentioned elsewhere, you do also have to consider that the sensitive
data will end up in the WAL and on replicas. I don't believe that means
this feature is without use, but it means that users of this feature
will also need to understand and be able to address WAL and replicas
(along with backups and such too, of course).
Thanks,
Stephen