Thread: Proposal for Integrating Data Masking and anonymization into PostgreSQL

Dear PostgreSQL Development Team,

I am writing to propose the development of native data masking and anonymization features within PostgreSQL. As a long-time user of PostgreSQL, I have observed a growing need for efficient and secure data handling, particularly in compliance with regulations like GDPR.


By integrating data masking and anonymization directly into the database engine, we can achieve several benefits:

  • Improved Performance: Native implementation can significantly enhance performance compared to external tools.
  • Seamless Integration: A built-in solution would seamlessly integrate with existing PostgreSQL workflows and tools.
  • Data Format Preservation: The database engine can ensure that masked or anonymized data adheres to the original data format.
  • DB Metadata Awareness: PostgreSQL's metadata can be leveraged to tailor masking and anonymization strategies to specific data types and constraints.
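
To make the "Data Format Preservation" point concrete, here is a minimal sketch in Python of a masking function that keeps length, case, digit positions, and punctuation intact. The function name `mask_preserving_format` and the `secret` parameter are hypothetical and chosen only for illustration; this is not an existing PostgreSQL or extension API, and a hash-derived substitution like this is not a vetted anonymization scheme.

```python
import hashlib
import string


def mask_preserving_format(value: str, secret: str = "demo-secret") -> str:
    """Replace ASCII letters with letters and ASCII digits with digits.

    Punctuation, case, and length are kept, so the masked value still fits
    the original format. Non-ASCII characters are passed through unchanged,
    which a real implementation would need to handle.
    """
    # Derive a repeatable byte stream from the secret and the input value,
    # so the same input always maps to the same masked output.
    digest = hashlib.sha256((secret + value).encode("utf-8")).digest()
    out = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]
        if ch.isascii() and ch.isdigit():
            out.append(string.digits[b % 10])
        elif ch.isascii() and ch.isalpha():
            letters = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(letters[b % 26])
        else:
            out.append(ch)  # keep separators such as '@', '.', '-'
    return "".join(out)


masked_email = mask_preserving_format("alice.smith@example.com")
masked_phone = mask_preserving_format("555-0199")
```

Because the output has the same shape as the input, a masked value would still satisfy simple length or pattern checks on the column, which is the property the bullet above describes.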

I envision these features taking the form of:

  • New Functions or Operators: For direct application within SQL queries.
  • Configuration Options: To allow users to customize masking and anonymization behavior.
  • Plugins or Extensions: For additional capabilities.

I would be happy to discuss this proposal further and provide more details on specific use cases and requirements. Thank you for your time and consideration.


Re: Proposal for Integrating Data Masking and anonymization into PostgreSQL

From: Alastair Turner

Hi Hosney

On Wed, 23 Oct 2024, 15:17 Hosney Osman, <hosneybinosman@gmail.com> wrote:

> Dear PostgreSQL Development Team,
>
> I am writing to propose the development of native data masking and anonymization features within PostgreSQL. As a long-time user of PostgreSQL, I have observed a growing need for efficient and secure data handling, particularly in compliance with regulations like GDPR

This requirement already seems to be well served by extensions. pg_anonymize, for instance, uses the metadata tools available in the database, like security labels, to manage the values returned to users.

Is there any functionality, or usability, in this area which you believe can't be delivered by an extension? 

Regards 

Alastair 
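
Re: Proposal for Integrating Data Masking and anonymization into PostgreSQL

From: Hosney Osman

Could you please point me to a reference, article, or example for this scenario? For example, I have a schema that contains four tables, and I want to anonymize it into another schema. How can I do that?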


Re: Proposal for Integrating Data Masking and anonymization into PostgreSQL

From: Stefan

Hi Osman

See postgresql_anonymizer
https://gitlab.com/dalibo/postgresql_anonymizer/ and pg_anonymize
https://github.com/rjuju/pg_anonymize , where the former has more
functionality, like synthetic data generation (if you need that).

Yours, Stefan
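
For the scenario asked about above (copying a schema's tables into a second, anonymized schema), the general pattern can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module, with an attached in-memory database named `anon` standing in for a second PostgreSQL schema, and the table names, column names, and the `mask_email` and `copy_with_masking` helpers are all hypothetical. It does not show how postgresql_anonymizer or pg_anonymize work; in PostgreSQL the analogous statement would be a `CREATE TABLE anon.<table> AS SELECT ...` with masking expressions in the select list.

```python
import sqlite3
from typing import Optional


def mask_email(email: Optional[str]) -> Optional[str]:
    """Keep the first character and the domain; hide the rest of the local part."""
    if email is None:
        return None
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


def copy_with_masking(conn, source, target, masked_columns):
    """Copy every table from `source` into `target`, masking the listed columns.

    masked_columns maps table name -> {column name: SQL expression}.
    Identifiers are interpolated directly into SQL, so they must come from
    trusted code, never from user input.
    """
    tables = [row[0] for row in conn.execute(
        f"SELECT name FROM {source}.sqlite_master WHERE type = 'table'")]
    for table in tables:
        # Column names are the second field of each table_info row.
        cols = [row[1] for row in conn.execute(f"PRAGMA {source}.table_info({table})")]
        rules = masked_columns.get(table, {})
        select_list = ", ".join(
            f"{rules[c]} AS {c}" if c in rules else c for c in cols)
        conn.execute(
            f"CREATE TABLE {target}.{table} AS "
            f"SELECT {select_list} FROM {source}.{table}")


conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS anon")  # stand-in for a second schema
conn.create_function("mask_email", 1, mask_email)

# Two hypothetical source tables; a four-table schema is handled the same way.
conn.execute("CREATE TABLE main.customers (id INTEGER, name TEXT, email TEXT)")
conn.execute("INSERT INTO main.customers VALUES (1, 'Alice Smith', 'alice@example.com')")
conn.execute("CREATE TABLE main.orders (id INTEGER, customer_id INTEGER, total REAL)")
conn.execute("INSERT INTO main.orders VALUES (10, 1, 99.5)")

copy_with_masking(conn, "main", "anon", {
    "customers": {"name": "'REDACTED'", "email": "mask_email(email)"},
})
```

Keys such as `id` and `customer_id` are copied unchanged here so that joins between the anonymized tables still work; whether that is acceptable depends on whether those identifiers are themselves sensitive.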

On Tue, 29 Oct 2024 at 10:01, Hosney Osman
<hosneybinosman@gmail.com> wrote:
>
> could you please provide me with reference or article or example for this scenario
> for example i have schema include 4 table
> and i want to anonymize it in another schema
> how to do that
>
> On Wed, Oct 23, 2024 at 4:25 PM Alastair Turner <minion@decodable.me> wrote:
>>
>> Hi Hosney
>>
>> On Wed, 23 Oct 2024, 15:17 Hosney Osman, <hosneybinosman@gmail.com> wrote:
>>>
>>> Dear PostgreSQL Development Team,
>>>
>>> I am writing to propose the development of native data masking and anonymization features within PostgreSQL. As a long-time user of PostgreSQL, I have observed a growing need for efficient and secure data handling, particularly in compliance with regulations like GDPR
>>
>> This requirement already seems to be well served by extensions. pg_anonymize, for instance, uses the metadata tools available in the database, like security labels, to manage the values returned to users.
>>
>> Is there any functionality, or usability, in this area which you believe can't be delivered by an extension?
>>
>> Regards
>>
>> Alastair