Re: RFC: PostgreSQL Storage I/O Transformation Hooks - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: RFC: PostgreSQL Storage I/O Transformation Hooks
Date
Msg-id 6769b9cf-fd45-4206-bb10-810e023889ea@garret.ru
In response to Re: RFC: PostgreSQL Storage I/O Transformation Hooks  (Henson Choi <assam258@gmail.com>)
Responses Re: RFC: PostgreSQL Storage I/O Transformation Hooks
List pgsql-hackers
On 28/12/2025 4:53 PM, Henson Choi wrote:
> Subject: Re: RFC: PostgreSQL Storage I/O Transformation Hooks
>
> Hi Konstantin,
>
> I have great respect for the work being done on the extensible SMGR API.
> It is a powerful tool for use cases that require replacing the entire
> storage layer (like Neon's architecture).
>
> However, I believe we should distinguish between Storage Management
> (where/how data is stored) and Data Transformation (what the data looks
> like). I see a strong case for both approaches to coexist for the
> following practical reasons:
>
> 1. Separation of Concerns and Safety
>
> Is it reasonable to ask cryptography experts to clone the entire SMGR
> implementation and maintain code they don't fully understand just to
> insert encryption logic? If an extension developer clones md.c to add
> encryption, they become responsible for the fundamental integrity of
> PostgreSQL's file I/O. Any bug in their cloned storage logic could lead
> to data loss unrelated to encryption itself.
>
> 2. The Maintenance Debt of "Cloning"
>
> When md.c receives critical security patches or bug fixes in the core,
> every TDE extension maintainer would need to manually backport those
> changes to their specific SMGR implementation. This creates a fragmented
> ecosystem where security extensions might actually introduce storage
> vulnerabilities by running outdated cloned logic.
>
> 3. Minimalist Integration
>
> The hook approach allows crypto experts to focus strictly on transform()
> and reverse_transform(). The complex storage orchestration remains with
> the PostgreSQL core where it is most rigorously tested. This is a cleaner
> separation of responsibilities: the core provides the trusted pipeline,
> and the extension provides the specialized transformation.
>
> Conclusion:
>
> I believe these hooks provide a "low-barrier, high-safety" path for data
> transformation that the SMGR API—by its very nature of being a full
> replacement—cannot easily provide. Let's provide the SMGR for those who
> want to reinvent the storage, and hooks for those who simply want to
> secure the data.
>
> Best regards,
> Henson Choi


I do not think that a custom SMGR API contradicts the idea of Data
Transformation.
Do you know about the decorator pattern?
If you want to implement, e.g., data encryption, you definitely do not
need to write your storage manager from scratch.
Obviously you can (and should) use the standard storage manager (md.c) for
actually performing I/O.
But your storage manager can perform some extra action before or after
I/O, for example encrypt data before a write and decrypt it after a read.
So any pre/post/instead hooks can be easily implemented using a custom SMGR.
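
To illustrate the decorator idea, here is a minimal standalone sketch. It
does not use the real PostgreSQL SMGR interface: StorageOps, base_ops,
encrypting_ops, the in-memory backing_store and the toy BLOCK_SIZE are all
hypothetical names made up for this example, and the XOR step is only a
placeholder for a real cipher. The point is just that the wrapper delegates
all actual I/O to the base manager and only adds a step before the write and
after the read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16   /* toy block size, not PostgreSQL's BLCKSZ */
#define NUM_BLOCKS 4

/* Hypothetical storage-manager interface: a table of I/O callbacks. */
typedef struct StorageOps
{
    void (*write_block)(unsigned blkno, const uint8_t *buf);
    void (*read_block)(unsigned blkno, uint8_t *buf);
} StorageOps;

/* Stand-in for the standard manager (md.c): an in-memory array. */
static uint8_t backing_store[NUM_BLOCKS][BLOCK_SIZE];

static void
base_write(unsigned blkno, const uint8_t *buf)
{
    assert(blkno < NUM_BLOCKS);
    memcpy(backing_store[blkno], buf, BLOCK_SIZE);
}

static void
base_read(unsigned blkno, uint8_t *buf)
{
    assert(blkno < NUM_BLOCKS);
    memcpy(buf, backing_store[blkno], BLOCK_SIZE);
}

static const StorageOps base_ops = { base_write, base_read };

/* Placeholder transform: XOR with a fixed byte. NOT real encryption. */
static void
xor_transform(uint8_t *buf)
{
    for (size_t i = 0; i < BLOCK_SIZE; i++)
        buf[i] ^= 0x5A;
}

/* Decorator: transform a private copy, then delegate the write. */
static void
enc_write(unsigned blkno, const uint8_t *buf)
{
    uint8_t tmp[BLOCK_SIZE];

    memcpy(tmp, buf, BLOCK_SIZE);
    xor_transform(tmp);
    base_ops.write_block(blkno, tmp);
}

/* Decorator: delegate the read, then reverse the transform in place. */
static void
enc_read(unsigned blkno, uint8_t *buf)
{
    base_ops.read_block(blkno, buf);
    xor_transform(buf);
}

static const StorageOps encrypting_ops = { enc_write, enc_read };

/*
 * Returns 1 if a block survives a write/read round trip through the
 * decorator while the bytes held by the base manager differ from the
 * plaintext.
 */
int
roundtrip_ok(void)
{
    uint8_t in[BLOCK_SIZE] = "page payload";
    uint8_t out[BLOCK_SIZE];

    encrypting_ops.write_block(1, in);
    encrypting_ops.read_block(1, out);

    return memcmp(in, out, BLOCK_SIZE) == 0 &&
           memcmp(in, backing_store[1], BLOCK_SIZE) != 0;
}
```

A caller only ever sees a StorageOps table, so swapping base_ops for
encrypting_ops does not change the calling code. Under the same assumptions,
a further wrapper could be stacked on top of encrypting_ops in the same way.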


The opposite, unfortunately, is not possible. You cannot, for example,
implement encryption+compression using hooks.
But you can easily do it using a custom SMGR: this is how the compressed file
system (CFS) was implemented in PgPro.



