Re: RFC: PostgreSQL Storage I/O Transformation Hooks - Mailing list pgsql-hackers
| From | Konstantin Knizhnik |
|---|---|
| Subject | Re: RFC: PostgreSQL Storage I/O Transformation Hooks |
| Date | |
| Msg-id | 6769b9cf-fd45-4206-bb10-810e023889ea@garret.ru |
| In response to | Re: RFC: PostgreSQL Storage I/O Transformation Hooks (Henson Choi <assam258@gmail.com>) |
| Responses | Re: RFC: PostgreSQL Storage I/O Transformation Hooks |
| List | pgsql-hackers |
On 28/12/2025 4:53 PM, Henson Choi wrote:
> Subject: Re: RFC: PostgreSQL Storage I/O Transformation Hooks
>
> Hi Konstantin,
>
> I have great respect for the work being done on the extensible SMGR API.
> It is a powerful tool for use cases that require replacing the entire
> storage layer (like Neon's architecture).
>
> However, I believe we should distinguish between Storage Management
> (where/how data is stored) and Data Transformation (what the data looks
> like). I see a strong case for both approaches to coexist for the
> following practical reasons:
>
> 1. Separation of Concerns and Safety
>
> Is it reasonable to ask cryptography experts to clone the entire SMGR
> implementation and maintain code they don't fully understand just to
> insert encryption logic? If an extension developer clones md.c to add
> encryption, they become responsible for the fundamental integrity of
> PostgreSQL's file I/O. Any bug in their cloned storage logic could lead
> to data loss unrelated to encryption itself.
>
> 2. The Maintenance Debt of "Cloning"
>
> When md.c receives critical security patches or bug fixes in the core,
> every TDE extension maintainer would need to manually backport those
> changes to their specific SMGR implementation. This creates a fragmented
> ecosystem where security extensions might actually introduce storage
> vulnerabilities by running outdated cloned logic.
>
> 3. Minimalist Integration
>
> The hook approach allows crypto experts to focus strictly on transform()
> and reverse_transform(). The complex storage orchestration remains with
> the PostgreSQL core where it is most rigorously tested. This is a cleaner
> separation of responsibilities: the core provides the trusted pipeline,
> and the extension provides the specialized transformation.
>
> Conclusion:
>
> I believe these hooks provide a "low-barrier, high-safety" path for data
> transformation that the SMGR API, by its very nature of being a full
> replacement, cannot easily provide. Let's provide the SMGR for those who
> want to reinvent the storage, and hooks for those who simply want to
> secure the data.
>
> Best regards,
> Henson Choi

I do not think that a custom SMGR API contradicts the idea of Data Transformation. Do you know about the decorator pattern? If you want to implement, e.g., data encryption, you definitely do not need to write your storage manager from scratch. Obviously you can (and should) use the standard storage manager (md.c) to actually perform the I/O. But your storage manager can perform some extra action before or after the I/O, for example encrypting data before a write and decrypting it after a read.

So any pre/post/instead hooks can easily be implemented using a custom SMGR. The opposite, unfortunately, is not possible. You cannot, for example, implement encryption+compression using hooks. But you can easily do it using a custom SMGR: this is how the compressed file system (CFS) was implemented in PgPro.
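
The decorator idea described above can be sketched in a few lines of standalone C. This is only a toy under stated assumptions: `ToySmgr`, `ToyMdSmgr`, `ToyXorSmgr` and the two-callback interface are hypothetical names invented for illustration, not the real PostgreSQL smgr interface and not the proposed extensible SMGR patch. The in-memory base manager stands in for md.c, and a single-byte XOR stands in for a real cipher.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLCKSZ  16          /* toy page size, not PostgreSQL's BLCKSZ */
#define TOY_NBLOCKS 4

/* Hypothetical storage-manager interface (illustration only). */
typedef struct ToySmgr ToySmgr;
struct ToySmgr
{
    void (*write_block)(ToySmgr *self, unsigned blkno, const uint8_t *buf);
    void (*read_block)(ToySmgr *self, unsigned blkno, uint8_t *buf);
};

/* Base manager: plays the role of md.c, but keeps pages in memory. */
typedef struct
{
    ToySmgr base;               /* first member, so ToySmgr* casts back */
    uint8_t store[TOY_NBLOCKS][TOY_BLCKSZ];
} ToyMdSmgr;

static void
toy_md_write(ToySmgr *self, unsigned blkno, const uint8_t *buf)
{
    ToyMdSmgr *md = (ToyMdSmgr *) self;

    assert(blkno < TOY_NBLOCKS);
    memcpy(md->store[blkno], buf, TOY_BLCKSZ);
}

static void
toy_md_read(ToySmgr *self, unsigned blkno, uint8_t *buf)
{
    ToyMdSmgr *md = (ToyMdSmgr *) self;

    assert(blkno < TOY_NBLOCKS);
    memcpy(buf, md->store[blkno], TOY_BLCKSZ);
}

/* Decorator: transforms the page, then delegates the I/O to "inner". */
typedef struct
{
    ToySmgr  base;
    ToySmgr *inner;             /* the wrapped manager doing the real I/O */
    uint8_t  key;               /* XOR key; a placeholder for a real cipher */
} ToyXorSmgr;

static void
toy_xor_write(ToySmgr *self, unsigned blkno, const uint8_t *buf)
{
    ToyXorSmgr *x = (ToyXorSmgr *) self;
    uint8_t     tmp[TOY_BLCKSZ];

    /* "Encrypt" into a scratch buffer so the caller's page is untouched. */
    for (size_t i = 0; i < TOY_BLCKSZ; i++)
        tmp[i] = (uint8_t) (buf[i] ^ x->key);
    x->inner->write_block(x->inner, blkno, tmp);
}

static void
toy_xor_read(ToySmgr *self, unsigned blkno, uint8_t *buf)
{
    ToyXorSmgr *x = (ToyXorSmgr *) self;

    /* Let the inner manager read, then "decrypt" in place. */
    x->inner->read_block(x->inner, blkno, buf);
    for (size_t i = 0; i < TOY_BLCKSZ; i++)
        buf[i] ^= x->key;
}

/* Returns 1 if a page round-trips and is stored in transformed form. */
int
toy_roundtrip_ok(void)
{
    static ToyMdSmgr md;
    ToyXorSmgr  enc;
    uint8_t     page[TOY_BLCKSZ];
    uint8_t     out[TOY_BLCKSZ];

    md.base.write_block = toy_md_write;
    md.base.read_block = toy_md_read;

    enc.base.write_block = toy_xor_write;
    enc.base.read_block = toy_xor_read;
    enc.inner = &md.base;
    enc.key = 0x5A;

    memset(page, 'A', TOY_BLCKSZ);
    enc.base.write_block(&enc.base, 1, page);
    enc.base.read_block(&enc.base, 1, out);

    return memcmp(page, out, TOY_BLCKSZ) == 0 &&
           memcmp(page, md.store[1], TOY_BLCKSZ) != 0;
}
```

Because the decorator exposes the same interface as the manager it wraps, callers cannot tell the two apart, and the base manager's code is reused rather than cloned. Several such wrappers could in principle be stacked, each pointing its `inner` at the next one.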