Re: Proposal: Exploring LSM Tree‑Based Storage Engine for PostgreSQL (Inspired by MyRocks) - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Proposal: Exploring LSM Tree‑Based Storage Engine for PostgreSQL (Inspired by MyRocks)
Date
Msg-id CAEze2WiCeDXB=04=58=q9AwJqrxE+kHqLDdBwF3a6R2S6sgVGg@mail.gmail.com
Whole thread Raw
In response to Proposal: Exploring LSM Tree‑Based Storage Engine for PostgreSQL (Inspired by MyRocks)  (Manish Rai Jain <manishrjain@gmail.com>)
List pgsql-hackers
On Sun, 11 May 2025 at 14:06, Manish Rai Jain <manishrjain@gmail.com> wrote:
>
> Hi hackers,
>
> I’ve been exploring the idea of integrating an LSM tree–based storage engine into PostgreSQL — similar in spirit to
MyRocksfor MySQL — by replacing the underlying storage while preserving PostgreSQL’s upper layers (planner, executor,
MVCC,etc.). 
>
> The motivation stems from the well‑known write‑amplification issues with B‑trees under high write throughput. An
LSM‑basedengine could offer: 
>
> Significant improvements in write performance and space efficiency, especially under heavy ingestion workloads.
> Better scalability with larger datasets, particularly when compression is applied.
> Comparable read performance (with trade‑offs depending on workload), and opportunities to optimize through Bloom
filters,tiered compaction strategies, etc. 
> Reduced reliance on manual VACUUM: obsolete versions would be purged naturally during LSM compactions, potentially
eliminatingroutine heap vacuuming (transaction‑ID wrap‑around handling and stats collection would still need careful
design).
>
> The hoped‑for outcome is a faster, more scalable PostgreSQL for >1 TB workloads, while maintaining the rich feature
setand ecosystem compatibility users expect from Postgres. 
>
> Unlike Neon, this approach is not targeting cloud‑native object storage or remote WAL streaming, but instead
optimizingfor maximum performance on local disks or high‑performance block volumes, where write throughput and
compactionefficiency matter most. 

Note: Neon does not exactly structure anything PostgreSQL-visible in
an LSM structure. From Postgres' point of view, it's much more similar
to an LSM-based file system than an LSM-based TableAM or IndexAM.

> This would likely involve implementing a new Table Access Method (TAM), possibly backed by a forked engine such as
BadgerDBor RocksDB, adapted to support PostgreSQL’s MVCC and WAL semantics. 
>
> I’d love to hear your thoughts:
>
> Does this direction make sense for experimentation within the Postgres ecosystem?

IMV, PostgreSQL is uniquely positioned for experimentation with
different index data structures, while it has significant restrictions
on Table AM implementations through the Table AM API and our shared
buffers. Given the significant differences between the usual LSM
(arbitrarily large ranges of values) and block-based storage I'm not
sure how well Postgres would accomodate LSM-based storage. See also
below.

> Are there known architectural blockers or prior discussions/attempts in this space worth revisiting?

It depends on what you're planning to do. LSM-based TableAMs usually
are implemented with Index-organized data in mind. PostgreSQL,
however, expects its tables to have heap-like semantics for stored
tuples, with TID-addressible tuples that identify an unchanging set of
non-summarizingly indexed column values.

Additionally, PostgreSQL has its own buffer pool that is
block/page-based. You might be able to convince an LSM engine to use
block IO and page storage, but if that's not in your plans, you'll
probably also have to figure out how to do buffering and
checkpointing, and integrate with restartpoints.

> Would such a project be best developed entirely as a fork, or is there openness to evolving TAM to better support
pluggablestorage with LSM‑like semantics? 

I think the project is open to extensions of the TableAM to better
support new TableAMs, but only if that benefits the project as a
whole. E.g. if there are changes that impact the way things are going
to be indexed, you'll need to have a compelling story about why that's
needed and how indexes need to be adapted to support the new system.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



pgsql-hackers by date:

Previous
From: Andreas Karlsson
Date:
Subject: Re: INSERT ... ON CONFLICT DO SELECT [FOR ...] take 2
Next
From: Bruce Momjian
Date:
Subject: Re: Request for Implementation of Custom Error Messages for CHECK Constraints