Re: [HACKERS] Pluggable storage - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [HACKERS] Pluggable storage
Date
Msg-id CAPpHfdsi+xSNiokfghK9f0OxdPV4HXf2N1iZ0qPE-Sx_Oxq6AA@mail.gmail.com
In response to Re: [HACKERS] Pluggable storage  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, Oct 9, 2017 at 5:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Korotkov <a.korotkov@postgrespro.ru> writes:
> > For me, it's crucial point that pluggable storages should be able to have
> > different MVCC implementation, and correspondingly have full control over
> > its interactions with indexes.
> > Thus, it would be good if we would get consensus on that point.  I'd like
> > other discussion participants to comment whether they agree/disagree and
> > why.
> > Any comments?
>
> TBH, I think that's a good way of ensuring that nothing will ever get
> committed.  You're trying to draw the storage layer boundary at a point
> that will take in most of the system.  If we did build it like that,
> what we'd end up with would be very reminiscent of mysql's storage
> engines, complete with inconsistent behaviors and varying feature sets
> across engines.  I don't much want to go there.

However, if we insist that pluggable storage must have the same MVCC implementation, interact with indexes the same way, and also use TIDs as tuple identifiers, then what useful implementations might we have?  Per-page heap compression and encryption?  A different heap page layout?  A different tuple format?  OK, but that doesn't justify an API as wide as the one implemented in the current version of the patch in this thread.  If we really want to restrict the applicability of pluggable storages that way, then we should probably give up on "pluggable storages" and make it "pluggable heap page format", as I proposed upthread.

Implementing an alternative storage would be a hard and challenging task.  Yes, it would include reimplementing a significant part of the system.  But that seems inevitable if we're going to implement really alternative storages (not just hacks over the existing storage).  And I don't think our pluggable storages would be reminiscent of MySQL's storage engines as long as we keep two properties:
1) All the storages use the same WAL stream,
2) All the storages use the same transactions and snapshots.
If we keep these two properties, we would need neither 2PC to run transactions across different storages nor a separate log for replication.  Those two are the major drawbacks of the MySQL model.
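
To make the point about the shared WAL and shared transactions concrete, here is a toy model.  It is only a sketch under loud assumptions: SharedWAL, ToyEngine and the record layout are hypothetical names invented for illustration, and nothing here comes from the patch or from PostgreSQL itself.  It just shows that when two engines log into one stream, a single commit record decides the fate of the changes in both, so no 2PC between engines is needed.

```python
# Toy model only: none of these names exist in PostgreSQL or in the patch.
from dataclasses import dataclass, field


@dataclass
class SharedWAL:
    """One append-only log shared by every storage engine."""
    records: list = field(default_factory=list)

    def append(self, xid, kind, engine=None, row=None):
        self.records.append((xid, kind, engine, row))

    def committed(self):
        # A transaction is committed iff the log holds its commit record.
        return {xid for xid, kind, _, _ in self.records if kind == "commit"}


class ToyEngine:
    """A storage engine that logs its changes to the shared WAL."""

    def __init__(self, name, wal):
        self.name = name
        self.wal = wal

    def insert(self, xid, row):
        self.wal.append(xid, "insert", self.name, row)

    def replay(self):
        # Rebuild this engine's rows from the log, keeping committed work only.
        done = self.wal.committed()
        return [row for xid, kind, engine, row in self.wal.records
                if kind == "insert" and engine == self.name and xid in done]


wal = SharedWAL()
heap = ToyEngine("heap", wal)
other = ToyEngine("other", wal)

# Transaction 1 touches both engines and commits with a single record.
heap.insert(1, "a")
other.insert(1, "b")
wal.append(1, "commit")

# Transaction 2 touches both engines but never writes a commit record,
# as if the server had crashed before commit.
heap.insert(2, "c")
other.insert(2, "d")

heap_rows = heap.replay()
other_rows = other.replay()
```

After replay, each engine keeps only the row written by transaction 1; the uncommitted work of transaction 2 disappears from both engines at once.  A replica consuming the same stream would reach the same state without any per-engine log.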
Varying feature sets across engines seem inevitable and natural.  We have to invent alternative storages precisely to get features which are hard to have in our current storage.  So it's no wonder that feature sets would vary...

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
