Re: Logical decoding for operations on zheap tables - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Logical decoding for operations on zheap tables |
Date | |
Msg-id | CAA4eK1JaY3c6WFJoWYj0Vawew3g2o=iqmWChSk54FvhKReo9Og@mail.gmail.com |
In response to | Re: Logical decoding for operations on zheap tables (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Logical decoding for operations on zheap tables
|
List | pgsql-hackers |
On Thu, Jan 3, 2019 at 11:30 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2018-12-31 09:56:48 +0530, Amit Kapila wrote:
> > To support logical decoding for zheap operations, we need a way to
> > ensure zheap tuples can be registered as change streams. One idea
> > could be that we make ReorderBufferChange aware of another kind of
> > tuples as well, something like this:
> >
> > @@ -100,6 +123,20 @@ typedef struct ReorderBufferChange
> >          ReorderBufferTupleBuf *newtuple;
> >      } tp;
> > +    struct
> > +    {
> > +        /* relation that has been changed */
> > +        RelFileNode relnode;
> > +
> > +        /* no previously reassembled toast chunks are necessary anymore */
> > +        bool clear_toast_afterwards;
> > +
> > +        /* valid for DELETE || UPDATE */
> > +        ReorderBufferZTupleBuf *oldtuple;
> > +        /* valid for INSERT || UPDATE */
> > +        ReorderBufferZTupleBuf *newtuple;
> > +    } ztp;
> > +
> >
> >
> > +/* an individual zheap tuple, stored in one chunk of memory */
> > +typedef struct ReorderBufferZTupleBuf
> > +{
> > ..
> > +    /* tuple header, the interesting bit for users of logical decoding */
> > +    ZHeapTupleData tuple;
> > ..
> > +} ReorderBufferZTupleBuf;
> >
> > Apart from this, we need to define different decode functions for
> > zheap operations as the WAL data is different for heap and zheap, so
> > same functions can't be used to decode.
>
> I'm very strongly opposed to that. We shouldn't have expose every
> possible storage method to output plugins, that'll make extensibility
> a farce. I think we'll either have to re-form a HeapTuple or decide
> to bite the bullet and start exposing tuples via slots.
>

To be clear, you are against exposing different tuple formats to
plugins, not against having different decoding routines for other
storage engines, because the latter is unavoidable due to the WAL
format.

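To illustrate the distinction drawn above, here is a minimal, self-contained sketch. It is not PostgreSQL code: every type and function name (ToyRmgrId, ToyWalRecord, ToyChange, toy_decode and so on) is hypothetical, and the two payload layouts are invented for the example. It only shows the shape of the idea: one decode routine per WAL format, with both routines producing the same change representation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical resource-manager ids; not PostgreSQL's real RM_* values. */
typedef enum { TOY_RM_HEAP, TOY_RM_ZHEAP } ToyRmgrId;

/* The only change kind a plugin sees in this toy model. */
typedef enum { TOY_CHANGE_INSERT } ToyChangeKind;

/* Common change representation handed to an output plugin. */
typedef struct ToyChange
{
    ToyChangeKind kind;
    size_t        len;          /* length of the user data */
    char          data[64];     /* user data in one common layout */
} ToyChange;

/* Toy WAL record: the payload layout depends on the resource manager. */
typedef struct ToyWalRecord
{
    ToyRmgrId   rmid;
    size_t      len;
    const char *payload;
} ToyWalRecord;

/* Assumed heap layout: the payload is the user data, stored as is. */
static int
toy_heap_decode(const ToyWalRecord *rec, ToyChange *out)
{
    if (rec->len > sizeof(out->data))
        return -1;
    out->kind = TOY_CHANGE_INSERT;
    out->len = rec->len;
    memcpy(out->data, rec->payload, rec->len);
    return 0;
}

/* Assumed zheap layout: a one-byte prefix followed by the user data. */
static int
toy_zheap_decode(const ToyWalRecord *rec, ToyChange *out)
{
    if (rec->len < 1 || rec->len - 1 > sizeof(out->data))
        return -1;
    out->kind = TOY_CHANGE_INSERT;      /* same kind as the heap path */
    out->len = rec->len - 1;
    memcpy(out->data, rec->payload + 1, out->len);
    return 0;
}

/* Pick the decode routine by resource manager; the output format is shared. */
int
toy_decode(const ToyWalRecord *rec, ToyChange *out)
{
    switch (rec->rmid)
    {
        case TOY_RM_HEAP:
            return toy_heap_decode(rec, out);
        case TOY_RM_ZHEAP:
            return toy_zheap_decode(rec, out);
    }
    return -1;
}
```

In this sketch a consumer of ToyChange cannot tell which storage format produced the record, which is the property asked for in the quoted review comment.
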
Now, about the tuple format: I guess it would be a lot better if we
expose tuples via slots, but won't that force existing plugins to
change the way they decode the tuple? Maybe that is okay. OTOH,
re-forming the heap tuple has a cost, which might be okay for the time
being or for the first version, but eventually we want to avoid that.
The other reason why I refrained from tuple conversion was that I was
not sure whether we rely anywhere on the transaction information in
the tuple during the decode process, because that would be tricky to
mimic, but I guess we don't check that. The only reason for exposing a
different tuple format to plugins was performance, which I think can
be addressed if we expose tuples via slots. I don't want to take up
exposing slots instead of tuples to plugins as part of this project;
if we want to go that way, I think it is better done as part of the
pluggable API?

>
> > This email is primarily to discuss about how the logical decoding for
> > basic DML operations (Insert/Update/Delete) will work in zheap. We
> > might need some special mechanism to deal with sub-transactions as
> > zheap doesn't generate a transaction id for sub-transactions, but we
> > can discuss that separately.
>
> Subtransactions seems to be the hardest part besides the tuple format
> issue, so I think we should discuss that very soon.
>

Agreed, I am going to look at that part next.

>
> > /*
> >  * Write relation description to the output stream.
> >  */
> > diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
> > index 23466bade2..70fb5e2934 100644
> > --- a/src/backend/replication/logical/reorderbuffer.c
> > +++ b/src/backend/replication/logical/reorderbuffer.c
> > @@ -393,6 +393,19 @@ ReorderBufferReturnChange(ReorderBuffer *rb, ReorderBufferChange *change)
> >                  change->data.tp.oldtuple = NULL;
> >              }
> >              break;
> > +        case REORDER_BUFFER_CHANGE_ZINSERT:
>
> This really needs to be undistinguishable from normal CHANGE_INSERT...
>

Sure, it will be if we decide to either re-form the heap tuple or
expose tuples via slots.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
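
The re-forming option discussed in this message can be sketched as follows. This is a toy model under stated assumptions, not the zheap or heap tuple layout: ToyZTuple, ToyHeapTuple, ToyXid, TOY_INVALID_XID and toy_reform_heap_tuple are all hypothetical names. It shows the two points raised above: the conversion costs an allocation plus a copy per tuple, and the transaction fields of the heap-style header have nothing to be filled from, so the sketch simply marks them invalid.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t ToyXid;
#define TOY_INVALID_XID ((ToyXid) 0)

/* Hypothetical compact tuple: no per-tuple transaction ids. */
typedef struct ToyZTuple
{
    uint16_t    natts;
    uint32_t    len;            /* length of the attribute data */
    const char *data;
} ToyZTuple;

/* Hypothetical heap-style tuple: the header carries xmin/xmax. */
typedef struct ToyHeapTuple
{
    ToyXid   xmin;
    ToyXid   xmax;
    uint16_t natts;
    uint32_t len;
    char     data[];            /* C99 flexible array member */
} ToyHeapTuple;

/*
 * Build a heap-style tuple from the compact one.  The caller owns the
 * result and releases it with free().  Returns NULL on allocation failure.
 */
ToyHeapTuple *
toy_reform_heap_tuple(const ToyZTuple *ztup)
{
    ToyHeapTuple *htup = malloc(sizeof(ToyHeapTuple) + ztup->len);

    if (htup == NULL)
        return NULL;

    /* Assumption: consumers never inspect these, so mark them invalid. */
    htup->xmin = TOY_INVALID_XID;
    htup->xmax = TOY_INVALID_XID;

    htup->natts = ztup->natts;
    htup->len = ztup->len;
    memcpy(htup->data, ztup->data, ztup->len);  /* the per-tuple copy cost */
    return htup;
}
```

Whether leaving the transaction fields invalid is safe depends on the assumption stated above, namely that nothing in the decode path reads them.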