Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: logical decoding and replication of sequences, take 2 |
Date | |
Msg-id | 8d0d4b34-3e12-2637-e8b8-f4c3b2b0a1f3@enterprisedb.com Whole thread Raw |
In response to | Re: logical decoding and replication of sequences, take 2 (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On 11/27/23 12:11, Amit Kapila wrote: > On Mon, Nov 27, 2023 at 4:17 PM Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: >> >> On 11/27/23 11:13, Amit Kapila wrote: >>> On Mon, Nov 27, 2023 at 11:34 AM Amit Kapila <amit.kapila16@gmail.com> wrote: >>>> >>>> On Mon, Nov 27, 2023 at 6:41 AM Tomas Vondra >>>> <tomas.vondra@enterprisedb.com> wrote: >>>>> >>>>> While going over 0001, I realized there might be an optimization for >>>>> ReorderBufferSequenceIsTransactional. As coded in 0001, it always >>>>> searches through all top-level transactions, and if there's many of them >>>>> that might be expensive, even if very few of them have any relfilenodes >>>>> in the hash table. It's still linear search, and it needs to happen for >>>>> each sequence change. >>>>> >>>>> But can the relfilenode even be in some other top-level transaction? How >>>>> could it be - our transaction would not see it, and wouldn't be able to >>>>> generate the sequence change. So we should be able to simply check *our* >>>>> transaction (or if it's a subxact, the top-level transaction). Either >>>>> it's there (and it's transactional change), or not (and then it's >>>>> non-transactional change). >>>>> >>>> >>>> I also think the relfilenode should be part of either the current >>>> top-level xact or one of its subxact, so looking at all the top-level >>>> transactions for each change doesn't seem advisable. >>>> >>>>> The 0004 does this. >>>>> >>>>> This of course hinges on when exactly the transactions get created, and >>>>> assignments processed. For example if this would fire before the txn >>>>> gets assigned to the top-level one, this would break. I don't think this >>>>> can happen thanks to the immediate logging of assignments, but I'm too >>>>> tired to think about it now. >>>>> >>>> >>>> This needs some thought because I think we can't guarantee the >>>> association till we reach the point where we can actually decode the >>>> xact. See comments in AssertTXNLsnOrder() [1]. >>>> >> >> I suppose you mean the comment before the SnapBuildXactNeedsSkip call, >> which says: >> >> /* >> * Skip the verification if we don't reach the LSN at which we start >> * decoding the contents of transactions yet because until we reach >> * the LSN, we could have transactions that don't have the association >> * between the top-level transaction and subtransaction yet and >> * consequently have the same LSN. We don't guarantee this >> * association until we try to decode the actual contents of >> * transaction. The ordering of the records prior to the >> * start_decoding_at LSN should have been checked before the restart. >> */ >> >> But doesn't this say that after we actually start decoding / stop >> skipping, we should have seen the assignment? We're already decoding >> transaction contents (because sequence change *is* part of xact, even if >> we decide to replay it in the non-transactional way). >> > > It means to say that the assignment is decided after start_decoding_at > point. We haven't decided that we are past start_decoding_at by the > time the patch is computing the transactional flag. > Ah, I see. We're deciding if the change is transactional before calling SnapBuildXactNeedsSkip. That's a bit unfortunate. >>> >>> I am wondering that instead of building the infrastructure to know >>> whether a particular change is transactional on the decoding side, >>> can't we have some flag in the WAL record to note whether the change >>> is transactional or not? I have discussed this point with my colleague >>> Kuroda-San and we thought that it may be worth exploring whether we >>> can use rd_createSubid/rd_newRelfilelocatorSubid in RelationData to >>> determine if the sequence is created/changed in the current >>> subtransaction and then record that in WAL record. By this, we need to >>> have additional information in the WAL record like XLOG_SEQ_LOG but we >>> can probably do it only with wal_level as logical. >>> >> >> I may not understand the proposal exactly, but it's not enough to know >> if it was created in the same subxact. It might have been created in >> some earlier subxact in the same top-level xact. >> > > We should be able to detect even some earlier subxact or top-level > xact based on rd_createSubid/rd_newRelfilelocatorSubid. > Interesting. I admit I haven't considered using these fields before, so I need to familiarize with it a bit, and try if it'd work. >> FWIW I think one of the earlier patch versions did something like this, >> by adding a "created" flag in the xlog record. And we concluded doing >> this on the decoding side is a better solution. >> > > oh, I thought it would be much simpler than what we are doing on the > decoding-side. Can you please point me to the email discussion where > this is concluded or share the reason? > I think the discussion started around [1], and then in a bunch of following messages (search for "relfilenode"). regards [1] https://www.postgresql.org/message-id/CAExHW5v_vVqkhF4ehST9EzpX1L3bemD1S%2BkTk_-ZVu_ir-nKDw%40mail.gmail.com -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: