Re: logical decoding and replication of sequences, take 2 - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: logical decoding and replication of sequences, take 2 |
Date | |
Msg-id | CA+TgmoaSCu=bLtU7Zyth=h2Lgtzxv=xfRwLYt2ZtSKA=UN2SnQ@mail.gmail.com Whole thread Raw |
In response to | Re: logical decoding and replication of sequences, take 2 (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: logical decoding and replication of sequences, take 2
|
List | pgsql-hackers |
On Sun, Jan 28, 2024 at 1:07 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > Right, locks + apply in commit order gives us this guarantee (I can't > think of a case where it wouldn't be the case). I couldn't find any cases of inadequate locking other than the one I mentioned. > Doesn't the whole logical replication critically depend on the commit > order? If you decide to arbitrarily reorder/delay the transactions, all > kinds of really bad things can happen. That's a generic problem, it > applies to all kinds of objects, not just sequences - a parallel apply > would need to detect this sort of dependencies (e.g. INSERT + DELETE of > the same key), and do something about it. Yes, but here I'm not just talking about the commit order. I'm talking about the order of applying non-transactional operations relative to commits. Consider: T1: CREATE SEQUENCE s; T2: BEGIN; T2: SELECT nextval('s'); T3: SELECT nextval('s'); T2: ALTER SEQUENCE s INCREMENT 2; T2: SELECT nextval('s'); T2: COMMIT; The commit order is T1 < T3 < T2, but T3 makes no transactional changes, so the commit order is really just T1 < T2. But it's completely wrong to say that all we need to do is apply T1 before we apply T2. The correct order of application is: 1. T1. 2. T2's first nextval 3. T3's nextval 4. T2's transactional changes (i.e. the ALTER SEQUENCE INCREMENT and the subsequent nextval) In other words, the fact that some sequence changes are non-transactional creates ordering hazards that don't exist if there are no non-transactional changes. So in that way, sequences are different from table modifications, where applying the transactions in order of commit is all we need to do. Here we need to apply the transactions in order of commit and also apply the non-transactional changes at the right point in the sequence. Consider the following alternative apply sequence: 1. T1. 2. T2's transactional changes (i.e. the ALTER SEQUENCE INCREMENT and the subsequent nextval) 3. T3's nextval 4. T2's first nextval That's still in commit order. It's also wrong. Imagine that you commit this patch and someone later wants to do parallel logical apply. So every time they finish decoding a transaction, they stick it in a queue to be applied by the next available worker. But, non-transactional changes are very simple, so we just directly apply those in the main process. Well, kaboom! But now this can happen with the above example. 1. Decode T1. Add to queue for apply. 2. Before the (idle) apply worker has a chance to pull T1 out of the queue, decode the first nextval and try to apply it. Oops. We're trying to apply a modification to a sequence that hasn't been created yet. I'm not saying that this kind of hypothetical is a reason not to commit the patch. But it seems like we're not on the same page about what the ordering requirements are here. I'm just making the argument that those non-transactional operations actually act like mini-transactions. They need to happen at the right time relative to the real transactions. A non-transactional operation needs to be applied after any transactions that commit before it is logged, and before any transactions that commit after it's logged. > Yes, I think this is a bug in handling of owned sequences - from the > moment the "ALTER TABLE ... SET UNLOGGED" is executed, the two sessions > generate duplicate values (until the S1 is committed, at which point the > values generated in S2 get "forgotten"). > > It seems we end up updating both relfilenodes, which is clearly wrong. > > Seems like a bug independent of the decoding, IMO. Yeah. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: