RE: Time delayed LR (WAS Re: logical replication restrictions) - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: Time delayed LR (WAS Re: logical replication restrictions)
Date
Msg-id TYAPR01MB5866ECFC52FAED493007A2CEF5C99@TYAPR01MB5866.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Time delayed LR (WAS Re: logical replication restrictions)  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Time delayed LR (WAS Re: logical replication restrictions)
List pgsql-hackers
Dear Amit, Horiguchi-san,

> >
> > send_feedback():
> > +        * If the subscriber side apply is delayed (because of time-delayed
> > +        * replication) then do not tell the publisher that the received latest
> > +        * LSN is already applied and flushed, otherwise, it leads to the
> > +        * publisher side making a wrong assumption of logical replication
> > +        * progress. Instead, we just send a feedback message to avoid a
> publisher
> > +        * timeout during the delay.
> >          */
> > -       if (!have_pending_txes)
> > +       if (!have_pending_txes && !in_delayed_apply)
> >                 flushpos = writepos = recvpos;
> >
> > Honestly I don't like this wart. The reason for this is the function
> > assumes recvpos = applypos but we actually call it while holding
> > unapplied changes, that is, applypos < recvpos.
> >
> > Couldn't we maintain an additional static variable "last_applied"
> > along with last_received?
> >
> 
> It won't be easy to maintain the meaning of last_applied because there
> are cases where we don't apply the change directly. For example, in
> case of streaming xacts, we will just keep writing it to the file,
> now, say, due to some reason, we have to send the feedback, then it
> will not allow you to update the latest write locations. This would
> then become different then what we are doing without the patch.
> Another point to think about is that we also need to keep the variable
> updated for keep-alive ('k') messages even though we don't apply
> anything in that case. Still, other cases to consider are where we
> have mix of streaming and non-streaming transactions.

I have tried to implement that, but it might be difficult because of a corner
case related with the initial data sync.

First of all, I have made last_applied to update when

* transactions are committed, prepared, or aborted
* apply worker receives keepalive message.

I thought during the initial data sync, we must not update the last applied
triggered by keepalive messages, so following lines were added just after
updating last_received.

```
+                                               if (last_applied < end_lsn && AllTablesyncsReady())
+                                                       last_applied = end_lsn;
```

However, if data is synchronizing and workers receive the non-committable WAL,
this condition cannot be satisfied. 009_matviews.pl tests such a case, and I
got a failure there. In this test MATERIALIZED VIEW is created on publisher and then
the WAL is replicated to subscriber, but the transaction is not committed because
logical replication does not support the statement.
If we change the condition, we may the system may become inconsistent because the
worker replies that all remote WALs are applied even if tablesync workers are
synchronizing data.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


pgsql-hackers by date:

Previous
From: Hans Buschmann
Date:
Subject: Re: Making Vars outer-join aware
Next
From: Dean Rasheed
Date:
Subject: Re: Non-decimal integer literals