Re: Logical Replication - "invalid ordering of speculative insertion changes" - Mailing list pgsql-general

From Joe Wildish
Subject Re: Logical Replication - "invalid ordering of speculative insertion changes"
Date
Msg-id 8e285370-0c29-4a08-99f3-b8c0107eaf43@app.fastmail.com
In response to Logical Replication - "invalid ordering of speculative insertion changes"  ("Joe Wildish" <joe@lateraljoin.com>)
Responses Re: Logical Replication - "invalid ordering of speculative insertion changes"  (Rahila Syed <rahilasyed90@gmail.com>)
List pgsql-general
Just a bump on this --- perhaps the error is a bug with the DBMS?

From what I can see, "speculative insertion changes" in this context means INSERT..ON CONFLICT DML.  Although I have
some experience writing extensions and simple patches for the code base, I don't know anything as a developer about the
transaction log.  I dug around a bit and it appears that there is a specific record type inside the WAL that
differentiates INSERT from INSERT..ON CONFLICT changes, and it is these changes that cannot be re-ordered when trying to
emit the message to the output plugin for the logical replication slot (which in this case is the internal pgoutput
one). Given that, I don't see how it can be user error.  Unless anyone else knows differently?
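
To make the ordering constraint concrete, here is a toy Python model of the check as I understand it. It is only a sketch under my own assumptions, not the actual reorderbuffer.c logic: the names Action, Change, InvalidOrdering and replay are all hypothetical, and the only thing taken from the server is the error text. The assumption is that a speculative insert is held back until its confirmation arrives, and a confirmation with nothing held back is rejected.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Iterable, List, Optional


class Action(Enum):
    INSERT = auto()        # plain INSERT
    SPEC_INSERT = auto()   # speculative insert, e.g. from INSERT .. ON CONFLICT
    SPEC_CONFIRM = auto()  # confirmation that the speculative insert succeeded


@dataclass
class Change:
    action: Action
    row: Optional[dict] = None


class InvalidOrdering(Exception):
    """Raised when a confirmation has no speculative insert to confirm."""


def replay(changes: Iterable[Change]) -> List[dict]:
    """Return the rows that would be handed to the output plugin."""
    emitted: List[dict] = []
    pending: Optional[Change] = None  # speculative insert awaiting confirmation
    for change in changes:
        if change.action is Action.SPEC_INSERT:
            # Hold the row back; a newer speculative insert replaces an older one.
            pending = change
        elif change.action is Action.SPEC_CONFIRM:
            if pending is None:
                # Assumed failure mode: confirm seen without its insert.
                raise InvalidOrdering(
                    "invalid ordering of speculative insertion changes"
                )
            emitted.append(pending.row)
            pending = None
        else:
            emitted.append(change.row)
    return emitted
```

In this model a stream of INSERT, SPEC_INSERT, SPEC_CONFIRM replays cleanly, while a lone SPEC_CONFIRM raises. If the real check works along these lines, the stream that triggers it is produced on the publisher side, which is why I doubt this is something I can fix from the subscriber.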
 

-Joe

On Tue, 31 Jan 2023, at 18:10, Joe Wildish wrote:
> Hello,
>
> We have a logical replication publisher (13.7) and subscriber (14.6) 
> where we are seeing the following error on the subscriber. IP address 
> and publication name changed, otherwise verbatim:
>
> 2023-01-31 15:24:49 UTC:x.x.x.x(56276):super@pubdb:[1040971]: WARNING:  
> tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... 
> REFRESH PUBLICATION to subscribe the tables
> 2023-01-31 15:24:50 UTC::@:[1040975]: LOG:  logical replication apply 
> worker for subscription "pub" has started
> 2023-01-31 15:24:50 UTC::@:[1040975]: ERROR:  could not receive data 
> from WAL stream: ERROR:  invalid ordering of speculative insertion 
> changes
>
> This error occurs during the initial set up of the subscription.  We 
> hit REFRESH, and then immediately it goes into this error state. It 
> then repeats as it is retrying from here onwards and keeps hitting the 
> same error.
>
> My understanding is that the subscriber is performing some kind of 
> reordering of the events contained within the WAL message. As it cannot 
> then consume the message, it aborts, retries, and gets the same message 
> and errors again.  Looking in the source code it seems there is only 
> one place where this error can be emitted --- reorderbuffer.c:2179.  
> Moreover I can't tell if this is an error that I can be expected to 
> recover from as a user.
>
> We see this error only sometimes. Other times, we REFRESH the 
> subscription and it makes progress as one would expect.
>
> Can anyone advise on what we are doing wrong here?
>
> -Joe


