Re: Errors with schema migration and logical replication — expected? - Mailing list pgsql-general

From Mike Lissner
Subject Re: Errors with schema migration and logical replication — expected?
Date
Msg-id CAMp9=EzTG4Zv-PQCntsd6GBuCqr5SyMyX=qR84KvMesWQh4ruw@mail.gmail.com
Whole thread Raw
In response to Re: Errors with schema migration and logical replication — expected?  (Mike Lissner <mlissner@michaeljaylissner.com>)
Responses Re: Errors with schema migration and logical replication — expected?  (Adrian Klaver <adrian.klaver@aklaver.com>)
List pgsql-general
Reupping this since it was over the weekend and looks like a bug in logical replication. My problems are solved, but some very weird things happened when doing a schema migration.

On Sun, Dec 9, 2018 at 5:48 PM Mike Lissner <mlissner@michaeljaylissner.com> wrote:
On Sun, Dec 9, 2018 at 12:42 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:

1) Using psql have you verified that NOT NULL is set on that column on
the publisher?

Yes, on the publisher and the subscriber. That was my first step when I saw the log lines about this.

2) And that the row that failed in the subscriber is in the publisher table.

Yep, it's there (though it doesn't show a null for that column, and I don't know how it ever could have).
 
3) That there are no NULL values in the publisher column?

This on the publisher:

select * from search_docketentry where recap_sequence_number is null;

returns zero rows, so yeah, no nulls in there (which makes sense since they're not allowed).

Whatever the answers to 1), 2) and 3) are the next question is:

4) Do you want/need recap_sequence_number to be NOT NULL.

Yes, and indeed that's how it always has been.

a) If not then you could leave things as they are.

Well, I was able to fix this by briefly allowing nulls on the subscriber, letting it catch up with the publisher, setting all nulls to empty strings (a Django convention), and then disallowing nulls again. After letting it catch up, there were 118 nulls on the subscriber in this column:


That shouldn't be possible since nulls were never allowed in this column on the publisher.
 
b) If so then you:

        1) Have to figure out what is sending NULL values to the column.

            Maybe a model that has null=True set when it shouldn't be?

Nope, never had that. I'm 100% certain.
 
            A Form/ModelForm that is allowing None/Null?

Even if that was the case, the error wouldn't have shown up on the subscriber since that null would have never been allowed in the publisher. But anyway, I don't use any forms with this column.
 
            Some code that is operating outside the ORM e.g. doing a
           direct query using from django.db import connection.

That's an idea, but like I said, nothing sends SQL to the subscriber (not even read requests), and this shouldn't have been possible in the publisher due to the NOT NULL constraint that has *always* been on that column.

         2) Clean up the NULL values in the column in the subscriber
            and/or publisher.

There were only NULL values in the subscriber, never in the publisher. Something is amiss here.

I appreciate all the responses. I'm scared to say so, but I think this is a bug in logical replication. Somehow a null value appeared at the subscriber that was never in the publisher.

I also still have this question/suggestion from my first email:

> Is the process for schema migrations documented somewhere beyond the above?

Thank you again,

Mike

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Search path & functions in temporary schemas
Next
From: Adrian Klaver
Date:
Subject: Re: Errors with schema migration and logical replication — expected?