Re: [HACKERS] Help required to debug pg_repack breaking logical replication - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: [HACKERS] Help required to debug pg_repack breaking logical replication
Date
Msg-id CAMsr+YFt_J6BpiYEv6Ri90h6mLGZpada5zZ0kG6O+HbN2zUEcg@mail.gmail.com
In response to [HACKERS] Help required to debug pg_repack breaking logical replication  (Daniele Varrazzo <daniele.varrazzo@gmail.com>)
Responses Re: [HACKERS] Help required to debug pg_repack breaking logical replication
List pgsql-hackers
On 8 October 2017 at 02:37, Daniele Varrazzo <daniele.varrazzo@gmail.com> wrote:
> Hello,
>
> we have received reports of, and I have experienced a couple of times,
> pg_repack breaking logical replication.
>
> - https://github.com/reorg/pg_repack/issues/135
> - https://github.com/2ndQuadrant/pglogical/issues/113

Yeah, I was going to say I've seen reports of this with pglogical, but
I see you've linked to them.

I haven't had a chance to look into it though, and haven't had a
suitable reproducible test case.

> In the above issue #113, Petr Jelinek commented:
>
>> From a quick look at pg_repack, the way it does table rewrite is almost guaranteed
>> to break logical decoding unless there are zero unconsumed changes for a given table,
>> as it does not build the necessary mappings info for logical decoding that standard
>> heap rewrite in postgres does.
>
> Unfortunately, he didn't follow up on requests for further details.

At a guess he's referring to src/backend/access/heap/rewriteheap.c .

I'd explain better if I understood what was going on myself, but I
haven't really understood the logical decoding parts of that code.

> - Is Petr's diagnosis right, and is the freezing of logical replication
> to be blamed on the missing mapping?
> - Can you suggest a test to reproduce the issue reliably?
> - What are mapped relations anyway?

I can't immediately give you the answers you seek, but start by
studying src/backend/access/heap/rewriteheap.c . Look in particular at
logical_begin_heap_rewrite, logical_rewrite_heap_tuple, and
logical_end_heap_rewrite.

As a wild "I haven't read any of the relevant code in detail yet" stab
in the dark, I'd guess pg_repack is failing to do the bookkeeping
required by logical decoding around relfilenode changes, cmin/cmax, etc.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


