Re: Proposal: Conflict log history table for Logical Replication - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Proposal: Conflict log history table for Logical Replication
Date
Msg-id CAFiTN-s3ZFHteQsiC3H4=AjTWxuwN-w69XQ3xL5X6YOMTua4pA@mail.gmail.com
Whole thread Raw
In response to Re: Proposal: Conflict log history table for Logical Replication  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Tue, Dec 2, 2025 at 4:45 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 2, 2025 at 2:47 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Dec 2, 2025 at 12:38 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Dec 2, 2025 at 12:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > >
> > > >
> > > > Okay, I will try to make multiple local rows as JSON Array in the next version.
> > > >
> > > Just to clarify so that we are on the same page, along with the local
> > > tuple the other local fields like local_xid, local_commit_ts,
> > > local_origin will also be converted into the array.  Hope that makes
> > > sense?
> > >
> >
> > Yes, what about key_tuple or RI?
> >
> > > So we will change the table like this, not sure if this makes sense to
> > > keep all local array fields nearby in the table, or let it be near the
> > > respective remote field, like we are doing now remote_xid and local
> > > xid together etc.
> > >
> >
> > It is better to keep the array fields together at the end. I think it
> > would be better to read via CLI. Also, it may take more space due to
> > padding/alignment if we store fixed-width and variable-width columns
> > interleaved and similarly the access will also be slower for
> > interleaved cases.
> >
> > Having said that, can we consider an alternative way to store all
> > local_conflict_info together as a JSONB column (that can be used to
> > store an array of objects). For example, the multiple conflicting
> > tuple information can be stored as:
> >
> > [
> > { "xid": "1001", "commit_ts": "2023-10-27 10:00:00", "origin":
> > "node_A", "tuple": { "id": 1, "email": "a@b.com" } },
> > { "xid": "1005", "commit_ts": "2023-10-27 10:01:00", "origin":
> > "node_B", "tuple": { "id": 2, "phone": "555-0199" } }
> > ]
> >
> > To access JSON array columns, I think one needs to use the unnest
> > function, whereas JSONB could be accessed with something like: "SELECT
> > * FROM conflicts WHERE local_conflicts @> '[{"xid": "1001"}]".
>
> Yeah we can do that as well, maybe that's a better idea compared to
> creating separate array fields for each local element.

So I tried the POC idea with this approach and tested with one of the
test cases given by Shveta, and now the conflict log table entry looks
like this.  So we can see the local conflicts field which is an array
of JSON and each entry of the array is formed using (xid, commit_ts,
origin, json tuple).  I will send the updated patch by tomorrow after
doing some more cleanup and testing.

relid             | 16391
schemaname        | public
relname           | conf_tab
conflict_type     | multiple_unique_conflicts
remote_xid        | 761
remote_commit_lsn | 0/01761400
remote_commit_ts  | 2025-12-02 15:02:07.045935+00
remote_origin     | pg_16406
key_tuple         |
remote_tuple      | {"a":2,"b":3,"c":4}
local_conflicts   |

{"{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":2,\"b\":2,\"c\":2}}","{\"xid\":\"

773\",\"commit_ts\":\"2025-12-02T15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":3,\"b\":3,\"c\":3}}","{\"xid\":\"773\",\"commit_ts\":\"2025-12-02T
15:02:00.640253+00:00\",\"origin\":\"\",\"tuple\":{\"a\":4,\"b\":4,\"c\":4}}"}


--
Regards,
Dilip Kumar
Google



pgsql-hackers by date:

Previous
From: jian he
Date:
Subject: Re: IS JSON predicate support for domain base type as JSON/JSONB/BYTEA/TEXT
Next
From: David Geier
Date:
Subject: Re: Consistently use palloc_object() and palloc_array()