Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Önder Kalacı
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CACawEhUQjxYU=UAamdnfaJU0uGJFXzhxrOT-km343kZYpeCGfw@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
List pgsql-hackers
Hi,


Few comments:
===============
1.
+   identity.  When replica identity <literal>FULL</literal> is specified,
+   indexes can be used on the subscriber side for searching the rows. These
+   indexes should be btree,

Why only btree and not others like a hash index? Also, there should be
some comments in FindUsableIndexForReplicaIdentityFull() to explain
the choices.

I updated the comment(s).

For a more technical reference, we have these restrictions, because we rely on
RelationFindReplTupleByIndex() which is designed to handle PK/RI. And,
RelationFindReplTupleByIndex() is written in a way that only expects 
indexes with these limitations.

In order to keep the changes as small as possible, I refrained from relaxing this 
limitation for now. I'm definitely up to working on this for relaxing these
limitations, and practically allowing more cases for non-unique indexes. 
 

2.
- * This is not generic routine, it expects the idxrel to be replication
- * identity of a rel and meet all limitations associated with that.
+ * This is not a generic routine - it expects the idxrel to be an index
+ * that planner would choose if the searchslot includes all the columns
+ * (e.g., REPLICA IDENTITY FULL on the source).
  */
-static bool
+static int

This comment is not clear to me. Which change here makes the
expectation like that? Which planner function/functionality are you
referring to here?

Ops, planner related comments are definitely stale. As you might 
remember, in the earlier iterations of this patch, we had some
planner functions to pick  indexes for us.

Anyway, I think even for that version of the patch, this comment was
wrong. Updated now, does that look better?
 

3.
+/*
+ * Given a relation and OID of an index, returns true if
+ * the index is relation's primary key's index or
+ * relation's replica identity index.

It seems the line length is a bit off in the above comments. There
could be a similar mismatch in other places. You might want to run
pgindent.

Makes sense, run pgindent. But it didn't fix this specific instance automatically,
I changed that manually.
 

4.
+}
+
+
+/*
+ * Returns an index oid if there is an index that can be used

Spurious empty line.

fixed
 

5.
- /*
- * We are in error mode so it's fine this is somewhat slow. It's better to
- * give user correct error.
- */
- if (OidIsValid(GetRelationIdentityOrPK(rel->localrel)))
+ /* Give user more precise error if possible. */
+ if (OidIsValid(rel->usableIndexOid))
  {
  ereport(ERROR,
  (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),

Is this change valid? I mean this could lead to the error "publisher
did not send replica identity column expected by the logical
replication target relation" when it should have given an error:
"logical replication target relation \"%s.%s\" has neither REPLICA
IDENTITY index nor PRIMARY ...


 Hmm, that's right, we'd get a wrong error message. 

I spent quite a bit of time trying to understand whether we'd 
need anything additional after this patch regarding this function,
but my current understanding is that we should just leave the
check as-is. It is mainly because when REPLICA IDENTITY is
full, there is no need to check anything further (other than the
check that is at the bottom of the function)


Attached are both patches: the main patch, and the patch that
optionally disables the index scans. Let's discuss the necessity
for the second patch in the lights of the data we collect with
some more tests.

Thanks,
Onder


Attachment

pgsql-hackers by date:

Previous
From: "Drouvot, Bertrand"
Date:
Subject: Re: Minimal logical decoding on standbys
Next
From: Peter Eisentraut
Date:
Subject: Re: Move defaults toward ICU in 16?