Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Date | |
Msg-id | CAHut+Pvbo-AL9x4GdXGaTnq=_xUOeaDvyTuk6sHhXZE8v+mZGA@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher (Önder Kalacı <onderkalaci@gmail.com>) |
Responses |
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
List | pgsql-hackers |
Here are some review comments for v35-0001 ====== General 1. Saying the index "should" or "should not" do this or that sounds like it is still OK but just not recommended. TO remove this ambigity IMO most of the "should" ought to be changed to "must" because IIUC this patch will simply not consider indexes which do not obey all your rules. This comment applies to a few places (search for "should") e.g.1. - Commit Message e.g.2. - /* There should always be at least one attribute for the index scan. */ e.g.3. - The function comment for FindUsableIndexForReplicaIdentityFull(Relation localrel) ====== doc/src/sgml/logical-replication.sgml 2. A published table must have a “replica identity” configured in order to be able to replicate UPDATE and DELETE operations, so that appropriate rows to update or delete can be identified on the subscriber side. By default, this is the primary key, if there is one. Another unique index (with certain additional requirements) can also be set to be the replica identity. If the table does not have any suitable key, then it can be set to replica identity “full”, which means the entire row becomes the key. When replica identity “full” is specified, indexes can be used on the subscriber side for searching the rows. Candidate indexes must be btree, non-partial, and have at least one column reference (i.e. cannot consist of only expressions). These restrictions on the non-unique index properties adheres some of the restrictions that are enforced for primary keys. Internally, we follow a similar approach for supporting index scans within logical replication scope. If there are no such suitable indexes, the search on the subscriber side can be very inefficient, therefore replica identity “full” should only be used as a fallback if no other solution is possible. If a replica identity other than “full” is set on the publisher side, a replica identity comprising the same or fewer columns must also be set on the subscriber side. See REPLICA IDENTITY for details on how to set the replica identity. If a table without a replica identity is added to a publication that replicates UPDATE or DELETE operations then subsequent UPDATE or DELETE operations will cause an error on the publisher. INSERT operations can proceed regardless of any replica identity. ~ "adheres some of" --> "adhere to some of" ? ====== src/backend/executor/execReplication.c 3. build_replindex_scan_key { Oid operator; Oid opfamily; RegProcedure regop; - int pkattno = attoff + 1; - int mainattno = indkey->values[attoff]; - Oid optype = get_opclass_input_type(opclass->values[attoff]); + int table_attno = indkey->values[index_attoff]; + Oid optype = get_opclass_input_type(opclass->values[index_attoff]); These variable declarations might look tidier if you kept all the Oids together. ====== src/backend/replication/logical/relation.c 4. logicalrep_partition_open + /* + * Finding a usable index is an infrequent task. It occurs when an + * operation is first performed on the relation, or after invalidation of + * the relation cache entry (such as ANALYZE or CREATE/DROP index on the + * relation). + * + * We also prefer to run this code on the oldctx such that we do not + * leak anything in the LogicalRepPartMapContext (hence + * CacheMemoryContext). + */ + entry->localindexoid = FindLogicalRepLocalIndex(partrel, remoterel) "such that" --> "so that" ? ~~~ 5. IsIndexUsableForReplicaIdentityFull +bool +IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo) +{ + bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID); + bool is_partial = (indexInfo->ii_Predicate != NIL); + bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo); + + if (is_btree && !is_partial && !is_only_on_expression) + { + return true; + } + + return false; +} SUGGESTION (no need for 2 returns here) return is_btree && !is_partial && !is_only_on_expression; ====== src/backend/replication/logical/worker.c 6. FindReplTupleInLocalRel static bool FindReplTupleInLocalRel(EState *estate, Relation localrel, LogicalRepRelation *remoterel, Oid localidxoid, TupleTableSlot *remoteslot, TupleTableSlot **localslot) { bool found; /* * Regardless of the top-level operation, we're performing a read here, so * check for SELECT privileges. */ TargetPrivilegesCheck(localrel, ACL_SELECT); *localslot = table_slot_create(localrel, &estate->es_tupleTable); Assert(OidIsValid(localidxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)); if (OidIsValid(localidxoid)) found = RelationFindReplTupleByIndex(localrel, localidxoid, LockTupleExclusive, remoteslot, *localslot); else found = RelationFindReplTupleSeq(localrel, LockTupleExclusive, remoteslot, *localslot); return found; } ~ Since that 'found' variable is not used, you might as well remove the if/else and simplify the code. SUGGESTION static bool FindReplTupleInLocalRel(EState *estate, Relation localrel, LogicalRepRelation *remoterel, Oid localidxoid, TupleTableSlot *remoteslot, TupleTableSlot **localslot) { /* * Regardless of the top-level operation, we're performing a read here, so * check for SELECT privileges. */ TargetPrivilegesCheck(localrel, ACL_SELECT); *localslot = table_slot_create(localrel, &estate->es_tupleTable); Assert(OidIsValid(localidxoid) || (remoterel->replident == REPLICA_IDENTITY_FULL)); if (OidIsValid(localidxoid)) return RelationFindReplTupleByIndex(localrel, localidxoid, LockTupleExclusive, remoteslot, *localslot); return RelationFindReplTupleSeq(localrel, LockTupleExclusive, remoteslot, *localslot); ~~~ 7. apply_handle_tuple_routing @@ -2890,6 +2877,7 @@ apply_handle_tuple_routing(ApplyExecutionData *edata, TupleConversionMap *map; MemoryContext oldctx; LogicalRepRelMapEntry *part_entry = NULL; + AttrMap *attrmap = NULL; /* ModifyTableState is needed for ExecFindPartition(). */ The added whitespace seems unrelated to this patch. ====== src/include/replication/logicalrelation.h 8. @@ -31,6 +32,7 @@ typedef struct LogicalRepRelMapEntry Relation localrel; /* relcache entry (NULL when closed) */ AttrMap *attrmap; /* map of local attributes to remote ones */ bool updatable; /* Can apply updates/deletes? */ + Oid localindexoid; /* which index to use, or InvalidOid if none */ Indentation is not correct for that new member comment. ------ Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: