Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers

From Önder Kalacı
Subject Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date
Msg-id CACawEhWvt_LOcEUivB_-GhkG9tm8gqfGNm-aaveGhs7cGjhszg@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher  (Önder Kalacı <onderkalaci@gmail.com>)
List pgsql-hackers
Hi,

2.
@@ -57,9 +60,6 @@ build_replindex_scan_key(ScanKey skey, Relation rel,
Relation idxrel,
  int2vector *indkey = &idxrel->rd_index->indkey;
  bool hasnulls = false;

- Assert(RelationGetReplicaIndex(rel) == RelationGetRelid(idxrel) ||
-    RelationGetPrimaryKeyIndex(rel) == RelationGetRelid(idxrel));

You have removed this assertion but there is a comment ("This is not
generic routine, it expects the idxrel to be replication identity of a
rel and meet all limitations associated with that.") atop this
function which either needs to be changed/removed and probably we
should think if the function needs some change after removing that
restriction.


Ack, I can see your point. I think, for example, we should skip index attributes that are not simple column references. And, probably whatever other restrictions that PRIMARY has, should be here.

Primary keys require:
- Unique: We don't need uniqueness, that's the point of this patch
- Valid index: Should not be an issue in this case, because planner would not pick non-valid index anyway.
- Non-Partial index: As discussed earlier in this thread, I really don't see any problems with partial indexes for this use-case. Please let me know if there is anything I miss.
- Deferrable - Immediate: As far as I can see, there is no such concepts for regular indexes, so does not apply here 
- Indexes with no expressions: This is the point where we require some minor changes inside/around `build_replindex_scan_key `. Previously, indexes on expressions could not be replica indexes. And, with this patch they can. However, the expressions cannot be used for filtering the tuples because of the way we create the restrictinfos. We essentially create  `WHERE col_1 = $1 AND col_2 = $2 .. col_n = $n` for the columns with equality operators available. In the case of expressions on the indexes, the planner would never pick such indexes with these restrictions. I changed `build_replindex_scan_key ` to reflect that, added a new assert and pushed tests with the following schema, and make sure the code behaves as expected:

CREATE TABLE people (firstname text, lastname text);
CREATE INDEX people_names_expr_only ON people ((firstname || ' ' || lastname));
CREATE INDEX people_names_expr_and_columns ON people ((firstname || ' ' || lastname), firstname, lastname);

Also did similar tests with indexes on jsonb fields. Does that help you avoid the concerns regarding indexes with expressions? 

I'll work on one of the other open items in the thread (e.g., analyze invalidation callback) separately.

Thanks,
Onder KALACI

Attachment

pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Maximize page freezing
Next
From: Tom Lane
Date:
Subject: Re: generic plans and "initial" pruning