Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers
From | Önder Kalacı |
---|---|
Subject | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Date | |
Msg-id | CACawEhW1q2Qx53JFDoObccP9d4HuTJxb6dA+b4qmHnJjFh68hg@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
Hi Peter, all
src/backend/replication/logical/relation.c
1. FindUsableIndexForReplicaIdentityFull
+static Oid
+FindUsableIndexForReplicaIdentityFull(Relation localrel)
+{
+ List *indexlist = RelationGetIndexList(localrel);
+ Oid usableIndex = InvalidOid;
+ ListCell *lc;
+
+ foreach(lc, indexlist)
+ {
+ Oid idxoid = lfirst_oid(lc);
+ bool isUsableIndex;
+ Relation indexRelation = index_open(idxoid, AccessShareLock);
+ IndexInfo *indexInfo = BuildIndexInfo(indexRelation);
+
+ isUsableIndex = IsIndexUsableForReplicaIdentityFull(indexInfo);
+
+ index_close(indexRelation, AccessShareLock);
+
+ if (isUsableIndex)
+ {
+ /* we found one eligible index, don't need to continue */
+ usableIndex = idxoid;
+ break;
+ }
+ }
+
+ return usableIndex;
+}
This comment is not functional -- if you prefer the code as-is, then
ignore this comment.
But, personally I would:
- Move some of that code from the declarations. I feel it would be
better if the index_open/index_close were both in the code-body
instead of half in declarations and half not.
- Remove the 'usableIndex' variable, and just return directly.
- Shorten all the long names (and use consistent 'idx' instead of
sometimes 'idx' and sometimes 'index')
SUGGESTION (YMMV)
static Oid
FindUsableIndexForReplicaIdentityFull(Relation localrel)
{
List *idxlist = RelationGetIndexList(localrel);
ListCell *lc;
foreach(lc, idxlist)
{
Oid idxoid = lfirst_oid(lc);
bool isUsableIdx;
Relation idxRel;
IndexInfo *idxInfo;
idxRel = index_open(idxoid, AccessShareLock);
idxInfo = BuildIndexInfo(idxRel);
isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo);
index_close(idxRel, AccessShareLock);
/* Return the first eligible index found */
if (isUsableIdx)
return idxoid;
}
return InvalidOid;
}
applied your suggestion. I think it made it slightly easier to follow.
======
.../subscription/t/032_subscribe_use_index.pl
2. SUBSCRIPTION CREATE/DROP INDEX WORKS WITHOUT ISSUES
2a.
# Testcase start: SUBSCRIPTION CREATE/DROP INDEX WORKS WITHOUT ISSUES
#
# This test ensures that after CREATE INDEX, the subscriber can automatically
# use one of the indexes (provided that it fulfils the requirements).
# Similarly, after DROP index, the subscriber can automatically switch to
# sequential scan
The last sentence is missing full-stop.
fixed
~
2b.
# now, create index and see that the index is used
$node_subscriber->safe_psql('postgres',
"CREATE INDEX test_replica_id_full_idx ON test_replica_id_full(x)");
Don't say "and see that the index is used". Yes, that is what this
whole test is doing, but it is not what the psql following this
comment is doing.
true
~
2c.
$node_publisher->safe_psql('postgres',
"UPDATE test_replica_id_full SET x = x + 1 WHERE x = 15;");
$node_publisher->wait_for_catchup($appname);
# wait until the index is used on the subscriber
The double blank lines here should be single.
~
fixed,
2d.
# now, the update could either use the test_replica_id_full_idx or
# test_replica_id_full_idy index it is not possible for user to control
# which index to use
This sentence should break at "it".
Aso "user" --> "the user"
SUGGESTION
# now, the update could either use the test_replica_id_full_idx or
# test_replica_id_full_idy index; it is not possible for the user to control
# which index to use
looks good, thanks
~
2e.
# let's also test dropping test_replica_id_full_idy and
# hence use test_replica_id_full_idx
I think you ought to have dropped the other (first) index because we
already know that the first index had been used (from earlier), but we
are not 100% sure if the 'y' index has been chosen yet.
make sense. Though in general it is hard to check pg_stat_all_indexes
for any of the indexes on this test, as we don't know the exact number
of tuples for each. Just wanted to explicitly note
~~~~
3. SUBSCRIPTION USES INDEX WITH MULTIPLE ROWS AND COLUMNS
3a.
# deletes 20 rows
$node_publisher->safe_psql('postgres',
"DELETE FROM test_replica_id_full WHERE x IN (5, 6);");
# updates 20 rows
$node_publisher->safe_psql('postgres',
"UPDATE test_replica_id_full SET x = 100, y = '200' WHERE x IN (1, 2);");
"deletes" --> "delete"
"updates" --> "update"
Well, I thought the command deletes but I guess delete looks better
~~~
4. SUBSCRIPTION USES INDEX WITH DROPPED COLUMNS
# updates 200 rows
$node_publisher->safe_psql('postgres',
"UPDATE test_replica_id_full SET x = x + 1 WHERE x IN (5, 6);");
"updates" --> "update"
"200 rows" ??? is that right -- 20 maybe ???
I guess this is from an earlier version of the patch, I fixed these types
of errors.
~~~
5. SUBSCRIPTION USES INDEX ON PARTITIONED TABLES
5a.
# updates rows and moves between partitions
$node_publisher->safe_psql('postgres',
"UPDATE users_table_part SET value_1 = 0 WHERE user_id = 4;");
"updates rows and moves between partitions" --> "update rows, moving
them to other partitions"
fixed, thanks
~
5b.
# deletes rows from different partitions
"deletes" --> "delete"
fixed, and searched for similar errors but couldn't see any more
~~~
6. SUBSCRIPTION DOES NOT USE INDEXES WITH ONLY EXPRESSIONS OR PARTIAL INDEX
6a.
# update 2 rows
$node_publisher->safe_psql('postgres',
"UPDATE people SET firstname = 'Nan' WHERE firstname = 'first_name_1';");
$node_publisher->safe_psql('postgres',
"UPDATE people SET firstname = 'Nan' WHERE firstname =
'first_name_2' AND lastname = 'last_name_2';");
IMO 'Nan' seemed a curious name to assign as a test value, because it
seems like it has a special meaning but in reality, I don't think it
does. Even 'xxx' would be better.
changed to "no-name" as "xxx" also looks not so good
~
6b.
# make sure none of the indexes is not used on the subscriber
$result = $node_subscriber->safe_psql('postgres',
"select sum(idx_scan) from pg_stat_all_indexes where indexrelname
IN ('people_names_expr_only', 'people_names_partial')");
is($result, qq(0), 'ensure subscriber tap_sub_rep_full updates two
rows via seq. scan with index on expressions');
~
Looks like a typo in this comment: "none of the indexes is not used" ???
dropped "not"
~~~
7. SUBSCRIPTION CAN USE INDEXES WITH EXPRESSIONS AND COLUMNS
7a.
# update 2 rows
$node_publisher->safe_psql('postgres',
"UPDATE people SET firstname = 'Nan' WHERE firstname = 'first_name_1';");
$node_publisher->safe_psql('postgres',
"UPDATE people SET firstname = 'Nan' WHERE firstname =
'first_name_3' AND lastname = 'last_name_3';");
Same as #6a, 'Nan' seems like a strange test value to assign to a name.
similarly, changed to no-name
~~~
8. Some NULL values
$node_publisher->safe_psql('postgres',
"INSERT INTO test_replica_id_full VALUES (1), (2), (3);");
$node_publisher->safe_psql('postgres',
"UPDATE test_replica_id_full SET x = x + 1 WHERE x = 1;");
$node_publisher->safe_psql('postgres',
"UPDATE test_replica_id_full SET x = x + 1 WHERE x = 3;");
~
8a.
For some reason, this insert/update psql was not commented.
added some
~
8b.
Maybe this test data could be more obvious by explicitly inserting the NULLs?
Well, that's a bit hard. We merged a few tests for perf reasons. And, I merged this
test with "missing column" test. Now, the NULL values are triggered due to
missing column on the source.
~~~
9. Unique index that is not primary key or replica identity
9a.
Why are other "Testcase start" comments all uppercase but not this one?
fixed, there was one more
~~~
10. SUBSCRIPTION USES INDEX WITH PUB/SUB different data
Why is there a mixed case in this "Test case:" comment?
no specific reason, fixed
~~~
11. PUBLICATION LACKS THE COLUMN ON THE SUBS INDEX
11a.
# The subsriber has an additional column compared to publisher,
# and the index is on that column. We still pick the index scan
# on the subscriber even though it is practically similar to
# sequential scan
Typo "subsriber"
I guess I fixed this in a recent iteration, I cannot find it.
Missing full-stop on last sentence.
Similarly, probably merged into another test.
Still went over the all such explanations in the test, and ensured
we have the full stop
~
11b.
# make sure that the subscriber has the correct data
# we only deleted 1 row
$result = $node_subscriber->safe_psql('postgres',
"SELECT sum(x) FROM test_replica_id_full");
is($result, qq(232), 'ensure subscriber has the correct data at the
end of the test');
Why does that say "deleted 1 row", when the previous operation was not a DELETE?
Probably due to merging multiple tests into one. Fixed now.
Again, thanks for thorough review. Attached v41.
See the reason for 0001 patch in [1].
Onder KALACI.
Attachment
pgsql-hackers by date: