Re: row filtering for logical replication - Mailing list pgsql-hackers
From | Euler Taveira |
---|---|
Subject | Re: row filtering for logical replication |
Date | |
Msg-id | 532a18d8-ce90-4444-8570-8a9fcf09f329@www.fastmail.com Whole thread Raw |
In response to | Re: row filtering for logical replication (Peter Smith <smithpb2250@gmail.com>) |
Responses |
Re: row filtering for logical replication
("Euler Taveira" <euler@eulerto.com>)
Re: row filtering for logical replication (Amit Kapila <amit.kapila16@gmail.com>) Re: row filtering for logical replication (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
On Fri, Jul 2, 2021, at 4:29 AM, Peter Smith wrote:
Hi.I have been looking at the latest patch set (v16). Below are my reviewcomments and some patches.
Peter, thanks for your detailed review. Comments are inline.
1. Patch 0001 comment - typoyou can optionally filter rows that does not satisfy a WHERE conditiontypo: does/does
Fixed.
2. Patch 0001 comment - typoThe WHERE clause should probably contain only columns that are part ofthe primary key or that are covered by REPLICA IDENTITY. Otherwise,and DELETEs won't be replicated.typo: "Otherwise, and DELETEs" ??
Fixed.
3. Patch 0001 comment - typo and clarificationIf your publication contains partitioned table, the parameterpublish_via_partition_root determines if it uses the partition row filter (ifthe parameter is false -- default) or the partitioned table row filter.Typo: "contains partitioned table" -> "contains a partitioned table"
Fixed.
Also, perhaps the text "or the partitioned table row filter." shouldsay "or the root partitioned table row filter." to disambiguate thecase where there are more levels of partitions like A->B->C. e.g. Whatfilter does C use?
I agree it can be confusing. BTW, CREATE PUBLICATION does not mention that the
root partitioned table is used. We should improve that sentence too.
4. src/backend/catalog/pg_publication.c - misleading names-publication_add_relation(Oid pubid, Relation targetrel,+publication_add_relation(Oid pubid, PublicationRelationInfo *targetrel,bool if_not_exists)Leaving this parameter name as "targetrel" seems a bit misleading nowin the function code. Maybe this should be called something like "pri"which is consistent with other places where you have declaredPublicationRelationInfo.Also, consider declaring some local variables so that the patch mayhave less impact on existing code. e.g.Oid relid = pri->relidRelation *targetrel = relationinfo->relation
Done.
5. src/backend/commands/publicationcmds.c - simplify code- rels = OpenTableList(stmt->tables);+ if (stmt->tableAction == DEFELEM_DROP)+ rels = OpenTableList(stmt->tables, true);+ else+ rels = OpenTableList(stmt->tables, false);Consider writing that code more simply as just:rels = OpenTableList(stmt->tables, stmt->tableAction == DEFELEM_DROP);
It is not a common pattern to use an expression as a function argument in
Postgres. I prefer to use a variable with a suggestive name.
6. src/backend/commands/publicationcmds.c - bug?- CloseTableList(rels);+ CloseTableList(rels, false);}Is this a potential bug? When you called OpenTableList the 2nd paramwas maybe true/false, so is it correct to be unconditionally falsehere? I am not sure.
Good catch.
7. src/backend/commands/publicationcmds.c - OpenTableList function comment.* Open relations specified by a RangeVar list.+ * AlterPublicationStmt->tables has a different list element, hence, is_drop+ * indicates if it has a RangeVar (true) or PublicationTable (false).* The returned tables are locked in ShareUpdateExclusiveLock mode in order to* add them to a publication.I am not sure about this. Should that comment instead say "indicatesif it has a Relation (true) or PublicationTable (false)"?
Fixed.
8. src/backend/commands/publicationcmds.c - OpenTableList
>8
For some reason it feels kind of clunky to me for this function to beprocessing the list differently according to the 2nd param. e.g. thename "is_drop" seems quite unrelated to the function code, and more todo with where it was called from. Sorry, I don't have any better ideasfor improvement atm.
My suggestion is to rename it to "pub_drop_table".
9. src/backend/commands/publicationcmds.c - OpenTableList bug?
>8
I felt maybe this is a possible bug here because there seems no codeexplicitly assigning the whereClause = NULL if "is_drop" is true somaybe it can have a garbage value which could cause problems later.Maybe this is fixed by using palloc0.
Fixed.
10. src/backend/commands/publicationcmds.c - CloseTableList function comment
>8
Probably the meaning of "is_drop" should be described in this function comment.
Done.
11. src/backend/replication/pgoutput/pgoutput.c - get_rel_sync_entry signature.-static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Oid relid);+static RelationSyncEntry *get_rel_sync_entry(PGOutputData *data, Relation rel);I see that this function signature is modified but I did not see howthis parameter refactoring is actually related to the RowFilter patch.Perhaps I am mistaken, but IIUC this only changes the relid =RelationGetRelid(rel); to be done inside this function instead ofbeing done outside by the callers.
It is not critical for this patch so I removed it.
12. src/backend/replication/pgoutput/pgoutput.c - missing function commentsThe static functions create_estate_for_relation andpgoutput_row_filter_prepare_expr probably should be commented.
Done.
13. src/backend/replication/pgoutput/pgoutput.c -pgoutput_row_filter_prepare_expr function name+static ExprState *pgoutput_row_filter_prepare_expr(Node *rfnode,EState *estate);This function has an unfortunate name with the word "prepare" in it. Iwonder if a different name can be found for this function to avoid anyconfusion with pgoutput functions (coming soon) which are related tothe two-phase commit "prepare".
The word "prepare" is related to the executor context. The function name
contains "row_filter" that is sufficient to distinguish it from any other
function whose context is "prepare". I replaced "prepare" with "init".
14. src/bin/psql/describe.c+ if (!PQgetisnull(tabres, j, 2))+ appendPQExpBuffer(&buf, " WHERE (%s)",+ PQgetvalue(tabres, j, 2));Because the where-clause value already has enclosing parentheses sousing " WHERE (%s)" seems overkill here. e.g. you can see the effectin your src/test/regress/expected/publication.out file. I think thisshould be changed to " WHERE %s" to give better output.
Peter E suggested that extra parenthesis be added. See 0005 [1].
15. src/include/catalog/pg_publication.h - new typedef+typedef struct PublicationRelationInfo+{+ Oid relid;+ Relation relation;+ Node *whereClause;+} PublicationRelationInfo;+The new PublicationRelationInfo should also be addedsrc/tools/pgindent/typedefs.list
Patches usually don't update typedefs.list. Check src/tools/pgindent/README.
16. src/include/nodes/parsenodes.h - new typedef+typedef struct PublicationTable+{+ NodeTag type;+ RangeVar *relation; /* relation to be published */+ Node *whereClause; /* qualifications */+} PublicationTable;The new PublicationTable should also be added src/tools/pgindent/typedefs.list
Idem.
17. sql/publication.sql - show more output+CREATE PUBLICATION testpub5 FOR TABLE testpub_rf_tbl1,testpub_rf_tbl2 WHERE (c <> 'test' AND d < 5);+RESET client_min_messages;+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl3 WHERE (e > 1000AND e < 2000);+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl2;+-- remove testpub_rf_tbl1 and add testpub_rf_tbl3 again (anotherWHERE expression)+ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (e > 300AND e < 500);+-- fail - functions disallowed+ALTER PUBLICATION testpub5 ADD TABLE testpub_rf_tbl4 WHERE (length(g) < 6);+-- fail - WHERE not allowed in DROP+ALTER PUBLICATION testpub5 DROP TABLE testpub_rf_tbl3 WHERE (e < 27);+\dRp+ testpub5I felt that it would be better to have a "\dRp+ testpub5" after eachof the valid ALTER PUBLICATION steps to show the intermediate resultsalso; not just the final one at the end.
Done.
18. src/test/subscription/t/020_row_filter.pl - rename fileI think this file should be renamed to 021_row_filter.pl as there isalready an 020 TAP test present.
Done.
19. src/test/subscription/t/020_row_filter.pl - test commentsAFAIK the test cases are all OK, but it was really quite hard toreview these TAP tests to try to determine what the expected resultsshould be.
I included your comments but heavily changed it.
20. src/test/subscription/t/020_row_filter.pl - missing test case?There are some partition tests, but I did not see any test that waslike 3 levels deep like A->B->C, so I was not sure if there is anycase C would ever make use of the filter of its parent B, or would itonly use the filter of the root A?
I didn't include it yet. There is an issue with initial synchronization and
partitioned table when you set publish_via_partition_root. I'll start another
thread for this issue.
21. src/test/subscription/t/020_row_filter.pl - missing test case?If the same table is in multiple publications they can each have a rowfilter. And a subscription might subscribe to some but not all ofthose publications. I think this scenario is only partly tested.
8<
e.g.pub_1 has tableX with RowFilter1pub_2 has tableX with RowFilter2Then sub_12 subscribes to pub_1, pub_2This is already tested in your TAP test (I think) and it makes sureboth filters are appliedBut if there was alsopub_3 has tableX with RowFilter3Then sub_12 still should only be checking the filtered RowFilter1 ANDRowFilter2 (but NOT row RowFilter3). I think this scenario is nottested.
I added a new publication tap_pub_not_used to cover this case.
POC PATCH FOR PLAN CACHE========================PSA a POC patch for a plan cache which gets used inside thepgoutput_row_filter function instead of calling prepare for every row.I think this is implementing something like Andes was suggesting awhile back [1].
I also had a WIP patch for it (that's very similar to your patch) so I merged
it.
This cache mechanism consists of caching ExprState and avoid calling
pgoutput_row_filter_init_expr() for every single row. Greg N suggested in
another email that tuple table slot should also be cached to avoid a few cycles
too. It is also included in this new patch.
Measurements with/without this plan cache:Time spent processing within the pgoutput_row_filter function- Data was captured using the same technique as the0002-Measure-row-filter-overhead.patch.- Inserted 1000 rows, sampled data for the first 100 times in this function.not cached: average ~ 28.48 uscached: average ~ 9.75 usReplication times:- Using tables and row filters same as in Onder's commands_to_test_perf.sql [2]100K rows - not cached: ~ 42sec, 43sec, 44sec100K rows - cached: ~ 41sec, 42sec, 42 sec.There does seem to be a tiny gain achieved by having the plan cache,but I think the gain might be a lot less than what people wereexpecting.
I did another measure using as baseline the previous patch (v16).
without cache (v16)
---------------------------
mean: 1.46 us
stddev: 2.13 us
median: 1.39 us
min-max: [0.69 .. 1456.69] us
percentile(99): 3.15 us
mode: 0.91 us
with cache (v18)
-----------------------
mean: 0.63 us
stddev: 1.07 us
median: 0.55 us
min-max: [0.29 .. 844.87] us
percentile(99): 1.38 us
mode: 0.41 us
It represents -57%. It is a really good optimization for just a few extra lines
of code.
Attachment
pgsql-hackers by date: