RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers
From | shiy.fnst@fujitsu.com |
---|---|
Subject | RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Date | |
Msg-id | OSZPR01MB63103A4AFBBA56BAF8AE7FAAFDA39@OSZPR01MB6310.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
|
List | pgsql-hackers |
On Sat, Feb 4, 2023 7:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Feb 2, 2023 at 2:03 PM Önder Kalacı <onderkalaci@gmail.com> wrote: > > > >> > >> and if there's more > >> than one candidate index pick any one. Additionally, we can allow > >> disabling the use of an index scan for this particular case. If we are > >> too worried about API change for allowing users to specify the index > >> then we can do that later or as a separate patch. > >> > > > > On v23, I dropped the planner support for picking the index. Instead, it simply > > iterates over the indexes and picks the first one that is suitable. > > > > I'm currently thinking on how to enable users to override this decision. > > One option I'm leaning towards is to add a syntax like the following: > > > > ALTER SUBSCRIPTION .. ALTER TABLE ... SET INDEX ... > > > > Though, that should probably be a seperate patch. I'm going to work > > on that, but still wanted to share v23 given picking the index sounds > > complementary, not strictly required at this point. > > > > I agree that it could be a separate patch. However, do you think we > need some way to disable picking the index scan? This is to avoid > cases where sequence scan could be better or do we think there won't > exist such a case? > I think such a case exists. I tried the following cases based on v23 patch. # Step 1. Create publication, subscription and tables. -- on publisher create table tbl (a int); alter table tbl replica identity full; create publication pub for table tbl; -- on subscriber create table tbl (a int); create index idx_a on tbl(a); create subscription sub connection 'dbname=postgres port=5432' publication pub; # Step 2. Setup synchronous replication. # Step 3. Execute SQL query on publisher. -- case 1 (All values are duplicated) truncate tbl; insert into tbl select 1 from generate_series(0,10000)i; update tbl set a=a+1; -- case 2 truncate tbl; insert into tbl select i%3 from generate_series(0,10000)i; update tbl set a=a+1; -- case 3 truncate tbl; insert into tbl select i%5 from generate_series(0,10000)i; update tbl set a=a+1; -- case 4 truncate tbl; insert into tbl select i%10 from generate_series(0,10000)i; update tbl set a=a+1; -- case 5 truncate tbl; insert into tbl select i%100 from generate_series(0,10000)i; update tbl set a=a+1; -- case 6 truncate tbl; insert into tbl select i%1000 from generate_series(0,10000)i; update tbl set a=a+1; -- case 7 (No duplicate value) truncate tbl; insert into tbl select i from generate_series(0,10000)i; update tbl set a=a+1; # Result The time executing update (the average of 3 runs is taken, the unit is milliseconds): +--------+---------+---------+ | | patched | master | +--------+---------+---------+ | case 1 | 3933.68 | 1298.32 | | case 2 | 1803.46 | 1294.42 | | case 3 | 1380.82 | 1299.90 | | case 4 | 1042.60 | 1300.20 | | case 5 | 691.69 | 1297.51 | | case 6 | 578.50 | 1300.69 | | case 7 | 566.45 | 1302.17 | +--------+---------+---------+ In case 1~3, there's an overhead after applying the patch. In other cases, the patch improved the performance. As more duplicate values, the greater the overhead after applying the patch. Regards, Shi Yu
pgsql-hackers by date: