RE: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL - Mailing list pgsql-hackers

From Hayato Kuroda (Fujitsu)
Subject RE: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL
Date
Msg-id TYAPR01MB58666F301B3D284B77DE7633F52DA@TYAPR01MB5866.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
Dear Peter,

Thanks for reviewing. PSA new version.
I planned to post new version after supporting more indexes, but it may take more time.
So I want to address comments from you once.

> ======
> src/backend/executor/execReplication.c
> 
> 1. get_equal_strategy_number
> 
> +/*
> + * Return the appropriate strategy number which corresponds to the equality
> + * comparisons.
> + *
> + * TODO: support other indexes: GiST, GIN, SP-GiST, BRIN
> + */
> +static int
> +get_equal_strategy_number(Oid opclass)
> +{
> + Oid am_method = get_opclass_method(opclass);
> + int ret;
> +
> + switch (am_method)
> + {
> + case BTREE_AM_OID:
> + ret = BTEqualStrategyNumber;
> + break;
> + case HASH_AM_OID:
> + ret = HTEqualStrategyNumber;
> + break;
> + case GIST_AM_OID:
> + case GIN_AM_OID:
> + case SPGIST_AM_OID:
> + case BRIN_AM_OID:
> + /* TODO */
> + default:
> + /* XXX: Do we have to support extended indexes? */
> + ret = InvalidStrategy;
> + break;
> + }
> +
> + return ret;
> +}
> 
> 1a.
> In the file syscache.c there are already some other functions like
> get_op_opfamily_strategy so I am wondering if this new function also
> really belongs in that file.

According to atop comment in the syscache.c, it contains routines which access
system catalog cache. get_equal_strategy_number() does not check it, so I don't
think it should be at the file.

> 1b.
> Should that say "operator" instead of "comparisons"?

Changed.

> 1c.
> "am" stands for "access method" so "am_method" is like "access method
> method" – is it correct?

Changed to "am".

> 2. build_replindex_scan_key
> 
>   int table_attno = indkey->values[index_attoff];
> + int strategy_number;
> 
> 
> Ought to say this is a strategy for "equality", so a better varname
> might be "equality_strategy_number" or "eq_strategy" or similar.

Changed to "eq_strategy".

> src/backend/replication/logical/relation.c
> 
> 3. IsIndexUsableForReplicaIdentityFull
> 
> It seems there is some overlap between this code which hardwired 2
> valid AMs and the switch statement in your other
> get_equal_strategy_number function which returns a strategy number for
> those 2 AMs.
> 
> Would it be better to make another common function like
> get_equality_strategy_for_am(), and then you don’t have to hardwire
> anything? Instead, you can say:
> 
> is_usable_type = get_equality_strategy_for_am(indexInfo->ii_Am) !=
> InvalidStrategy;

Added. How do you think?

> src/backend/utils/cache/lsyscache.c
> 
> 4. get_opclass_method
> 
> +/*
> + * get_opclass_method
> + *
> + * Returns the OID of the index access method operator family is for.
> + */
> +Oid
> +get_opclass_method(Oid opclass)
> +{
> + HeapTuple tp;
> + Form_pg_opclass cla_tup;
> + Oid result;
> +
> + tp = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass));
> + if (!HeapTupleIsValid(tp))
> + elog(ERROR, "cache lookup failed for opclass %u", opclass);
> + cla_tup = (Form_pg_opclass) GETSTRUCT(tp);
> +
> + result = cla_tup->opcmethod;
> + ReleaseSysCache(tp);
> + return result;
> +}
> 
> Is the comment correct? IIUC, this function is not doing anything for
> "families".

Modified to "class".

> src/test/subscription/t/034_hash.pl
> 
> 5.
> +# insert some initial data within the range 0-9 for y
> +$node_publisher->safe_psql('postgres',
> + "INSERT INTO test_replica_id_full SELECT i, (i%10)::text FROM
> generate_series(0,10) i"
> +);
> 
> Why does the comment only say "for y"?

After considering more, I thought we do not have to mention data.
So removed the part " within the range 0-9 for y".

> 6.
> +# wait until the index is used on the subscriber
> +# XXX: the test will be suspended here
> +$node_publisher->wait_for_catchup($appname);
> +$node_subscriber->poll_query_until('postgres',
> + q{select (idx_scan = 4) from pg_stat_all_indexes where indexrelname
> = 'test_replica_id_full_idx';}
> +  )
> +  or die
> +  "Timed out while waiting for check subscriber tap_sub_rep_full
> updates 4 rows via index";
> +
> 
> AFAIK this is OK but it was slightly misleading because it says
> "updates 4 rows" whereas the previous UPDATE was only for 2 rows. So
> here I think the 4 also counts the 2 DELETED rows. The comment can be
> expanded slightly to clarify this.

Clarified two rows were deleted.

> 7.
> FYI, when I ran the TAP test the result was this:
> 
> t/034_hash.pl ...................... 1/? # Tests were run but no plan
> was declared and done_testing() was not seen.
> t/034_hash.pl ...................... All 2 subtests passed
> t/100_bugs.pl ...................... ok
> 
> Test Summary Report
> -------------------
> t/034_hash.pl                    (Wstat: 0 Tests: 2 Failed: 0)
>   Parse errors: No plan found in TAP output
> Files=36, Tests=457, 338 wallclock secs ( 0.29 usr  0.07 sys + 206.73
> cusr 47.94 csys = 255.03 CPU)
> Result: FAIL
> 
> ~
> 
> Just adding the missing done_testing() seemed to fix this.

Right. I used meson build system and it said OK. Added.

Furthermore, I added a check to reject indexes which do not implement "amgettuple" API.
More detail, please see [1].

[1]:
https://www.postgresql.org/message-id/TYAPR01MB5866E02638D40C4D198334B4F52DA%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment

pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: pg_upgrade and cross-library upgrades
Next
From: Alexander Lakhin
Date:
Subject: Re: benchmark results comparing versions 15.2 and 16