Re: Selectively invalidate caches in pgoutput module - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Selectively invalidate caches in pgoutput module |
Date | |
Msg-id | CAA4eK1LHoHjZ1mbuf1Kya5eHZXyeO7Xi+XEiRFVf=-YMdPAY+g@mail.gmail.com Whole thread Raw |
In response to | Selectively invalidate caches in pgoutput module ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
List | pgsql-hackers |
On Mon, Mar 3, 2025 at 1:27 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Hi, this is a fork thread from [1]. I want to propose a small optimization for > logical replication system. > > Background > ========== > > When the ALTER PUBLICATION command is executed, all entries in RelationSyncCache > will be discarded anyway. This mechanism works well but is sometimes not efficient. > For example, when the ALTER PUBLICATION DROP TABLE is executed, > 1) the specific entry in RelationSyncCache will be removed, and then > 2) all entries will be discarded twice. > In (2) Why twice? Is it because we call rel_sync_cache_publication_cb() first for PUBLICATIONRELMAP and then for PUBLICATIONOID via publication_invalidation_cb()? > This happens because the pgoutput plugin registers both RelcacheCallback > (rel_sync_cache_relation_cb) and SyscacheCallback (publication_invalidation_cb, > rel_sync_cache_publication_cb). Then, when ALTER PUBLICATION ADD/SET/DROP is executed, > both the relation cache of added tables and the syscache of pg_publication_rel and > pg_publication are invalidated. > The callback for the relation cache will remove an entry from the hash table, and > syscache callbacks will look up all entries and invalidate them. However, AFAICS > does not need to invalidate all of them. > IIUC, you want to say in the above case only (1) would be sufficient, right? > I grepped source codes and found this happens since the initial version. > > Currently the effect of the behavior may not be large, > Is it possible to see the impact? I guess if there are a large number of relations (or partitions), then all will get invalidated even if one relation is added/dropped from the publication; if so, this should impact the performance. > but [1] may affect > significantly because it propagates invalidation messages to all in-progress > decoding transactions. > > Patch overview > ============ > > Based on the background, the patch avoids dropping all entries in RelationSyncCache > when ALTER PUBLICATION is executed. It removes sys cache callbacks for pg_publication_rel > and pg_publication_namespace and avoids discarding entries in sys cache for pg_publication. > > Apart from the above, this patch also ensures that relcaches of publishing tables > are invalidated when ALTER PUBLICATION is executed. ADD/SET/DROP already has this > mechanism, but ALTER PUBLICATION OWNER TO and RENAME TO do not. > Regarding RENAME TO, now we are using a common function, but it is replaced with > RenamePublication() to do invalidations. > For Rename/Owner, can we use a new invalidation instead of using relcache invalidation, as that can impact other sessions for no benefit? IIUC, changing the other properties of publication like dropping the table requires us to invalidate even the corresponding relcache entry because it contains the publication descriptor (rd_pubdesc). See RelationBuildPublicationDesc(). Also, it is better to consider doing rename/owner_change related changes in a separate patch as that will make the first patch easier to review and commit. -- With Regards, Amit Kapila.
pgsql-hackers by date: