Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: Adding REPACK [concurrently] |
| Date | |
| Msg-id | CAA4eK1JhuT5fyTosWDZ+Pgs+j7xEjObTyRMn80uNKgi_ivqHbw@mail.gmail.com Whole thread |
| In response to | Re: Adding REPACK [concurrently] (Andres Freund <andres@anarazel.de>) |
| Responses |
RE: Adding REPACK [concurrently]
|
| List | pgsql-hackers |
On Tue, Apr 7, 2026 at 9:18 PM Andres Freund <andres@anarazel.de> wrote: > > On 2026-04-07 17:38:24 +0200, Antonin Houska wrote: > > Andres Freund <andres@anarazel.de> wrote: > > > > > On 2026-04-07 14:33:50 +0200, Alvaro Herrera wrote: > > > > On 2026-Apr-07, Amit Kapila wrote: > > > > > > > > > I have a question based on 0001's commit message: "This patch adds a > > > > > new option to logical replication output plugin, to declare that it > > > > > does not use shared catalogs (i.e. catalogs that can be changed by > > > > > transactions running in other databases in the cluster).". In which > > > > > cases, currently plugin needs to access multi-database transactions or > > > > > transactions that need to access shared catalogs and on what basis a > > > > > plugin can decide that the changes it requires won't need any such > > > > > access. > > > > > > > > I don't think any plugin needs "multi-database" access as such, but > > > > needing access to shared catalogs is likely normal. Repack knows it > > > > won't access any shared catalogs, so it can set the flag at ease. > > > > > > > > There's a cross-check added in the commit that tests for access to > > > > shared catalogs if the flag is set to false. I guess you could set it > > > > to false and see what breaks :-) > > > > > > I think this has a quite high chance of indirect breakages. You just need some > > > cache invalidation processing / building accessing shared catalogs to violate > > > the rule, and whether that happens very heavily depends on what cache entries > > > are present and whether something registers relcache callbacks or such. > > > > > > This can be triggered by an output function during logical decoding or such, > > > so you don't really have control over it. > > > > The REPACK plugin only deforms tuples and writes them to a file, so I think > > that things like this should not happen. > > You don't need to do it yourself. It just requires a shared_preload_library > extension to register a relcache invalidation callback that accesses shared > catalog. > > It's only kind of an accident that we don't have a case today that accesses > shared catcaches during a relcache build in core (I'm not even sure there's > nothing). You'd just need somebody to add e.g. relcache caching for > publications for that to change. Or look up information about a reloption in > pg_parameter_acl. Or lookup tablespace configuration. > > > > However, I admit that an option that allows the plugin developer to declare > > "I don't need shared catalogs" may be considered deceptive. > > At the very least it would need to be a runtime check rather than just an > assert. This would much more likely to be hit in production because otherwise > it's probably hard to hit the case where shared invalidations happen in the > wrong moment. And the consequences are corrupted caches, which could cause > all kinds of havoc. > > > But I think this may need more infrastructure / deeper analysis than what we > can do right now. > We still have not decided on this point. The code corresponding to this part [1] has an open bug as well [2]. Based on the discussion here, I don't see we have a consensus with the current approach and even if we want to go with the current approach it seems that we need more work which needs further analysis. Alvaro, others, what is your take on this? [1]: commit 0d3dba38c777384a9dd7dffe924355c9683a6b71: Allow logical replication snapshots to be database-specific [2]: https://www.postgresql.org/message-id/CAHg%2BQDcQak4jx_6X2_Ws98rzG%3DxBARLjqm_%3D56wTRUtNsY4DZQ%40mail.gmail.com -- With Regards, Amit Kapila.
pgsql-hackers by date: