Re: Minimal logical decoding on standbys - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Minimal logical decoding on standbys |
Date | |
Msg-id | CA+TgmoY0df9X+5ENg8P0BGj0odhM45sdQ7kB4JMo4NKaoFy-Vg@mail.gmail.com Whole thread Raw |
In response to | Re: Minimal logical decoding on standbys ("Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>) |
Responses |
Re: Minimal logical decoding on standbys
|
List | pgsql-hackers |
On Sat, Dec 10, 2022 at 3:09 AM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote: > Attaching V30, mandatory rebase due to 66dcb09246. It's a shame that this hasn't gotten more attention, because the topic is important, but I'm as guilty of being too busy to spend a lot of time on it as everyone else. Anyway, while I'm not an expert on this topic, I did spend a little time looking at it today, especially 0001. Here are a few comments: I think that it's not good for IndexIsAccessibleInLogicalDecoding and RelationIsAccessibleInLogicalDecoding to both exist. Indexes and tables are types of relations, so this invites confusion: when the object in question is an index, it would seem that either one can be applied, based on the names. I think the real problem here is that RelationIsAccessibleInLogicalDecoding is returning *the wrong answer* when the relation is a user-catalog table. It does so because it relies on RelationIsUsedAsCatalogTable, and that macro relies on checking whether the reloptions include user_catalog_table. But here we can see where past thinking of this topic has been, perhaps, a bit fuzzy. If that option were called user_catalog_relation and had to be set on both tables and indexes as appropriate, then RelationIsAccessibleInLogicalDecoding would already be doing the right thing, and consequently there would be no need to add IndexIsAccessibleInLogicalDecoding. I think we should explore the idea of making the existing macro return the correct answer rather than adding a new one. It's probably too late to redefine the semantics of user_catalog_table, although if anyone wants to argue that we could require logical decoding plugins to set this for both indexes and tables, and/or rename to say relation instead of table, and/or add a parallel reloption called user_catalog_index, then let's talk about that. Otherwise, I think we can consider adjusting the definition of RelationIsUsedAsCatalogTable. The simplest way to do that would be to make it check indisusercatalog for indexes and do what it does already for tables. Then IndexIsUserCatalog and IndexIsAccessibleInLogicalDecoding go away and RelationIsAccessibleInLogicalDecoding returns the right answer in all cases. But I also wonder if a new pg_index column is really the right approach here. One fairly obvious alternative is to try to use the user_catalog_table reloption in both places. We could try to propagate that reloption from the table to its indexes; whenever it's set or unset on the table, push that down to each index. We'd have to take care not to let the property be changed independently on indexes, though. This feels a little grotty to me, but it does have the advantage of symmetry. Another way to get symmetry is to go the other way and add a new column pg_class.relusercatalog which gets used instead of putting user_catalog_table in the reloptions, and propagated down to indexes. But I have a feeling that the reloptions code is not very well-structured to allow reloptions to be stored any place but in pg_class.reloptions, so this may be difficult to implement. Yet a third way is to have the index fetch the flag from the associated table, perhaps when the relcache entry is built. But I see no existing precedent for that in RelationInitIndexAccessInfo, which I think is where it would be if we had it -- and that makes me suspect that there might be good reasons why this isn't actually safe. So while I do not really like the approach of storing the same property in different ways for tables and for indexes, it's also not really obvious to me how to do better. Regarding the new flags that have been added to various WAL records, I am a bit curious as to whether there's some way that we can avoid the need to carry this information through the WAL, but I don't understand why we don't need that now and do need that with this patch so it's hard for me to think about that question in an intelligent way. If we do need it, I think there might be cases where we should do something smarter than just adding bool onCatalogAccessibleInLogicalDecoding to the beginning of a whole bunch of WAL structs. In most cases we try to avoid having padding bytes in the WAL struct. If we can, we try to lay out the struct to avoid padding bytes. If we can't, we put the fields requiring less alignment at the end of the struct and then have a SizeOf<whatever> macro that is defined to not include the length of any trailing padding which the compiler would insert. See, for example, SizeOfHeapDelete. This patch doesn't do any of that, and it should. It should also consider whether there's a way to avoid adding any new bytes at all, e.g. it adds onCatalogAccessibleInLogicalDecoding to xl_heap_visible, but that struct has unused bits in 'flags'. It would be very helpful if there were some place to refer to that explained the design decisions here, like why the feature we're trying to get requires this infrastructure around indexes to be added. It could be in the commit messages, an email message, a README, or whatever, but right now I don't see it anywhere in here. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: