Re: Support EXCEPT for ALL SEQUENCES publications - Mailing list pgsql-hackers
| From | Shlok Kyal |
|---|---|
| Subject | Re: Support EXCEPT for ALL SEQUENCES publications |
| Date | |
| Msg-id | CANhcyEXiwG6HJiNp8eteOmGXnrSE+2eXRAv7Bh5i3QP7vt1nkg@mail.gmail.com Whole thread |
| In response to | Re: Support EXCEPT for ALL SEQUENCES publications (vignesh C <vignesh21@gmail.com>) |
| List | pgsql-hackers |
On Mon, 13 Apr 2026 at 16:21, vignesh C <vignesh21@gmail.com> wrote: > > On Mon, 13 Apr 2026 at 15:46, shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, Apr 13, 2026 at 9:58 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > > > Hi Shlok. > > > > > > Before this patch goes too far, I had a fundamental question. > > > > > > I understand that sequences and tables are closely related; sequences > > > are just like a single-row table, but they have a RELKIND_SEQUENCE. > > > > > > There is a lot of shared code internally, so I guess it is tempting to > > > specify both the published sequences and tables together in the System > > > Catalog 'pg_publication_rels'. > > > > > > I'm just wondering whether that is really the best way to go? > > > > > > Currently, pg_publication_rels has only tables. So they might be: > > > * only included tables -- from a publication using "FOR TABLE ..." > > > * only excluded tables -- from a publication using "FOR ALL TABLES > > > EXCEPT (TABLE ...)" > > > > > > Because included/excluded tables cannot co-exist, we can easily know > > > the type of the CREATE/ALTER PUBLICATION command and the type of > > > 'pg_publication_rels' content without digging deeper. > > > > > > ~~~ > > > > > > But introducing sequences introduces complexity. Now, AFAICT, we > > > cannot know what each row of 'pg_publication_rels' means without > > > inspecting the relation type of that row. e.g. now we have lots of > > > possible combinations like. > > > > > > pg_publication_rels has: > > > * only included tables. > > > * only excluded tables. > > > * only excluded sequences. > > > * excluded tables and excluded sequences. > > > * included tables and excluded sequences. > > > > > > Furthermore, there will be yet more combinations one day if the > > > individual "FOR SEQUENCE ..." syntax is implemented. > > > > > > pg_publication_rels has: > > > * only included sequences > > > * included sequences and included tables > > > * included sequences and excluded tables > > > > > > IIUC, it means that the SQL everywhere now requires joins and relkind > > > checks to identify the type. > > > > > > ~~ > > > > > > Furthermore, AFAICT, the 'pg_publication_rels' attributes 'prattrs' > > > and 'prquals' don't even have meaning for sequences. That's another > > > reason why it feels a bit like a square peg was jammed into a round > > > hole just because 'pg_publication_rels' was conveniently available. > > > > > > ~~ > > > > > > SUMMARY > > > > > > Given: > > > * Option 1 = use 'pg_publication_rels' for both tables and sequences. > > > * Option 2 = use 'pg_publication_rels' just for tables and use a new > > > 'pg_publication_seq' just for sequences. > > > > > > I'm not convinced that the chosen way (Option 1) is better. Can you > > > explain why it is? > > > > > > > pg_publication_seq seems like an idea worth considering. After giving > > it some thought, we already have a view named > > pg_publication_sequences, and adding another catalog with a similar > > name may introduce confusion. Apart from that, here are my thoughts: > > > > 1) > > pg_publication_rels already represents 'relations', and thus I feel > > sequences already fit naturally into this. > > > > 2) > > I agree that relkind checks may be required, but IMO these are > > typically cheap. In C code paths we already rely heavily on > > cache-lookups such as get_rel_relkind, and thus pg_class.relkind > > access is well-optimized. This is already used widely across the > > codebase, so IMO, extending it here does not introduce meaningful > > overhead. > > > > 3) > > Looking at the code paths accessing pg_publication_rels, they broadly > > fall into 4 categories: > > > > a) Publication DDL (CREATE/ALTER/DROP PUBLICATION) > > b) Tools and queries (e.g., pg_publication_tables view, pg_dump) > > c) Logical decoding (pgoutput paths like get_rel_sync_entry, > > pgoutput_column_list_init, pgoutput_row_filter_init) > > d) DML paths (INSERT/UPDATE) to check publication membership and RI > > applicability (CheckCmdReplicaIdentity) > > > > The first two are infrequent operations. For the logical decoding > > path, relation information is cached in RelationSyncCache, so relkind > > checks are not repeatedly performed in the hot path. For INSERT/UPDATE > > paths, PublicationDesc (rd_pubdesc) is built once per relation and is > > cached and reused, so pg_publication_rels is not accessed heavily here > > too. > > > > ~~ > > > > Considering all the above, I feel we should be good with Option 1. > > But, I would like to see what others think on this. > > +1 for option 1 and checking relkind for the same reason mentioned by > shveta above. It seems like a reasonable tradeoff compared to > introducing and maintaining a separate pg_publication_seq system > table. > Thanks Peter, Shveta, and Vignesh for the analysis. I agree with the points shared by Shveta and Vignesh and have verified them. Based on the reasoning, I think Option 1 is better. I will retain this design and use the pg_publication_rel to store sequences. Thanks, Shlok Kyal
pgsql-hackers by date: