Thread: Add lookup table for replication slot invalidation causes
Hi, Presently, replication slot invalidation causes and their text are scattered into ReplicationSlotInvalidationCause enum and a bunch of macros. This is making the code to get invalidation cause text given the cause as enum and vice-versa unreadable, longer and inextensible. The attached patch adds a lookup table for all invalidation causes for better readability and extensibility. FWIW, another patch in discussion https://www.postgresql.org/message-id/CALj2ACWgACB4opnbqi=x7Hc4aqcgkXoLsh1VB+gfidXaDQNu_Q@mail.gmail.com adds a couple of other invalidation reasons, this lookup table makes the life easier and code shorter. Thoughts? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Tue, 20 Feb 2024 at 12:11, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > Thoughts? Seems like a good improvement overall. But I'd prefer the definition of the lookup table to use this syntax: const char *const SlotInvalidationCauses[] = { [RS_INVAL_NONE] = "none", [RS_INVAL_WAL_REMOVED] = "wal_removed", [RS_INVAL_HORIZON] = "rows_removed", [RS_INVAL_WAL_LEVEL] = "wal_level_sufficient", }; Regarding the actual patch: - Assert(conflict_reason); Probably we should keep this Assert. As well as the Assert(0) + for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++) Strictly speaking this is a slight change in behaviour, since now "none" is also parsed. That seems fine to me though.
On Tue, Feb 20, 2024 at 05:53:03PM +0100, Jelte Fennema-Nio wrote: > Seems like a good improvement overall. But I'd prefer the definition > of the lookup table to use this syntax: > > const char *const SlotInvalidationCauses[] = { > [RS_INVAL_NONE] = "none", > [RS_INVAL_WAL_REMOVED] = "wal_removed", > [RS_INVAL_HORIZON] = "rows_removed", > [RS_INVAL_WAL_LEVEL] = "wal_level_sufficient", > }; +1. > Regarding the actual patch: > > - Assert(conflict_reason); > > Probably we should keep this Assert. As well as the Assert(0) The assert(0) at the end of the routine, likely so. I don't see a huge point for the assert on conflict_reason as we'd crash anyway on strcmp, no? > + for (cause = RS_INVAL_NONE; cause <= RS_INVAL_MAX_CAUSES; cause++) > > Strictly speaking this is a slight change in behaviour, since now > "none" is also parsed. That seems fine to me though. Yep. This does not strike me as an issue. We only use GetSlotInvalidationCause() in synchronize_slots(), mapping to NULL in the case of "none". Agreed that this is an improvement. +/* Maximum number of invalidation causes */ +#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL There is no need to add that to slot.h: it is only used in slot.c. -- Michael
Attachment
On Wed, Feb 21, 2024 at 5:04 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Feb 20, 2024 at 05:53:03PM +0100, Jelte Fennema-Nio wrote: > > Seems like a good improvement overall. But I'd prefer the definition > > of the lookup table to use this syntax: > > > > const char *const SlotInvalidationCauses[] = { > > [RS_INVAL_NONE] = "none", > > [RS_INVAL_WAL_REMOVED] = "wal_removed", > > [RS_INVAL_HORIZON] = "rows_removed", > > [RS_INVAL_WAL_LEVEL] = "wal_level_sufficient", > > }; > > +1. Done that way. I'm fine with the designated initialization [1] that an ISO C99 compliant compiler offers. PostgreSQL installation guide https://www.postgresql.org/docs/current/install-requirements.html says that we need an at least C99-compliant ISO/ANSI C compiler. [1] https://open-std.org/JTC1/SC22/WG14/www/docs/n494.pdf https://en.cppreference.com/w/c/99 https://www.ibm.com/docs/en/zos/2.4.0?topic=initializers-designated-aggregate-types-c-only > > Regarding the actual patch: > > > > - Assert(conflict_reason); > > > > Probably we should keep this Assert. As well as the Assert(0) > > The assert(0) at the end of the routine, likely so. I don't see a > huge point for the assert on conflict_reason as we'd crash anyway on > strcmp, no? Right, but an assertion isn't a bad idea there as it can generate a backtrace as opposed to the crash generating just SEGV note (and perhaps a crash dump) in server logs. With these two asserts, the behavior (asserts on null and non-existent inputs) is the same as what GetSlotInvalidationCause has right now. > +/* Maximum number of invalidation causes */ > +#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL > > There is no need to add that to slot.h: it is only used in slot.c. Right, but it needs to be updated whenever a new cause is added to enum ReplicationSlotInvalidationCause. Therefore, I think it's better to be closer there in slot.h. Please see the attached v2 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, Feb 21, 2024 at 09:49:37AM +0530, Bharath Rupireddy wrote: > On Wed, Feb 21, 2024 at 5:04 AM Michael Paquier <michael@paquier.xyz> wrote: > Done that way. I'm fine with the designated initialization [1] that an > ISO C99 compliant compiler offers. PostgreSQL installation guide > https://www.postgresql.org/docs/current/install-requirements.html says > that we need an at least C99-compliant ISO/ANSI C compiler. Note the recent commit 74a730631065 where Alvaro has changed for the lwlock tranche names. That's quite elegant. > Right, but an assertion isn't a bad idea there as it can generate a > backtrace as opposed to the crash generating just SEGV note (and > perhaps a crash dump) in server logs. > > With these two asserts, the behavior (asserts on null and non-existent > inputs) is the same as what GetSlotInvalidationCause has right now. Well, I won't fight you over that. >> +/* Maximum number of invalidation causes */ >> +#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL >>>> There is no need to add that to slot.h: it is only used in slot.c. > > Right, but it needs to be updated whenever a new cause is added to > enum ReplicationSlotInvalidationCause. Therefore, I think it's better > to be closer there in slot.h. A new cause would require an update of SlotInvalidationCause, so if you keep RS_INVAL_MAX_CAUSES close to it that's impossible to miss. IMO, it makes just more sense to keep that in slot.c because of the static assert as well. + * If you add a new invalidation cause here, remember to add its name in + * SlotInvalidationCauses in the same order as that of the cause. The order does not matter with the way v2 does things with SlotInvalidationCauses[], no? -- Michael
Attachment
On Wed, Feb 21, 2024 at 11:56 AM Michael Paquier <michael@paquier.xyz> wrote: > > Note the recent commit 74a730631065 where Alvaro has changed for the > lwlock tranche names. That's quite elegant. Yes, that's absolutely neat. FWIW, designated initializer syntax can be used in a few more places though. I'm not sure how much worth it will be but I'll see if I can quickly put up a patch for it. > > With these two asserts, the behavior (asserts on null and non-existent > > inputs) is the same as what GetSlotInvalidationCause has right now. > > Well, I won't fight you over that. Haha :) > A new cause would require an update of SlotInvalidationCause, so if > you keep RS_INVAL_MAX_CAUSES close to it that's impossible to miss. > IMO, it makes just more sense to keep that in slot.c because of the > static assert as well. Hm, okay. Moved that to slot.c but left a note in the comment atop enum to update it. > + * If you add a new invalidation cause here, remember to add its name in > + * SlotInvalidationCauses in the same order as that of the cause. > > The order does not matter with the way v2 does things with > SlotInvalidationCauses[], no? Ugh. Corrected that now. Please see the attached v3 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, Feb 21, 2024 at 12:50:00PM +0530, Bharath Rupireddy wrote: > Please see the attached v3 patch. Seems globally OK, so applied. I've simplified a bit the comments, painted some extra const, and kept variable name as conflict_reason as the other routines of slot.h use "name" already to refer to the slot names, and that was a bit confusing IMO. -- Michael
Attachment
Hi, Sorry for the late comment but isn't the pushed logic now different to what it was there before? IIUC previously (in a non-debug build) if the specified conflict_reason was not found, it returned RS_INVAL_NONE -- now it seems to return whatever enum happens to be last. How about something more like below: ---------- ReplicationSlotInvalidationCause GetSlotInvalidationCause(const char *conflict_reason) { ReplicationSlotInvalidationCause cause; bool found = false; for (cause = 0; !found && cause <= RS_INVAL_MAX_CAUSES; cause++) found = strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0; Assert(found); return found ? cause : RS_INVAL_NONE; } ---------- Kind Regards, Peter Smith. Fujitsu Australia
On Thu, Feb 22, 2024 at 5:19 PM Peter Smith <smithpb2250@gmail.com> wrote: > > Hi, Sorry for the late comment but isn't the pushed logic now > different to what it was there before? > > IIUC previously (in a non-debug build) if the specified > conflict_reason was not found, it returned RS_INVAL_NONE -- now it > seems to return whatever enum happens to be last. > > How about something more like below: > > ---------- > ReplicationSlotInvalidationCause > GetSlotInvalidationCause(const char *conflict_reason) > { > ReplicationSlotInvalidationCause cause; > bool found = false; > > for (cause = 0; !found && cause <= RS_INVAL_MAX_CAUSES; cause++) > found = strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0; > > Assert(found); > return found ? cause : RS_INVAL_NONE; > } > ---------- > Oops. Perhaps I meant more like below -- in any case, the point was the same -- to ensure RS_INVAL_NONE is what returns if something unexpected happens. ReplicationSlotInvalidationCause GetSlotInvalidationCause(const char *conflict_reason) { ReplicationSlotInvalidationCause cause; for (cause = 0; cause <= RS_INVAL_MAX_CAUSES; cause++) { if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0) return cause; } Assert(0); return RS_INVAL_NONE; } ---------- Kind Regards, Peter Smith. Fujitsu Australia
On Thu, Feb 22, 2024 at 05:30:08PM +1100, Peter Smith wrote: > Oops. Perhaps I meant more like below -- in any case, the point was > the same -- to ensure RS_INVAL_NONE is what returns if something > unexpected happens. You are right that this could be a bit confusing, even if we should never reach this state. How about avoiding to return the index of the loop as result, as of the attached? Would you find that cleaner? -- Michael
Attachment
On Thu, Feb 22, 2024 at 12:26 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Feb 22, 2024 at 05:30:08PM +1100, Peter Smith wrote: > > Oops. Perhaps I meant more like below -- in any case, the point was > > the same -- to ensure RS_INVAL_NONE is what returns if something > > unexpected happens. > > You are right that this could be a bit confusing, even if we should > never reach this state. How about avoiding to return the index of the > loop as result, as of the attached? Would you find that cleaner? Looks neat! -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Thu, Feb 22, 2024 at 5:56 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Feb 22, 2024 at 05:30:08PM +1100, Peter Smith wrote: > > Oops. Perhaps I meant more like below -- in any case, the point was > > the same -- to ensure RS_INVAL_NONE is what returns if something > > unexpected happens. > > You are right that this could be a bit confusing, even if we should > never reach this state. How about avoiding to return the index of the > loop as result, as of the attached? Would you find that cleaner? > -- Hi, yes, it should never happen, but thanks for making the changes. I would've just removed every local variable instead of adding more of them. I also felt the iteration starting from RS_INVAL_NONE instead of 0 is asserting RS_INVAL_NONE must always be the first enum and can't be rearranged. Probably it will never happen, but why require it? ------ ReplicationSlotInvalidationCause GetSlotInvalidationCause(const char *conflict_reason) { for (ReplicationSlotInvalidationCause cause = 0; cause <= RS_INVAL_MAX_CAUSES; cause++) if (strcmp(SlotInvalidationCauses[cause], conflict_reason) == 0) return cause; Assert(0); return RS_INVAL_NONE; } ------ But maybe those nits are a matter of personal choice. Your patch code addressed my main concern, so it LGTM. ---------- Kind Regards, Peter Smith. Fujitsu Australia
On Fri, Feb 23, 2024 at 09:04:04AM +1100, Peter Smith wrote: > I would've just removed every local variable instead of adding more of > them. I also felt the iteration starting from RS_INVAL_NONE instead of > 0 is asserting RS_INVAL_NONE must always be the first enum and can't > be rearranged. Probably it will never happen, but why require it? FWIW, I think that the code is OK as-is, so I'd just let it be for now. -- Michael