Thread: Extensible Rmgr for Table AMs
Motivation: I'm working on a columnar compression AM[0]. Currently, it uses generic xlog, which works for crash recovery and physical replication, but not logical decoding/replication. Extensible rmgr would enable the table AM to support its own redo/decode hooks and WAL format, so that it could support crash recovery, physical replication, and logical replication. Background: I submitted another patch[0] to add new logical records, which could be used to support logical decoding directly, without the need for extensible rmgr and without any assumptions about the table AM. This was designed to be easy to use, but inefficient. Amit raised concerns[1] about whether it could meet the needs of zheap. Andres suggested (off-list) that it would be better to just tackle the extensible rmgr problem. The idea for extensible rmgr has been proposed before[3]. The biggest argument against it seemed to be that there was no complete use case[4], so the worry was that something would be left out. Columnar is complete enough that I think it qualifies as a good use case. A subsequent proposal[5] was shot down because of a (potential?) need for catalog access[6]. The attached patch does not use the catalog; instead, it relies on table AM authors choosing IDs that don't conflict with each other. This seems like a reasonable answer, considering that there will likely be very few table AMs that go far enough to fully support WAL including decoding. Are there any other major arguments/objections that I missed? Proposal: The attached patch (against v14, so it's easier to test columnar) is somewhat like a simplified version of [3] combined with refactoring to make decoding a part of the rmgr. * adds a new RmgrData method rm_decode * refactors decode.c to use the new method * add a layer of indirection GetRmgr to find an rmgr * fast path to find builtin rmgr in RmgrTable * to find a custom rmgr, traverses list of custom rmgrs that are currently loaded (unlikely to ever be more than a few) * rmgr IDs from 0-127 are reserved for builtin rmgrs * rmgr IDs from 128-255 are reserved for custom rmgrs * table AM authors need to avoid collisions between rmgr IDs I have tested with columnar using a simple WAL format for logical decoding only, and I'm still using generic xlog for recovery and physical replication. I haven't tested the redo path, or how easy it might be to do something like generic xlog. Questions: 0. Do we want to go this route, or something simpler like my other proposal, which introduces new logical record types[0]? 1. I am allocating the custom rmgr list in TopMemoryContext, and it only works when loading as a part of shared_preload_libraries. This avoids the need for shared memory in Simon's patch[3]. Is that the right thing to do? 2. If we go this route, what do we do with generic xlog? It seems like a half feature, since it doesn't work with logical decoding. 3. If the custom rmgr throws an error during redo, the server won't start. Should we have a GUC to turn non-builtin redo into a no-op to reduce the impact of bugs in the implementation of a custom rmgr? 4. Do we want to encourage index AMs to use this mechanism as well? I didn't really look into how suitable it is, but at a high level it seems reasonable. Regards, Jeff Davis [0] https://postgr.es/m/20ee0b0ae6958804a88fe9580157587720faf664.camel@j-davis.com [1] https://postgr.es/m/CAA4eK1JVDnbQ80ULdZuhzQkzr_yYhfON-tg%3Dd1U7aWjK_R1ixQ%40mail.gmail.com [2] https://github.com/citusdata/citus/tree/master/src/backend/columnar [3] https://postgr.es/m/1229541840.4793.79.camel%40ebony.2ndQuadrant [4] https://postgr.es/m/20992.1232667957%40sss.pgh.pa.us [5] https://postgr.es/m/1266774840.7341.29872.camel%40ebony [6] https://postgr.es/m/26134.1266776040%40sss.pgh.pa.us
Attachment
On Mon, 2021-11-08 at 15:36 -0800, Jeff Davis wrote: > The attached patch (against v14, so it's easier to test columnar) is > somewhat like a simplified version of [3] combined with refactoring > to > make decoding a part of the rmgr. I created a wiki page here: https://wiki.postgresql.org/wiki/ExtensibleRmgr To coordinate reservation of RmgrIds, to avoid conflicts. I don't expect it to be a practical problem given how much work it takes to create a new table AM that needs full WAL support, but might as well have some transparency on how to choose a new RmgrId. I also updated the patch to point to the wiki page in the comments, and added in a new RM_EXPERIMENTAL_ID that can be used while an extension is still in development. Hopefully this will prevent people reserving lots of RmgrIds for extensions that never get released. Regards, Jeff Davis
Attachment
On Mon, Nov 8, 2021 at 6:36 PM Jeff Davis <pgsql@j-davis.com> wrote: > Extensible rmgr would enable the table AM to support its own > redo/decode hooks and WAL format, so that it could support crash > recovery, physical replication, and logical replication. Without taking a position on your implementation, which I have not studied, I like this idea in concept and I think it's an important goal. > Are there any other major arguments/objections that I missed? ISTR some discussion of the fact that uninstalling the extension that uses this facility, or failing to install it on your standby, will lead to an unusable database. Personally, I don't see that as a big problem: we should just document that if you choose to use an extension like this, then (1) it needs to be installed on all standbys and (2) if you ever want to get rid of it, you need to stop using it, drop all the objects created with it, and then wait until all the WAL previously generated by that extension is gone not only from pg_wal but from any archived WAL files or backups that you intend to use with that cluster before you actually nuke it. Users who don't want to abide by those restrictions need not install such extensions. Users who don't read the documentation might end up sad, but it doesn't seem particularly likely, and it's hardly the only part of the documentation that users shouldn't ignore. -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, 2021-11-08 at 15:36 -0800, Jeff Davis wrote: > The attached patch (against v14, so it's easier to test columnar) is > somewhat like a simplified version of [3] combined with refactoring > to > make decoding a part of the rmgr. New patches attached (v3). Essentially the content as v2, but split into 3 patches and rebased. Review on patch 0001 would be helpful. It makes decoding just another method of rmgr, which makes a lot of sense to me from a code organization standpoint regardless of the other patches. Is there any reason not to do that? The other patches then make rmgr extensible, which in turn makes decoding extensible and solves the logical replication problem for custom table AMs. The most obvious way to make rmgr extensible would be to expand the rmgr table, but I was concerned about making that dynamic (right now the structure is entirely constant and perhaps that's important for some optimizations?). So, at the cost of complexity I made a separate, dynamic rmgr table to hold the custom entries. The custom rmgr API would probably be marked "experimental" for a while, and I don't expect lots of people to use it given the challenges of a production-quality table AM. But it's still important, because without it a table AM has no chance to participate in logical replication. Regards, Jeff Davis
Attachment
Hi, On Mon, Jan 17, 2022 at 12:42:06AM -0800, Jeff Davis wrote: > On Mon, 2021-11-08 at 15:36 -0800, Jeff Davis wrote: > > The attached patch (against v14, so it's easier to test columnar) is > > somewhat like a simplified version of [3] combined with refactoring > > to > > make decoding a part of the rmgr. > > New patches attached (v3). Essentially the content as v2, but split > into 3 patches and rebased. > > Review on patch 0001 would be helpful. It makes decoding just another > method of rmgr, which makes a lot of sense to me from a code > organization standpoint regardless of the other patches. Is there any > reason not to do that? I think that it's a clean and sensible approach, so +1 for me. There's a bit of 0003 present in 002: diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 35029cf97d..612b1b3723 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -738,7 +738,7 @@ ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr, (uint32) SizeOfXLogRecord, record->xl_tot_len); return false; } - if (record->xl_rmid > RM_MAX_ID) + if (record->xl_rmid > RM_MAX_ID && record->xl_rmid < RM_CUSTOM_MIN_ID) { > > The other patches then make rmgr extensible, which in turn makes > decoding extensible and solves the logical replication problem for > custom table AMs. The most obvious way to make rmgr extensible would be > to expand the rmgr table, but I was concerned about making that dynamic > (right now the structure is entirely constant and perhaps that's > important for some optimizations?). So, at the cost of complexity I > made a separate, dynamic rmgr table to hold the custom entries. I'm not sure about performance overhead, but allowing extension to access the main table directly seems a bit scary. We definitely accepted some overhead with the various extensible support, and this new GetRmgr() doesn't look like it's going to cost a lot for the builtin case, especially compared to the cost of any of the code associated with the rmgr. Also, to answer your initial email I think it's a better way to go compared to your previous approach, given the limitations and performance of the generic xlog infrastructure, and hopefully index AMs should be able to rely on this too. A few random comments on 0003: #define RM_MAX_ID (RM_NEXT_ID - 1) +#define RM_CUSTOM_MIN_ID 128 +#define RM_CUSTOM_MAX_ID UINT8_MAX It would be a good idea to add a StaticAssertStmt here to make sure that there's no overlap in the ranges. + +/* + * RmgrId to use for extensions that require an RmgrId, but are still in + * development and have not reserved their own unique RmgrId yet. See: + * https://wiki.postgresql.org/wiki/ExtensibleRmgr + */ +#define RM_EXPERIMENTAL_ID 128 I'm a bit dubious about the whole "register your ID in the Wiki" approach. It might not be a problem since there probably won't be hundreds of users, and I don't have any better suggestion since it has to be consistent across nodes. + elog(LOG, "registering customer rmgr \"%s\" with ID %d", + rmgr->rm_name, rmid); Should it be a DEBUG message instead? Also s/customer/custom/ +RmgrData +GetCustomRmgr(RmgrId rmid) +{ + if (rmid < RM_CUSTOM_MIN_ID || rmid > RM_CUSTOM_MAX_ID) + elog(PANIC, "custom rmgr id %d out of range", rmid); Should this be an assert? +#define GetBuiltinRmgr(rmid) RmgrTable[(rmid)] +#define GetRmgr(rmid) ((rmid < RM_CUSTOM_MIN_ID) ? \ + GetBuiltinRmgr(rmid) : GetCustomRmgr(rmid)) rmid should be protected in the macro. > The custom rmgr API would probably be marked "experimental" for a > while, and I don't expect lots of people to use it given the challenges > of a production-quality table AM. But it's still important, because > without it a table AM has no chance to participate in logical > replication. How you plan to mark it experimental?
On Tue, 2022-01-18 at 17:53 +0800, Julien Rouhaud wrote: > > Review on patch 0001 would be helpful. It makes decoding just > > another > > method of rmgr, which makes a lot of sense to me from a code > > organization standpoint regardless of the other patches. Is there > > any > > reason not to do that? > > I think that it's a clean and sensible approach, so +1 for me. Thank you, committed 0001. Other patches not committed yet. > There's a bit of 0003 present in 002: I just squashed 0002 and 0003 together. Not large enough to keep separate. > A few random comments on 0003: > > #define RM_MAX_ID (RM_NEXT_ID - 1) > +#define RM_CUSTOM_MIN_ID 128 > +#define RM_CUSTOM_MAX_ID UINT8_MAX > > It would be a good idea to add a StaticAssertStmt here to make sure > that > there's no overlap in the ranges. Done. > + > +/* > + * RmgrId to use for extensions that require an RmgrId, but are > still in > + * development and have not reserved their own unique RmgrId yet. > See: > + * https://wiki.postgresql.org/wiki/ExtensibleRmgr > + */ > +#define RM_EXPERIMENTAL_ID 128 > > I'm a bit dubious about the whole "register your ID in the Wiki" > approach. It > might not be a problem since there probably won't be hundreds of > users, and I > don't have any better suggestion since it has to be consistent across > nodes. Agree, but I don't see a better approach, either. I do some sanity checking, which should catch collisions when they happen. > + elog(LOG, "registering customer rmgr \"%s\" with ID %d", > + rmgr->rm_name, rmid); > > Should it be a DEBUG message instead? Also s/customer/custom/ It seems like a fairly important thing to have in the log. Only a couple extensions will ever encounter this message, and only at server start. Typo fixed. > +RmgrData > +GetCustomRmgr(RmgrId rmid) > +{ > + if (rmid < RM_CUSTOM_MIN_ID || rmid > RM_CUSTOM_MAX_ID) > + elog(PANIC, "custom rmgr id %d out of range", rmid); > > Should this be an assert? If we make it an Assert, then it won't be caught in production builds. > +#define GetBuiltinRmgr(rmid) RmgrTable[(rmid)] > +#define GetRmgr(rmid) ((rmid < RM_CUSTOM_MIN_ID) ? \ > + GetBuiltinRmgr(rmid) : GetCustomRmgr(rmid)) > > rmid should be protected in the macro. Done. > How you plan to mark it experimental? I suppose it doesn't need to be marked explicitly -- there are other APIs that change. For instance, the ProcessUtility_hook changed, and that's used much more widely. As long as we generally agree that some kind of custom rmgrs are the way to go, if we change the API or implementation around from version to version I can easily update my table access method. Regards, Jeff Davis
Attachment
Hi, On Mon, Jan 31, 2022 at 06:36:59PM -0800, Jeff Davis wrote: > On Tue, 2022-01-18 at 17:53 +0800, Julien Rouhaud wrote: > > > There's a bit of 0003 present in 002: > > I just squashed 0002 and 0003 together. Not large enough to keep > separate. Agreed. > > + elog(LOG, "registering customer rmgr \"%s\" with ID %d", > > + rmgr->rm_name, rmid); > > > > Should it be a DEBUG message instead? Also s/customer/custom/ > > It seems like a fairly important thing to have in the log. Only a > couple extensions will ever encounter this message, and only at server > start. Ok. > > +RmgrData > > +GetCustomRmgr(RmgrId rmid) > > +{ > > + if (rmid < RM_CUSTOM_MIN_ID || rmid > RM_CUSTOM_MAX_ID) > > + elog(PANIC, "custom rmgr id %d out of range", rmid); > > > > Should this be an assert? > > If we make it an Assert, then it won't be caught in production builds. Sure, but it seems something so fundamental that it should get done right during the development phase rather than spending cycles in optimized builds to check for it. But even if that happened it would get caught by the final "not found" PANIC anyway, and for end users I don't think that getting this error instead would make much difference. > > How you plan to mark it experimental? > > I suppose it doesn't need to be marked explicitly -- there are other > APIs that change. For instance, the ProcessUtility_hook changed, and > that's used much more widely. > As long as we generally agree that some kind of custom rmgrs are the > way to go, if we change the API or implementation around from version > to version I can easily update my table access method. Agreed, API breaks happen often and that's not really a problem for third-party extensions, especially if that comes with hard compile failure. Other than that the patch looks good to me, as you said we just need a decision on whether custom rmgrs are wanted or not.
On Tue, Feb 01, 2022 at 12:39:38PM +0800, Julien Rouhaud wrote: > > Other than that the patch looks good to me, as you said we just need a decision > on whether custom rmgrs are wanted or not. One last thing, did you do some benchmark with a couple custom rmgr to see how much the O(n) access is showing up in profiles?
On Tue, 2022-02-01 at 20:45 +0800, Julien Rouhaud wrote: > On Tue, Feb 01, 2022 at 12:39:38PM +0800, Julien Rouhaud wrote: > > > > Other than that the patch looks good to me, as you said we just > > need a decision > > on whether custom rmgrs are wanted or not. > > One last thing, did you do some benchmark with a couple custom rmgr > to see how > much the O(n) access is showing up in profiles? What kind of a test case would be reasonable there? You mean having a lot of custom rmgrs? I was expecting that few people would have more than one custom rmgr loaded anyway, so a sparse array or hashtable seemed wasteful. If custom rmgrs become popular we probably need to have a larger ID space anyway, but it seems like overengineering to do so now. Regards, Jeff Davis
Hi, On Tue, Feb 01, 2022 at 03:38:32PM -0800, Jeff Davis wrote: > On Tue, 2022-02-01 at 20:45 +0800, Julien Rouhaud wrote: > > > > One last thing, did you do some benchmark with a couple custom rmgr > > to see how > > much the O(n) access is showing up in profiles? > > What kind of a test case would be reasonable there? You mean having a > lot of custom rmgrs? > > I was expecting that few people would have more than one custom rmgr > loaded anyway, so a sparse array or hashtable seemed wasteful. If > custom rmgrs become popular we probably need to have a larger ID space > anyway, but it seems like overengineering to do so now. I agree that having dozen of custom rmgrs doesn't seem likely, but I also have no idea of how much overhead you get by not doing a direct array access. I think it would be informative to benchmark something like simple OLTP write workload on a fast storage (or a ramdisk, or with fsync off...), with the used rmgr being the 1st and the 2nd custom rmgr. Both scenario still seems plausible and shouldn't degenerate on good hardware.
On Thu, Feb 3, 2022 at 12:34 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > I agree that having dozen of custom rmgrs doesn't seem likely, but I also have > no idea of how much overhead you get by not doing a direct array access. I > think it would be informative to benchmark something like simple OLTP write > workload on a fast storage (or a ramdisk, or with fsync off...), with the used > rmgr being the 1st and the 2nd custom rmgr. Both scenario still seems > plausible and shouldn't degenerate on good hardware. I think it would be hard to measure the overhead of this approach on a macrobenchmark. That having been said, I find this a surprising implementation choice. I think that the approaches that are most worth considering are: (1) reallocate the array if needed so that we can continue to just do RmgrTable[rmid] (2) have one array for builtins and a second array for extensions and do rmid < RM_CUSTOM_MIN_ID ? BuiltinRmgrTable[rmid] : ExtensionRmgrTable[rmid] (3) change RmgrTable to be an array of pointers to structs rather than an an array of structs. then the structs don't move around and can be const, but the pointers can be moved into a larger array if required I'm not really sure which is best. My intuition for what will be cheapest on modern hardware is pretty shaky. However, I can't see how it can be the thing the patch is doing now; a linear search seems like it has to be the slowest option. -- Robert Haas EDB: http://www.enterprisedb.com
Hi, On Fri, Feb 04, 2022 at 09:10:42AM -0500, Robert Haas wrote: > On Thu, Feb 3, 2022 at 12:34 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > > I agree that having dozen of custom rmgrs doesn't seem likely, but I also have > > no idea of how much overhead you get by not doing a direct array access. I > > think it would be informative to benchmark something like simple OLTP write > > workload on a fast storage (or a ramdisk, or with fsync off...), with the used > > rmgr being the 1st and the 2nd custom rmgr. Both scenario still seems > > plausible and shouldn't degenerate on good hardware. > > I think it would be hard to measure the overhead of this approach on a > macrobenchmark. Yeah that's also my initial thought, but I wouldn't be terribly surprised to be wrong. > That having been said, I find this a surprising > implementation choice. I think that the approaches that are most worth > considering are: > > (1) reallocate the array if needed so that we can continue to just do > RmgrTable[rmid] > (2) have one array for builtins and a second array for extensions and > do rmid < RM_CUSTOM_MIN_ID ? BuiltinRmgrTable[rmid] : > ExtensionRmgrTable[rmid] > (3) change RmgrTable to be an array of pointers to structs rather than > an an array of structs. then the structs don't move around and can be > const, but the pointers can be moved into a larger array if required > > I'm not really sure which is best. My intuition for what will be > cheapest on modern hardware is pretty shaky. However, I can't see how > it can be the thing the patch is doing now; a linear search seems like > it has to be the slowest option. I guess the idea was to have a compromise between letting rmgr authors choose arbitrary ids to avoid any conflicts, especially with private implementations, without wasting too much memory. But those approaches would be pretty much incompatible with the current definition: +#define RM_CUSTOM_MIN_ID 128 +#define RM_CUSTOM_MAX_ID UINT8_MAX even if you only allocate up to the max id found, nothing guarantees that you won't get a quite high id.
On Fri, Feb 4, 2022 at 9:48 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > I guess the idea was to have a compromise between letting rmgr authors choose > arbitrary ids to avoid any conflicts, especially with private implementations, > without wasting too much memory. But those approaches would be pretty much > incompatible with the current definition: > > +#define RM_CUSTOM_MIN_ID 128 > +#define RM_CUSTOM_MAX_ID UINT8_MAX > > even if you only allocate up to the max id found, nothing guarantees that you > won't get a quite high id. Right, which I guess raises another question: if the maximum is UINT8_MAX, which BTW I find perfectly reasonable, why are we not just defining this as an array of size 256? There's no point in adding code complexity to save a few kB of memory. -- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Feb 04, 2022 at 09:53:09AM -0500, Robert Haas wrote: > On Fri, Feb 4, 2022 at 9:48 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > > I guess the idea was to have a compromise between letting rmgr authors choose > > arbitrary ids to avoid any conflicts, especially with private implementations, > > without wasting too much memory. But those approaches would be pretty much > > incompatible with the current definition: > > > > +#define RM_CUSTOM_MIN_ID 128 > > +#define RM_CUSTOM_MAX_ID UINT8_MAX > > > > even if you only allocate up to the max id found, nothing guarantees that you > > won't get a quite high id. > > Right, which I guess raises another question: if the maximum is > UINT8_MAX, which BTW I find perfectly reasonable, why are we not just > defining this as an array of size 256? There's no point in adding code > complexity to save a few kB of memory. Agreed, especially if combined with your suggested approach 3 (array of pointers).
On Fri, 2022-02-04 at 22:56 +0800, Julien Rouhaud wrote: > > Right, which I guess raises another question: if the maximum is > > UINT8_MAX, which BTW I find perfectly reasonable, why are we not > > just > > defining this as an array of size 256? There's no point in adding > > code > > complexity to save a few kB of memory. > > Agreed, especially if combined with your suggested approach 3 (array > of > pointers). Implemented and attached. I also updated pg_waldump and pg_rewind to do something reasonable. Additionally, I now have a reasonably complete implementation of a custom resource manager now: https://github.com/citusdata/citus/tree/custom-rmgr-15 (Not committed or intended to actually be used right now -- just a POC.) Offline, Andres mentioned that I should test recovery performance if we take your approach, because making the RmgrTable non-const could impact optimizations. Not sure if that would be a practical concern compared to all the other work done at REDO time. Regards, Jeff Davis
Attachment
Hi, On 2022-03-23 21:43:08 -0700, Jeff Davis wrote: > /* must be kept in sync with RmgrData definition in xlog_internal.h */ > #define PG_RMGR(symname,name,redo,desc,identify,startup,cleanup,mask,decode) \ > - { name, redo, desc, identify, startup, cleanup, mask, decode }, > + &(struct RmgrData){ name, redo, desc, identify, startup, cleanup, mask, decode }, > > -const RmgrData RmgrTable[RM_MAX_ID + 1] = { > +static RmgrData *RmgrTable[RM_MAX_ID + 1] = { > #include "access/rmgrlist.h" > }; I think this has been discussed before, but to me it's not obvious that it's a good idea to change RmgrTable from RmgrData to RmgrData *. That adds an indirection, without obvious benefit. > + > +/* > + * Start up all resource managers. > + */ > +void > +StartupResourceManagers() (void) > +void > +RegisterCustomRmgr(RmgrId rmid, RmgrData *rmgr) > +{ > + if (rmid < RM_MIN_CUSTOM_ID) > + ereport(PANIC, errmsg("custom rmgr id %d is out of range", rmid)); > + > + if (!process_shared_preload_libraries_in_progress) > + ereport(ERROR, > + (errmsg("custom rmgr must be registered while initializing modules in shared_preload_libraries"))); > + > + ereport(LOG, > + (errmsg("registering custom rmgr \"%s\" with ID %d", > + rmgr->rm_name, rmid))); > + > + if (RmgrTable[rmid] != NULL) > + ereport(PANIC, > + (errmsg("custom rmgr ID %d already registered with name \"%s\"", > + rmid, RmgrTable[rmid]->rm_name))); > + > + /* check for existing rmgr with the same name */ > + for (int i = 0; i <= RM_MAX_ID; i++) > + { > + const RmgrData *existing_rmgr = RmgrTable[i]; > + > + if (existing_rmgr == NULL) > + continue; > + > + if (!strcmp(existing_rmgr->rm_name, rmgr->rm_name)) > + ereport(PANIC, > + (errmsg("custom rmgr \"%s\" has the same name as builtin rmgr", > + existing_rmgr->rm_name))); > + } > + > + /* register it */ > + RmgrTable[rmid] = rmgr; > +} Random idea: Might be worth emitting the id->name mapping just after a redo location is determined, to make it easier to debug things. > +RmgrData * > +GetRmgr(RmgrId rmid) > +{ > + return RmgrTable[rmid]; > +} Given this is so simple, why incur the cost of a function call? Rather than continuing to expose RmgrTable and move GetRmgr() into the header, as a static inline? As-is this also prevent the compiler from optimizing across repeated GetRmgr() calls (which often won't be possible anyway, but still).. > - if (record->xl_rmid > RM_MAX_ID) > + if (record->xl_rmid > RM_MAX_BUILTIN_ID && record->xl_rmid < RM_MIN_CUSTOM_ID) > { > report_invalid_record(state, > "invalid resource manager ID %u at %X/%X", Shouldn't this continue to enforce RM_MAX_ID as well? > @@ -1604,12 +1603,7 @@ PerformWalRecovery(void) > > InRedo = true; > > - /* Initialize resource managers */ > - for (rmid = 0; rmid <= RM_MAX_ID; rmid++) > - { > - if (RmgrTable[rmid].rm_startup != NULL) > - RmgrTable[rmid].rm_startup(); > - } > + StartupResourceManagers(); Personally I'd rather name it ResourceManagersStartup() or RmgrStartup(). > @@ -1871,7 +1860,7 @@ ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *repl > xlogrecovery_redo(xlogreader, *replayTLI); > > /* Now apply the WAL record itself */ > - RmgrTable[record->xl_rmid].rm_redo(xlogreader); > + GetRmgr(record->xl_rmid)->rm_redo(xlogreader); > > /* > * After redo, check whether the backup pages associated with the WAL So we have just added one indirect call and one pointer indirection (previously RmgrTable could be resolved by the linker, now it needs to be dereferenced again), that's not too bad. I was afraid there'd be multiple calls. > @@ -2101,16 +2090,16 @@ xlog_outdesc(StringInfo buf, XLogReaderState *record) > uint8 info = XLogRecGetInfo(record); > const char *id; > > - appendStringInfoString(buf, RmgrTable[rmid].rm_name); > + appendStringInfoString(buf, GetRmgr(rmid)->rm_name); > appendStringInfoChar(buf, '/'); > > - id = RmgrTable[rmid].rm_identify(info); > + id = GetRmgr(rmid)->rm_identify(info); > if (id == NULL) > appendStringInfo(buf, "UNKNOWN (%X): ", info & ~XLR_INFO_MASK); > else > appendStringInfo(buf, "%s: ", id); > > - RmgrTable[rmid].rm_desc(buf, record); > + GetRmgr(rmid)->rm_desc(buf, record); > } Like here. It's obviously not as performance critical as replay. But it's still a shame to add 3 calls to GetRmgr, that each then need to dereference RmgrTable. The compiler won't be able to optimize any of that away. > @@ -117,8 +117,8 @@ LogicalDecodingProcessRecord(LogicalDecodingContext *ctx, XLogReaderState *recor > > rmid = XLogRecGetRmid(record); > > - if (RmgrTable[rmid].rm_decode != NULL) > - RmgrTable[rmid].rm_decode(ctx, &buf); > + if (GetRmgr(rmid)->rm_decode != NULL) > + GetRmgr(rmid)->rm_decode(ctx, &buf); > else > { > /* just deal with xid, and done */ This one might actually matter, overhead wise. > @@ -473,7 +473,7 @@ static void > XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record) > { > const char *id; > - const RmgrDescData *desc = &RmgrDescTable[XLogRecGetRmid(record)]; > + const RmgrDescData *desc = GetRmgrDesc(XLogRecGetRmid(record)); > uint32 rec_len; > uint32 fpi_len; > RelFileNode rnode; > @@ -658,7 +658,7 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats) > * calculate column totals. > */ > > - for (ri = 0; ri < RM_NEXT_ID; ri++) > + for (ri = 0; ri < RM_MAX_ID; ri++) > { > total_count += stats->rmgr_stats[ri].count; > total_rec_len += stats->rmgr_stats[ri].rec_len; > @@ -694,6 +694,9 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats) > fpi_len = stats->rmgr_stats[ri].fpi_len; > tot_len = rec_len + fpi_len; > > + if (ri > RM_MAX_BUILTIN_ID && count == 0) > + continue; > + Ah, I see. I had written a concerned comment about the previous hunk... > XLogDumpStatsRow(desc->rm_name, > count, total_count, rec_len, total_rec_len, > fpi_len, total_fpi_len, tot_len, total_len); > @@ -913,16 +916,16 @@ main(int argc, char **argv) > exit(EXIT_SUCCESS); > } > > - for (i = 0; i <= RM_MAX_ID; i++) > + for (i = 0; i <= RM_MAX_BUILTIN_ID; i++) > { > - if (pg_strcasecmp(optarg, RmgrDescTable[i].rm_name) == 0) > + if (pg_strcasecmp(optarg, GetRmgrDesc(i)->rm_name) == 0) > { > config.filter_by_rmgr[i] = true; > config.filter_by_rmgr_enabled = true; > break; > } > } > - if (i > RM_MAX_ID) > + if (i > RM_MAX_BUILTIN_ID) > { > pg_log_error("resource manager \"%s\" does not exist", > optarg); So we can't filter by rmgr id for non-builtin rmgr's? That sucks. Maybe add a custom:<i> syntax? Or generally allow numerical identifiers in addition to the names? Greetings, Andres Freund
On Mon, 2022-03-28 at 11:00 -0700, Andres Freund wrote: > I think this has been discussed before, but to me it's not obvious > that it's a > good idea to change RmgrTable from RmgrData to RmgrData *. That adds > an > indirection, without obvious benefit. I did some performance tests. I created a narrow table, took a base backup, loaded 100M records, finished the base backup. Then I recovered using the different build combinations (patched/unpatched, clang/gcc). compiler run1 run2 unpatched: gcc 102s 106s patched: gcc 107s 105s unpatched: clang 109s 110s patched: clang 109s 111s I don't see a clear signal that this patch worsens performance. The 102s run was the very first run, so I suspect it was just due to the processor starting out cold. Let me know if you think the test is testing the right paths. Perhaps I should address your other perf concerns around GetRmgr (make it static inline, reduce number of calls), and then leave the indirection for the sake of cleanliness? If you are still concerned, I can switch back to separate tables to eliminate the indirection for built-in rmgrs. Separate rmgr tables still require a branch (to decide which table to access), but it should be a highly predictable one. Regards, Jeff Davis
On Mon, 2022-03-28 at 11:00 -0700, Andres Freund wrote: > > +/* > > + * Start up all resource managers. > > + */ > > +void > > +StartupResourceManagers() > > (void) Fixed. > Random idea: Might be worth emitting the id->name mapping just after > a redo > location is determined, to make it easier to debug things. Not quite sure I understood this idea, do you mean dump all rmgrs and the IDs when performing recovery? > > +RmgrData * > > +GetRmgr(RmgrId rmid) > > +{ > > + return RmgrTable[rmid]; > > +} > > Given this is so simple, why incur the cost of a function call? > Rather than > continuing to expose RmgrTable and move GetRmgr() into the header, as > a static > inline? Made it static inline. > Shouldn't this continue to enforce RM_MAX_ID as well? Done. > Personally I'd rather name it ResourceManagersStartup() or > RmgrStartup(). Done. > Like here. It's obviously not as performance critical as replay. But > it's > still a shame to add 3 calls to GetRmgr, that each then need to > dereference > RmgrTable. The compiler won't be able to optimize any of that away. Changed to only call it once and save it in a variable for each call site where it makes sense. > So we can't filter by rmgr id for non-builtin rmgr's? That sucks. > Maybe add a > custom:<i> syntax? Or generally allow numerical identifiers in > addition to the > names? Good idea. I changed it to allow "custom###" to mean the custom rmgr with ID ### (3 digits). I still may change it to go back to two RmgrTables (one for builtin and one for custom) to remove the lingering performance doubts. Other than that, and some cleanup, this is pretty close to the version I intend to commit. Regards, Jeff Davis
Attachment
Hi, On 2022-03-31 14:20:51 -0700, Jeff Davis wrote: > If you are still concerned, I can switch back to separate tables to > eliminate the indirection for built-in rmgrs. Separate rmgr tables > still require a branch (to decide which table to access), but it should > be a highly predictable one. I still think the easiest and fastest would be to just make RmgrTable longer, and not const. When registering, copy the caller provided struct into the respective RmgrData element. Yes, we'd waste a bit of space at the end of the array, but it's typically not going to be touched and thus not be backed by "actual" memory. Greetings, Andres Freund
On Mon, Apr 4, 2022 at 8:33 AM Jeff Davis <pgsql@j-davis.com> wrote: > > I still may change it to go back to two RmgrTables (one for builtin and > one for custom) to remove the lingering performance doubts. Other than > that, and some cleanup, this is pretty close to the version I intend to > commit. I just had a quick look over the v6 patch, few comments: 1) Why can't rmid be chosen by the extensions in sequential order from (129 till 255), say, to start with a columnar extension can choose 129, another extension can register 130 and so on right? This way, the RmgrStartup, RmgrCleanup, and XLogDumpDisplayRecord can avoid looping over all the 256 entries, with a global variable like current_max_rmid(representing the current number of rmgers)? Is there any specific reason to allow them to choose rmgrid randomly between 129 till 255? Am I missing some discussion here? 2) RM_MAX_ID is now UINT8_MAX - do we need to make it configurable? or do we think that only a few (127) custom rmgrs can exist? 3) Do we need to talk about extensible rmgrs and https://wiki.postgresql.org/wiki/ExtensibleRmgr in documentation somewhere? This will help extension developers to refer when needed. 4) Do we need to change this description? <para> The default value of this setting is the empty string, which disables the feature. It can be set to <literal>all</literal> to check all records, or to a comma-separated list of resource managers to check only records originating from those resource managers. Currently, the supported resource managers are <literal>heap</literal>, <literal>heap2</literal>, <literal>btree</literal>, <literal>hash</literal>, <literal>gin</literal>, <literal>gist</literal>, <literal>sequence</literal>, <literal>spgist</literal>, <literal>brin</literal>, and <literal>generic</literal>. Onl superusers can change this setting. 5) Do we need to add a PGDLLIMPORT qualifier to RegisterCustomRmgr() (possibly for RmgrStartup, RmgrTable, RmgrCleanup as well?) so that the Windows versions of extensions can use this feature? 6) Should the below strcmp pg_strcmp? The custom rmgrs can still use rm_name with different cases and below code fail to catch it? + if (!strcmp(existing_rmgr->rm_name, rmgr->rm_name)) + ereport(PANIC, + (errmsg("custom rmgr \"%s\" has the same name as builtin rmgr", + existing_rmgr->rm_name))); 7) What's the intention of the below code? It seems like below it checks if there's already a rmgr with the given name (RM_MAX_ID). Had it been RM_MAX_BUILTIN_ID instead of RM_MAX_ID, the error message does make sense. Is the intention here to not have duplicate rmgrs in the entire RM_MAX_ID(both builtin and custom)? + /* check for existing rmgr with the same name */ + for (int i = 0; i <= RM_MAX_ID; i++) + { + const RmgrData *existing_rmgr = RmgrTable[i]; + + if (existing_rmgr == NULL) + continue; + + if (!strcmp(existing_rmgr->rm_name, rmgr->rm_name)) + ereport(PANIC, + (errmsg("custom rmgr \"%s\" has the same name as builtin rmgr", + existing_rmgr->rm_name))); 8) Per https://wiki.postgresql.org/wiki/ExtensibleRmgr, 128 is reserved, can we have it as a macro? Or something like (RM_MAX_ID/2+1) +#define RM_MIN_CUSTOM_ID 128 9) Thinking if there's a way to test the code, maybe exposing RegisterCustomRmgr as a function? I think we need to have a dummy extension that will be used for testing this kind of patches/features but that's a separate discussion IMO. Regards, Bharath Rupireddy.
On Sun, 2022-04-03 at 21:33 -0700, Andres Freund wrote: > I still think the easiest and fastest would be to just make RmgrTable > longer, > and not const. When registering, copy the caller provided struct into > the > respective RmgrData element. Yes, we'd waste a bit of space at the > end of the > array, but it's typically not going to be touched and thus not be > backed by > "actual" memory. Sounds good to me. I tried to break down the performance between these approaches and didn't get a clear signal, so I'll go with your intuition here. Posting new patch in response to Bharath's review, which will include this change. Note that GetRmgr() also has an unlikely branch where it tests for the validity of the rmgr before using it, so that it can throw a nice error message if someone forgot to include the module in shared_preload_libraries. I expect this will be highly predictable and therefore not a problem. Regards, Jeff Davis
On Mon, 2022-04-04 at 11:15 +0530, Bharath Rupireddy wrote: > 1) Why can't rmid be chosen by the extensions in sequential order > from > (129 till 255), say, to start with a columnar extension can choose > 129, another extension can register 130 and so on right? I'm not sure what you mean by "chosen by the extensions in sequential order". If you mean: (a) that it would depend on the library load order, and that postgres would assign the ID; then that won't work because the rmgr IDs need to be stable across restarts and config changes (b) that there should be a convention where people reserve IDs on the wiki page in sequential order; then that's fine with me > 2) RM_MAX_ID is now UINT8_MAX - do we need to make it configurable? > or > do we think that only a few (127) custom rmgrs can exist? 127 should be plenty for quite a long time. If we need more, we can come up with a different solution (which is OK because the WAL format changes in new major versions). > 3) Do we need to talk about extensible rmgrs and > https://wiki.postgresql.org/wiki/ExtensibleRmgr in documentation > somewhere? This will help extension developers to refer when needed. We don't have user-facing documentation for every extension hook, and I don't think it would be a good idea to document this one (at least in the user-facing docs). Otherwise, we'd have to document the whole resource manager interface, and that seems like way too much of the internals. > 4) Do we need to change this description? > <para> > The default value of this setting is the empty string, which > disables > the feature. It can be set to <literal>all</literal> to > check all > records, or to a comma-separated list of resource managers to > check > only records originating from those resource > managers. Currently, > the supported resource managers are <literal>heap</literal>, > <literal>heap2</literal>, <literal>btree</literal>, > <literal>hash</literal>, > <literal>gin</literal>, <literal>gist</literal>, > <literal>sequence</literal>, > <literal>spgist</literal>, <literal>brin</literal>, and > <literal>generic</literal>. Onl > superusers can change this setting. Yes, you're right, I updated the docs for pg_waldump and the wal_consistency_checking GUC. > 5) Do we need to add a PGDLLIMPORT qualifier to RegisterCustomRmgr() > (possibly for RmgrStartup, RmgrTable, RmgrCleanup as well?) so that > the Windows versions of extensions can use this feature? That seems to only be required for variables, not functions. I don't think we want to mark RmgrTable this way, because it's not intended to be accessed by extensions directly. It's accessed by GetRmgr(), which is 'static inline', but that's also not intended to be used by extensions. > 6) Should the below strcmp pg_strcmp? The custom rmgrs can still use > rm_name with different cases and below code fail to catch it? > + if (!strcmp(existing_rmgr->rm_name, rmgr->rm_name)) > + ereport(PANIC, > + (errmsg("custom rmgr \"%s\" has the same name as builtin rmgr", > + existing_rmgr->rm_name))); There are already a lot of places where users can choose identifiers that differ only in uppercase/lowercase. I don't see a reason to make an exception here. > 7) What's the intention of the below code? It seems like below it > checks if there's already a rmgr with the given name (RM_MAX_ID). Had > it been RM_MAX_BUILTIN_ID instead of RM_MAX_ID, the error message > does make sense. Is the intention here to not have duplicate rmgrs in > the entire RM_MAX_ID(both builtin and custom)? Thank you. I redid a lot of the error messages and checked them out to make sure they are helpful. > 8) Per https://wiki.postgresql.org/wiki/ExtensibleRmgr, 128 is > reserved, can we have it as a macro? Or something like > (RM_MAX_ID/2+1) It's already defined as RM_EXPERIMENTAL_ID. Is that what you meant or did I misunderstand? I don't see the point in defining it as RM_MAX_ID/2+1. > 9) Thinking if there's a way to test the code, maybe exposing > RegisterCustomRmgr as a function? I think we need to have a dummy > extension that will be used for testing this kind of patches/features > but that's a separate discussion IMO. It's already exposed as a C function in xlog_internal.h. You mean as a SQL function? I don't think that makes sense. It can only be called while processing shared_preload_libraries (on server startup) anyway. Other changes this version: * implemented Andres's proposed performance change[1] * fix wal_consistency_checking * general cleanup Regards, Jeff Davis [1] https://postgr.es/m/20220404043337.ocjnni7hknjsibhg@alap3.anarazel.de
Attachment
On Tue, Apr 5, 2022 at 5:55 AM Jeff Davis <pgsql@j-davis.com> wrote: > > On Mon, 2022-04-04 at 11:15 +0530, Bharath Rupireddy wrote: > > 1) Why can't rmid be chosen by the extensions in sequential order > > from > > (129 till 255), say, to start with a columnar extension can choose > > 129, another extension can register 130 and so on right? > > I'm not sure what you mean by "chosen by the extensions in sequential > order". If you mean: > > (b) that there should be a convention where people reserve IDs on the > wiki page in sequential order; then that's fine with me Yes, I meant this. If we do this, then a global variable current_max_rmid can be maintained (by RegisterCustomRmgr) and the other functions RmgrStartup, RmgrCleanup and RegisterCustomRmgr can avoid for-loops with full range RM_MAX_ID (for (int id = 0; id < current_max_rmid; id++)). > > 3) Do we need to talk about extensible rmgrs and > > https://wiki.postgresql.org/wiki/ExtensibleRmgr in documentation > > somewhere? This will help extension developers to refer when needed. > > We don't have user-facing documentation for every extension hook, and I > don't think it would be a good idea to document this one (at least in > the user-facing docs). Otherwise, we'd have to document the whole > resource manager interface, and that seems like way too much of the > internals. With the custom rmgrs, now it's time to have it in the documentation (probably not immediately, even after this patch gets in) describing what rmgrs are, what are built-in and custom rmgrs and how and when an extension needs to register, load and use rmgrs etc. > > 4) Do we need to change this description? > > <para> > > <literal>generic</literal>. Onl > > superusers can change this setting. > > Yes, you're right, I updated the docs for pg_waldump and the > wal_consistency_checking GUC. + <para> + Extensions may define additional resource managers with modules loaded + in <xref linkend="guc-shared-preload-libraries"/>. However, the names + of the custom resource managers are not available at the time the + errhint("Include the extension module that implements this resource manager in shared_preload_libraries."))); With the above, do you mean to say that only the extensions that load their library via shared_preload_libraries are allowed to have custom rmgrs? How about extensions using {session, local}_preload_libraries, LOAD command? How about extensions that don't load the shared library via {session, local, shared}_preload_libraries or LOAD command and let any of its functions load the shared library on first use? > > 5) Do we need to add a PGDLLIMPORT qualifier to RegisterCustomRmgr() > > (possibly for RmgrStartup, RmgrTable, RmgrCleanup as well?) so that > > the Windows versions of extensions can use this feature? > > That seems to only be required for variables, not functions. You are right. For instance, the functions DefineCustomXXX are not using PGDLLIMPORT qualifer, I believe they can be used by extensions on WINDOWS. > > 6) Should the below strcmp pg_strcmp? The custom rmgrs can still use > > rm_name with different cases and below code fail to catch it? > > + if (!strcmp(existing_rmgr->rm_name, rmgr->rm_name)) > > + ereport(PANIC, > > + (errmsg("custom rmgr \"%s\" has the same name as builtin rmgr", > > + existing_rmgr->rm_name))); > > There are already a lot of places where users can choose identifiers > that differ only in uppercase/lowercase. I don't see a reason to make > an exception here. I think postgres parses the user-specified strings and converts them to lowercase unless they are double-quoted, that is why we see strcmp, not pg_strcasecmp. But, the custom rmgr name isn't parsed by postgres right? For instance - extension 1, with id 130 and name 'rmgr_foo' and extension 2 with id 131 and name 'Rmgr_foo'/'RMGR_FOO'...., then the following code may not catch it right? Do you want rmgr names to be case-sensitive/insensitive? + if (!strcmp(RmgrTable[i].rm_name, rmgr->rm_name)) + ereport(PANIC, + (errmsg("failed to register custom resource manager \"%s\" with ID %d", rmgr->rm_name, rmid), + errdetail("Existing resource manager with ID %d has the same name.", i))); > > 8) Per https://wiki.postgresql.org/wiki/ExtensibleRmgr, 128 is > > reserved, can we have it as a macro? Or something like > > (RM_MAX_ID/2+1) > > It's already defined as RM_EXPERIMENTAL_ID. Is that what you meant or > did I misunderstand? +#define RM_MAX_ID UINT8_MAX +#define RM_MAX_BUILTIN_ID (RM_NEXT_ID - 1) +#define RM_MIN_CUSTOM_ID 128 +#define RM_N_CUSTOM_IDS (RM_MAX_ID - RM_MIN_CUSTOM_ID + 1) +#define RMID_IS_BUILTIN(rmid) ((rmid) <= RM_MAX_BUILTIN_ID) +#define RMID_IS_CUSTOM(rmid) ((rmid) >= RM_MIN_CUSTOM_ID && (rmid) < RM_MAX_ID) +#define RMID_IS_VALID(rmid) (RMID_IS_BUILTIN((rmid)) || RMID_IS_CUSTOM((rmid))) To me the above seems complex, why can't we just have the following, it's okay even if we don't use some of the macros in the *.c files? total rmgrs = 256 built in = 128 (index 0 - 127) custom = 128 (index 128 - 255) #define RM_N_IDS (UINT8_MAX + 1) #define RM_N_BUILTIN_IDS (RM_N_IDS / 2) #define RM_N_CUSTOM_IDS (RM_N_IDS / 2) #define RM_MAX_ID (RM_N_IDS - 1) #define RM_MAX_BUILTIN_ID (RM_NEXT_ID - 1) #define RM_MIN_CUSTOM_ID (RM_N_IDS / 2) #define RM_MAX_CUSTOM_ID RM_MAX_ID #define RMID_IS_BUILTIN(rmid) ((rmid) <= RM_MAX_BUILTIN_ID) #define RMID_IS_CUSTOM(rmid) ((rmid) >= RM_MIN_CUSTOM_ID && (rmid) <= RM_MAX_CUSTOM_ID) #define RMID_IS_VALID(rmid) (RMID_IS_BUILTIN((rmid)) || RMID_IS_CUSTOM((rmid))) If okay, you can specify in a comment that out of UINT8_MAX + 1 rmgrs, first half rmgr ids are supposed to be reserved for built in rmgrs and second half rmgr ids for extensions. > I don't see the point in defining it as RM_MAX_ID/2+1. > > > 9) Thinking if there's a way to test the code, maybe exposing > > RegisterCustomRmgr as a function? I think we need to have a dummy > > extension that will be used for testing this kind of patches/features > > but that's a separate discussion IMO. > > It's already exposed as a C function in xlog_internal.h. > > You mean as a SQL function? I don't think that makes sense. It can only > be called while processing shared_preload_libraries (on server startup) > anyway. Yes, an SQL function showing all of the available rmgrs. With the custom rmgrs, I think, an SQL function like pg_resource_managers() returning a set of {name, id, (if possible - loaded_by_extension_name, rm_redo, rm_desc, rm_identify ... api names)} Some more comments: 1) Is there any specific reason to have a function to emit a PANIC message? And it's being used only in one place in GetRmgr. +void +RmgrNotFound(RmgrId rmid) 2) How about "If rm_name is NULL, the corresponding RmgrTable[] entry is considered invalid."? + * RmgrTable[] is indexed by RmgrId values (see rmgrlist.h). If rm_name is + * NULL, the structure is considered invalid. 3) How about adding errdetail something like "Custom resource manager ID must be between %d and %d (both inclusive)", RM_MAX_BUILTIN_ID, RM_MAX_ID)? + if (rmid < RM_MIN_CUSTOM_ID) + ereport(PANIC, errmsg("custom resource manager ID %d is out of range", rmid)); 4) How about adding errdetail something like "Custom resource manager must have a name" + if (rmgr->rm_name == NULL) + ereport(PANIC, errmsg("custom resource manager is invalid")); 5) How about "successfully registered custom resource manager"? + (errmsg("registered custom resource manager \"%s\" with ID %d", 6) Shouldn't this be ri <= RM_NEXT_ID? - for (ri = 0; ri < RM_NEXT_ID; ri++) + for (ri = 0; ri < RM_MAX_ID; ri++) - for (ri = 0; ri < RM_NEXT_ID; ri++) + for (ri = 0; ri <= RM_MAX_ID; ri++) 7) Unrelated to this patch, why is there always an extra slot RM_MAX_ID + 1? -const RmgrDescData RmgrDescTable[RM_MAX_ID + 1] = { +static const RmgrDescData RmgrDescTable[RM_MAX_BUILTIN_ID + 1] = { #include "access/rmgrlist.h" -const RmgrData RmgrTable[RM_MAX_ID + 1] = { +RmgrData RmgrTable[RM_MAX_ID + 1] = { Regards, Bharath Rupireddy.
On Tue, 2022-04-05 at 15:12 +0530, Bharath Rupireddy wrote: > Yes, I meant this. If we do this, then a global variable > current_max_rmid can be maintained (by RegisterCustomRmgr) and the > other functions RmgrStartup, RmgrCleanup and RegisterCustomRmgr can > avoid for-loops with full range RM_MAX_ID (for (int id = 0; id < > current_max_rmid; id++)). There are only 255 entries, so the loops in those functions aren't a significant overhead. If we expand it, then we can come up a different solution. We can change it easily enough later, because the WAL format changes between releases and so do the APIs for any extension that might be using this. > With the custom rmgrs, now it's time to have it in the documentation > (probably not immediately, even after this patch gets in) describing > what rmgrs are, what are built-in and custom rmgrs and how and when > an > extension needs to register, load and use rmgrs etc. Added a documentation section right after Generic WAL. > With the above, do you mean to say that only the extensions that load > their library via shared_preload_libraries are allowed to have custom > rmgrs? How about extensions using {session, local}_preload_libraries, > LOAD command? How about extensions that don't load the shared library > via {session, local, shared}_preload_libraries or LOAD command and > let > any of its functions load the shared library on first use? Correct. The call to RegisterCustomRmgr *must* happen in the postmaster before any other processes (even the startup process) are launched. Otherwise recovery wouldn't work. The only place to make that happen is shared_preload_libraries. > For instance - extension 1, with id 130 and name 'rmgr_foo' and > extension 2 with id 131 and name 'Rmgr_foo'/'RMGR_FOO'...., then the > following code may not catch it right? Do you want rmgr names to be > case-sensitive/insensitive? You convinced me. pg_waldump.c uses pg_strcasecmp(), so I'll use it as well. > #define RM_N_IDS (UINT8_MAX + 1) > #define RM_N_BUILTIN_IDS (RM_N_IDS / 2) > #define RM_N_CUSTOM_IDS (RM_N_IDS / 2) > #define RM_MAX_ID (RM_N_IDS - 1) > #define RM_MAX_BUILTIN_ID (RM_NEXT_ID - 1) > #define RM_MIN_CUSTOM_ID (RM_N_IDS / 2) > #define RM_MAX_CUSTOM_ID RM_MAX_ID > #define RMID_IS_BUILTIN(rmid) ((rmid) <= RM_MAX_BUILTIN_ID) > #define RMID_IS_CUSTOM(rmid) ((rmid) >= RM_MIN_CUSTOM_ID && (rmid) <= > RM_MAX_CUSTOM_ID) > #define RMID_IS_VALID(rmid) (RMID_IS_BUILTIN((rmid)) || > RMID_IS_CUSTOM((rmid))) I improved this a bit. But I didn't use your version that's based on division, that was more confusing to me. > Yes, an SQL function showing all of the available rmgrs. With the > custom rmgrs, I think, an SQL function like pg_resource_managers() > returning a set of {name, id, (if possible - > loaded_by_extension_name, > rm_redo, rm_desc, rm_identify ... api names)} Added. > > Some more comments: > > 1) Is there any specific reason to have a function to emit a PANIC > message? And it's being used only in one place in GetRmgr. > +void > +RmgrNotFound(RmgrId rmid) Good call -- better to raise an error, and it can get escalated if it happens in the wrong place. Done. > 2) How about "If rm_name is NULL, the corresponding RmgrTable[] entry > is considered invalid."? > + * RmgrTable[] is indexed by RmgrId values (see rmgrlist.h). If > rm_name is > + * NULL, the structure is considered invalid. Done. > 3) How about adding errdetail something like "Custom resource manager > ID must be between %d and %d (both inclusive)", RM_MAX_BUILTIN_ID, > RM_MAX_ID)? > + if (rmid < RM_MIN_CUSTOM_ID) > + ereport(PANIC, errmsg("custom resource manager ID %d is out of > range", rmid)); Done. > 4) How about adding errdetail something like "Custom resource manager > must have a name" > + if (rmgr->rm_name == NULL) > + ereport(PANIC, errmsg("custom resource manager is invalid")); Improved this message. > 5) How about "successfully registered custom resource manager"? > + (errmsg("registered custom resource manager \"%s\" with ID %d", That's a little much, we don't want to sound too excited that it worked ;-) > 6) Shouldn't this be ri <= RM_NEXT_ID? > - for (ri = 0; ri < RM_NEXT_ID; ri++) > + for (ri = 0; ri < RM_MAX_ID; ri++) > > - for (ri = 0; ri < RM_NEXT_ID; ri++) > + for (ri = 0; ri <= RM_MAX_ID; ri++) That would leave out the custom rmgrs. > 7) Unrelated to this patch, why is there always an extra slot > RM_MAX_ID + 1? RM_MAX_ID is 255, but we need 256 entries from 0-255 (inclusive). Good suggestions, thank you for the review! I'm happy with this patch and I committed it. Changes from last version: 1. I came up with a better solution (based on a suggestion from Andres) to handle wal_consistency_checking properly. We can't understand the custom resource managers when the configuration file is first loaded, so if we enounter an unknown resource manager, I set a flag and reprocess it after the shared_preload_libraries are loaded. 2. Added pg_get_wal_resource_managers(). 3. Documentation. Regards, Jeff Davis
Hi, On 2022-04-06 23:15:27 -0700, Jeff Davis wrote: > I'm happy with this patch and I committed it. That caused breakage with WAL_DEBUG enabled. Fixed that. But it was half-broken before, at least since 70e81861fad, because 'rmid' didn't refer to the current record's rmid anymore, but to rmid from "Initialize resource managers" - a constant. Causes plenty new warnings here: In file included from /home/andres/src/postgresql/src/include/access/xlogrecord.h:14, from /home/andres/src/postgresql/src/include/access/xlogreader.h:41, from /home/andres/src/postgresql/src/include/access/brin_xlog.h:17, from /home/andres/src/postgresql/src/backend/access/transam/rmgr.c:10: /home/andres/src/postgresql/src/backend/access/transam/rmgr.c: In function ‘RegisterCustomRmgr’: /home/andres/src/postgresql/src/include/access/rmgr.h:42:66: warning: comparison is always true due to limited range of datatype [-Wtype-limits] 42 | (rmid) <= RM_MAX_CUSTOM_ID) | ^~ /home/andres/src/postgresql/src/backend/access/transam/rmgr.c:104:14: note: in expansion of macro ‘RMID_IS_CUSTOM’ 104 | if (!RMID_IS_CUSTOM(rmid)) | ^~~~~~~~~~~~~~ In file included from /home/andres/src/postgresql/src/include/c.h:814, from /home/andres/src/postgresql/src/include/postgres.h:46, from /home/andres/src/postgresql/src/bin/pg_waldump/rmgrdesc.c:9: /home/andres/src/postgresql/src/bin/pg_waldump/rmgrdesc.c: In function ‘GetRmgrDesc’: /home/andres/src/postgresql/src/include/access/rmgr.h:42:66: warning: comparison is always true due to limited range of datatype [-Wtype-limits] 42 | (rmid) <= RM_MAX_CUSTOM_ID) | ^~ /home/andres/src/postgresql/src/bin/pg_waldump/rmgrdesc.c:89:9: note: in expansion of macro ‘Assert’ 89 | Assert(RMID_IS_VALID(rmid)); | ^~~~~~ /home/andres/src/postgresql/src/include/access/rmgr.h:43:57: note: in expansion of macro ‘RMID_IS_CUSTOM’ 43 | #define RMID_IS_VALID(rmid) (RMID_IS_BUILTIN((rmid)) || RMID_IS_CUSTOM((rmid))) | ^~~~~~~~~~~~~~ /home/andres/src/postgresql/src/bin/pg_waldump/rmgrdesc.c:89:16: note: in expansion of macro ‘RMID_IS_VALID’ 89 | Assert(RMID_IS_VALID(rmid)); | ^~~~~~~~~~~~~ /home/andres/src/postgresql/src/include/access/rmgr.h:42:66: warning: comparison is always true due to limited range of datatype [-Wtype-limits] 42 | (rmid) <= RM_MAX_CUSTOM_ID) | ^~ /home/andres/src/postgresql/src/bin/pg_waldump/rmgrdesc.c:89:9: note: in expansion of macro ‘Assert’ 89 | Assert(RMID_IS_VALID(rmid)); | ^~~~~~ /home/andres/src/postgresql/src/include/access/rmgr.h:43:57: note: in expansion of macro ‘RMID_IS_CUSTOM’ 43 | #define RMID_IS_VALID(rmid) (RMID_IS_BUILTIN((rmid)) || RMID_IS_CUSTOM((rmid))) | ^~~~~~~~~~~~~~ /home/andres/src/postgresql/src/bin/pg_waldump/rmgrdesc.c:89:16: note: in expansion of macro ‘RMID_IS_VALID’ 89 | Assert(RMID_IS_VALID(rmid)); | ^~~~~~~~~~~~~ Greetings, Andres Freund
Hi, On 2022-04-06 23:35:05 -0700, Andres Freund wrote: > Causes plenty new warnings here: And my machine isn't alone. There's also: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing&dt=2022-04-07%2006%3A40%3A14 rmgrdesc.c:44:1: error: missing braces around initializer [-Werror=missing-braces] rmgrdesc.c:44:1: error: (near initialization for 'CustomNumericNames[0]') [-Werror=missing-braces] rmgrdesc.c:45:1: error: missing braces around initializer [-Werror=missing-braces] rmgrdesc.c:45:1: error: (near initialization for 'CustomRmgrDesc[0]') [-Werror=missing-braces] cc1: all warnings being treated as errors Greetings, Andres Freund
Hi, On 2022-04-06 23:56:40 -0700, Andres Freund wrote: > On 2022-04-06 23:35:05 -0700, Andres Freund wrote: > > Causes plenty new warnings here: > > And my machine isn't alone. There's also: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lapwing&dt=2022-04-07%2006%3A40%3A14 > > rmgrdesc.c:44:1: error: missing braces around initializer [-Werror=missing-braces] > rmgrdesc.c:44:1: error: (near initialization for 'CustomNumericNames[0]') [-Werror=missing-braces] > rmgrdesc.c:45:1: error: missing braces around initializer [-Werror=missing-braces] > rmgrdesc.c:45:1: error: (near initialization for 'CustomRmgrDesc[0]') [-Werror=missing-braces] > cc1: all warnings being treated as errors (pushing a fix for this one in a few) There's also: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=wrasse&dt=2022-04-07%2006%3A49%3A14 ccache /opt/developerstudio12.6/bin/cc -m64 -Xa -v -O findtimezone.o initdb.o localtime.o -L../../../src/port -L../../../src/common-L../../../src/fe_utils -lpgfeutils -L../../../src/common -lpgcommon -L../../../src/port -lpgport -L../../../src/interfaces/libpq-lpq -L/home/nm/sw/nopath/uuid-64/lib -L/home/nm/sw/nopath/openldap-64/lib -Wl,--as-needed-Wl,-R'/home/nm/farm/studio64v12_6/HEAD/inst/lib' -lpgcommon -lpgport -lxslt -lxml2 -lpam -lssl -lcrypto-lgssapi_krb5 -lz -lreadline -ltermcap -lnsl -lsocket -lm -o initdb Undefined first referenced symbol in file RmgrTable initdb.o IIRC sun studio is annoying because it emits references to global variables when they're mentioned in a static inline, even if that inline is unused. Greetings, Andres Freund
Hi, On 2022-04-07 00:45:34 -0700, Andres Freund wrote: > On 2022-04-06 23:56:40 -0700, Andres Freund wrote: > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=wrasse&dt=2022-04-07%2006%3A49%3A14 > > ccache /opt/developerstudio12.6/bin/cc -m64 -Xa -v -O findtimezone.o initdb.o localtime.o -L../../../src/port -L../../../src/common-L../../../src/fe_utils -lpgfeutils -L../../../src/common -lpgcommon -L../../../src/port -lpgport -L../../../src/interfaces/libpq-lpq -L/home/nm/sw/nopath/uuid-64/lib -L/home/nm/sw/nopath/openldap-64/lib -Wl,--as-needed-Wl,-R'/home/nm/farm/studio64v12_6/HEAD/inst/lib' -lpgcommon -lpgport -lxslt -lxml2 -lpam -lssl -lcrypto-lgssapi_krb5 -lz -lreadline -ltermcap -lnsl -lsocket -lm -o initdb > Undefined first referenced > symbol in file > RmgrTable initdb.o > > IIRC sun studio is annoying because it emits references to global variables > when they're mentioned in a static inline, even if that inline is unused. Looks like a #ifndef FRONTEND around the inlines should work? Or just go back to the macros, but without the RM_MAX_CUSTOM_ID check that's implied due to the type width anyway. Greetings, Andres Freund
On Thu, Apr 7, 2022 at 11:45 AM Jeff Davis <pgsql@j-davis.com> wrote: > I'm happy with this patch and I committed it. Changes from last > version: Hi Jeff, I think there's a typo [1] in pg_waldump.c, we might miss to take account of the last custom resource mgr stats. Please fix it. diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index 7d92dcaf87..30ca7684bd 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -720,7 +720,7 @@ XLogDumpDisplayStats(XLogDumpConfig *config, XLogDumpStats *stats) * calculate column totals. */ - for (ri = 0; ri < RM_MAX_ID; ri++) + for (ri = 0; ri <= RM_MAX_ID; ri++) { total_count += stats->rmgr_stats[ri].count; total_rec_len += stats->rmgr_stats[ri].rec_len; Regards, Bharath Rupireddy.
Jeff Davis <pgsql@j-davis.com> writes: > I'm happy with this patch and I committed it. wrasse is less happy: ccache /opt/developerstudio12.6/bin/cc -m64 -Xa -v -O findtimezone.o initdb.o localtime.o -L../../../src/port -L../../../src/common-L../../../src/fe_utils -lpgfeutils -L../../../src/common -lpgcommon -L../../../src/port -lpgport -L../../../src/interfaces/libpq-lpq -L/home/nm/sw/nopath/uuid-64/lib -L/home/nm/sw/nopath/openldap-64/lib -Wl,--as-needed-Wl,-R'/home/nm/farm/studio64v12_6/HEAD/inst/lib' -lpgcommon -lpgport -lxslt -lxml2 -lpam -lssl -lcrypto-lgssapi_krb5 -lz -lreadline -ltermcap -lnsl -lsocket -lm -o initdb Undefined first referenced symbol in file RmgrTable initdb.o ld: fatal: symbol referencing errors I think this means the static inline functions you added in xlog_internal.h need to be wrapped in #ifndef FRONTEND. regards, tom lane