Re: Extensible Rmgr for Table AMs - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Extensible Rmgr for Table AMs
Msg-id 20220204144801.wuwcansdzz3w2nn3@jrouhaud
In response to Re: Extensible Rmgr for Table AMs  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Extensible Rmgr for Table AMs
List pgsql-hackers
Hi,

On Fri, Feb 04, 2022 at 09:10:42AM -0500, Robert Haas wrote:
> On Thu, Feb 3, 2022 at 12:34 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> > I agree that having dozens of custom rmgrs doesn't seem likely, but I also have
> > no idea how much overhead you get by not doing a direct array access.  I
> > think it would be informative to benchmark something like a simple OLTP write
> > workload on fast storage (or a ramdisk, or with fsync off...), with the rmgr
> > in use being the 1st and then the 2nd custom rmgr.  Both scenarios still seem
> > plausible and shouldn't degenerate on good hardware.
> 
> I think it would be hard to measure the overhead of this approach on a
> macrobenchmark.

Yeah, that was also my initial thought, but I wouldn't be terribly surprised to
be wrong.

> That having been said, I find this a surprising
> implementation choice. I think that the approaches that are most worth
> considering are:
> 
> (1) reallocate the array if needed so that we can continue to just do
> RmgrTable[rmid]
> (2) have one array for builtins and a second array for extensions and
> do rmid < RM_CUSTOM_MIN_ID ? BuiltinRmgrTable[rmid] :
> ExtensionRmgrTable[rmid]
> (3) change RmgrTable to be an array of pointers to structs rather than
> an array of structs. Then the structs don't move around and can be
> const, but the pointers can be moved into a larger array if required
> 
> I'm not really sure which is best. My intuition for what will be
> cheapest on modern hardware is pretty shaky. However, I can't see how
> it can be the thing the patch is doing now; a linear search seems like
> it has to be the slowest option.

I guess the idea was to strike a compromise: let rmgr authors choose arbitrary
ids to avoid conflicts, especially with private implementations, without
wasting too much memory.  But the approaches you list would be pretty much
incompatible with the current definition:

+#define RM_CUSTOM_MIN_ID       128
+#define RM_CUSTOM_MAX_ID       UINT8_MAX

even if you only allocate up to the max id found, nothing guarantees that you
won't get a quite high id, in which case the array ends up close to full size
anyway.
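
To make the memory point concrete, here is a minimal sketch of approach (3), an
array of pointers sized to the highest id registered.  Only RM_CUSTOM_MIN_ID
and RM_CUSTOM_MAX_ID come from the patch; DemoRmgr, demo_register(),
demo_lookup() and demo_table_bytes() are hypothetical names made up for this
illustration, and plain realloc() stands in for whatever allocator the real
code would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define RM_CUSTOM_MIN_ID       128
#define RM_CUSTOM_MAX_ID       UINT8_MAX

/* Hypothetical stand-in for the real resource manager struct. */
typedef struct DemoRmgr
{
    const char *rm_name;
} DemoRmgr;

/* Approach (3): array of pointers, grown to cover the highest id seen. */
static const DemoRmgr **demo_table = NULL;
static size_t demo_table_len = 0;

/* Register a custom rmgr; returns 0 on success, -1 on failure. */
static int
demo_register(int rmid, const DemoRmgr *rmgr)
{
    if (rmid < RM_CUSTOM_MIN_ID || rmid > RM_CUSTOM_MAX_ID)
        return -1;

    if ((size_t) rmid >= demo_table_len)
    {
        size_t      new_len = (size_t) rmid + 1;
        const DemoRmgr **grown;

        grown = realloc(demo_table, new_len * sizeof(*grown));
        if (grown == NULL)
            return -1;

        /* New slots start out empty. */
        for (size_t i = demo_table_len; i < new_len; i++)
            grown[i] = NULL;

        demo_table = grown;
        demo_table_len = new_len;
    }

    if (demo_table[rmid] != NULL)
        return -1;              /* id already taken */

    demo_table[rmid] = rmgr;
    return 0;
}

/* Direct array access; NULL means no rmgr is registered under that id. */
static const DemoRmgr *
demo_lookup(int rmid)
{
    if (rmid < 0 || (size_t) rmid >= demo_table_len)
        return NULL;
    return demo_table[rmid];
}

/* Bytes currently used by the pointer array itself. */
static size_t
demo_table_bytes(void)
{
    return demo_table_len * sizeof(*demo_table);
}
```

With a single extension registering id 250, the table already needs 251 slots,
so sizing by "max id found" saves very little once someone picks a high id.
The same slot count would apply to an array of structs as in approach (1),
only with a larger element size.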


