Thread: split rm_name and rm_desc out of rmgr.c
Hi, pg_xlogdump needs access to the *_desc functions for each rmgr. We already moved forward quite a bit by splitting those functions out of their containing files; so now they are compilable separately. Good. The remaining task is enabling the code to find those functions in the first place; currently, the function pointers live in rmgr.c which is not compilable by frontend code because it contains pointers to other functions. Hence the attached patch splits RmgrData into two; the names and rm_desc functions go into a new file which can be compiled easily by frontend. Proposed patch attached. This comes from http://www.postgresql.org/message-id/20130204180327.GH4963@alvh.no-ip.org which is part of the pg_xlogdump patch in commitfest. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 4 February 2013 20:57, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > pg_xlogdump needs access to the *_desc functions for each rmgr. We > already moved forward quite a bit by splitting those functions out of > their containing files; so now they are compilable separately. Good. > The remaining task is enabling the code to find those functions in the > first place; currently, the function pointers live in rmgr.c which is > not compilable by frontend code because it contains pointers to other > functions. Hence the attached patch splits RmgrData into two; the names > and rm_desc functions go into a new file which can be compiled easily by > frontend. > > Proposed patch attached. Not meaning to cause you extra work, but some kind of id as the first field of each structure would allow a cross-check that there is no misconfiguration. -- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs wrote: > On 4 February 2013 20:57, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > > pg_xlogdump needs access to the *_desc functions for each rmgr. We > > already moved forward quite a bit by splitting those functions out of > > their containing files; so now they are compilable separately. Good. > > The remaining task is enabling the code to find those functions in the > > first place; currently, the function pointers live in rmgr.c which is > > not compilable by frontend code because it contains pointers to other > > functions. Hence the attached patch splits RmgrData into two; the names > > and rm_desc functions go into a new file which can be compiled easily by > > frontend. > > > > Proposed patch attached. > > Not meaning to cause you extra work, but some kind of id as the first > field of each structure would allow a cross-check that there is no > misconfiguration. Well, we could add the rmgr_id as the first struct member to both tables and add an Assert() somewhere in xlog.c. However, do we really need that? These tables are almost immutable, and failure to edit both simultaneously should be promptly detected. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > pg_xlogdump needs access to the *_desc functions for each rmgr. We > already moved forward quite a bit by splitting those functions out of > their containing files; so now they are compilable separately. Good. > The remaining task is enabling the code to find those functions in the > first place; currently, the function pointers live in rmgr.c which is > not compilable by frontend code because it contains pointers to other > functions. Hence the attached patch splits RmgrData into two; the names > and rm_desc functions go into a new file which can be compiled easily by > frontend. > Proposed patch attached. This seems pretty ugly to me. Couldn't we do something similar to the design for SQL keyword constants, wherein the actual data is in macros in a header file (providing exactly one source of truth for each RM) and then various .c files can #include that after #defining the macro as they need? See src/include/parser/kwlist.h and the files that include that. regards, tom lane
On 2013-02-04 17:27:39 -0500, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > pg_xlogdump needs access to the *_desc functions for each rmgr. We > > already moved forward quite a bit by splitting those functions out of > > their containing files; so now they are compilable separately. Good. > > The remaining task is enabling the code to find those functions in the > > first place; currently, the function pointers live in rmgr.c which is > > not compilable by frontend code because it contains pointers to other > > functions. Hence the attached patch splits RmgrData into two; the names > > and rm_desc functions go into a new file which can be compiled easily by > > frontend. > > > Proposed patch attached. > > This seems pretty ugly to me. > > Couldn't we do something similar to the design for SQL keyword constants, > wherein the actual data is in macros in a header file (providing exactly > one source of truth for each RM) and then various .c files can #include > that after #defining the macro as they need? See > src/include/parser/kwlist.h and the files that include that. My original patch just surrounded the backend-only functions by a macro which was defined to NULL in #ifdef FRONTEND. I still think thats the best way to go. People objected to that though... Other than that I find duplicating the whole table by far the best approach. Splitting parts off seems to be more complex than warranted without actually getting rid of duplication. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > pg_xlogdump needs access to the *_desc functions for each rmgr. We > > already moved forward quite a bit by splitting those functions out of > > their containing files; so now they are compilable separately. Good. > > The remaining task is enabling the code to find those functions in the > > first place; currently, the function pointers live in rmgr.c which is > > not compilable by frontend code because it contains pointers to other > > functions. Hence the attached patch splits RmgrData into two; the names > > and rm_desc functions go into a new file which can be compiled easily by > > frontend. > > > Proposed patch attached. > > This seems pretty ugly to me. > > Couldn't we do something similar to the design for SQL keyword constants, > wherein the actual data is in macros in a header file (providing exactly > one source of truth for each RM) and then various .c files can #include > that after #defining the macro as they need? See > src/include/parser/kwlist.h and the files that include that. Meh. I proposed this months ago and was shot down. I still like it better than what I propose here, so I will resurrect it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Tom Lane wrote: > Couldn't we do something similar to the design for SQL keyword constants, > wherein the actual data is in macros in a header file (providing exactly > one source of truth for each RM) and then various .c files can #include > that after #defining the macro as they need? Here are two patches implementing this idea. The first one is simpler and just replaces the table in rmgr.c with an appropriate PG_RMGR define. The second one touches rmgr.h as well. That file currently has a list of #defines with symbolic rmgr names and their numeric IDs. The approach in the second patch is to turn these into "extern const RmgrId" instead, and use a second inclusion of rmgrlist.h in rmgr.c that assigns them the values as consts. This has the disadvantage that the array size RM_MAX_ID cannot use the symbolic RM_SPGIST_ID value, but instead it needs a literal "16". Otherwise the compile complains: error: variably modified ‘RmgrTable’ at file scope I am not too sure about that second change. However, I'm a bit uneasy about leaving a second list of rmgrs around; since we're creating what should be the canonical list of rmgrs, it makes sense to reduce the two copies we have. Better ideas for that second list are welcome. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> Couldn't we do something similar to the design for SQL keyword constants, >> wherein the actual data is in macros in a header file (providing exactly >> one source of truth for each RM) and then various .c files can #include >> that after #defining the macro as they need? > Here are two patches implementing this idea. The first one is simpler > and just replaces the table in rmgr.c with an appropriate PG_RMGR > define. > The second one touches rmgr.h as well. That file currently has a list > of #defines with symbolic rmgr names and their numeric IDs. Unifying that with this one-source-of-truth seems attractive ... > The > approach in the second patch is to turn these into "extern const RmgrId" > instead, and use a second inclusion of rmgrlist.h in rmgr.c that assigns > them the values as consts. ... but I don't especially like that implementation, as it will result in nonzero code bloat and runtime cost due to replacing all those constants with global-variable references. Couldn't you instead set it up as an enum definition? Something like #define PG_RMGR(...) sym = num, typedef enum { #include ...RM_NEXT_ID } RmgrIds; #define RM_LAST_ID (RM_NEXT_ID-1) I'm not actually sure that we need the explicit numbers in the macros if we do this. That is, we could just have "#define PG_RMGR(...) sym," and say that the order of the entries in rmgrlist.h is what defines the manager IDs. The original coding allowed for gaps in the ID list but I don't see much value in that. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > The > > approach in the second patch is to turn these into "extern const RmgrId" > > instead, and use a second inclusion of rmgrlist.h in rmgr.c that assigns > > them the values as consts. > > ... but I don't especially like that implementation, as it will result > in nonzero code bloat and runtime cost due to replacing all those > constants with global-variable references. Couldn't you instead set it > up as an enum definition? That seems to work. I would like to have some way of specifying that the enum members should be of type RmgrId, but I don't think there's any way to do that. Patch attached. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Patch attached. Seems like a couple of the comments could use updates: > * Note: RM_MAX_ID could be as much as 255 without breaking the XLOG file > * format, but we keep it small to minimize the size of RmgrTable[]. This is no longer particularly sensible, since we're no longer making any provision for wasted RmgrIds. Perhaps rephrase as "RM_MAX_ID must fit in RmgrId; widening that type will affect the XLOG file format." > + * List of resource manager entries. Note that order of entries defines the > + * numerical values of each rmgr's ID. > + * > + * Changes to this list possibly need a XLOG_PAGE_MAGIC bump. Probably also a good idea to state explicitly that new entries should go at the end to avoid moving the IDs of existing entries. Works for me otherwise. regards, tom lane
On 2/5/13 3:47 PM, Alvaro Herrera wrote: > Tom Lane wrote: >> Alvaro Herrera <alvherre@2ndquadrant.com> writes: > >>> The >>> approach in the second patch is to turn these into "extern const RmgrId" >>> instead, and use a second inclusion of rmgrlist.h in rmgr.c that assigns >>> them the values as consts. >> >> ... but I don't especially like that implementation, as it will result >> in nonzero code bloat and runtime cost due to replacing all those >> constants with global-variable references. Couldn't you instead set it >> up as an enum definition? > > That seems to work. I would like to have some way of specifying that > the enum members should be of type RmgrId, but I don't think there's any > way to do that. > > Patch attached. This has broken cpluspluscheck: ./src/include/access/rmgrlist.h:28:8: error: expected constructor, destructor, or type conversion before '(' token
Peter Eisentraut <peter_e@gmx.net> writes: > On 2/5/13 3:47 PM, Alvaro Herrera wrote: >> Patch attached. > This has broken cpluspluscheck: > ./src/include/access/rmgrlist.h:28:8: error: expected constructor, destructor, or type conversion before '(' token cpluspluscheck has an explicit exclusion for kwlist.h, looks like it needs one for rmgrlist.h as well, since neither of them are meant to compile standalone. regards, tom lane