Thread: split rm_name and rm_desc out of rmgr.c

split rm_name and rm_desc out of rmgr.c

From
Alvaro Herrera
Date:
Hi,

pg_xlogdump needs access to the *_desc functions for each rmgr.  We
already moved forward quite a bit by splitting those functions out of
their containing files; so now they are compilable separately.  Good.
The remaining task is enabling the code to find those functions in the
first place; currently, the function pointers live in rmgr.c which is
not compilable by frontend code because it contains pointers to other
functions.  Hence the attached patch splits RmgrData into two; the names
and rm_desc functions go into a new file which can be compiled easily by
frontend.

Proposed patch attached.

This comes from
http://www.postgresql.org/message-id/20130204180327.GH4963@alvh.no-ip.org
which is part of the pg_xlogdump patch in commitfest.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: split rm_name and rm_desc out of rmgr.c

From
Simon Riggs
Date:
On 4 February 2013 20:57, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> pg_xlogdump needs access to the *_desc functions for each rmgr.  We
> already moved forward quite a bit by splitting those functions out of
> their containing files; so now they are compilable separately.  Good.
> The remaining task is enabling the code to find those functions in the
> first place; currently, the function pointers live in rmgr.c which is
> not compilable by frontend code because it contains pointers to other
> functions.  Hence the attached patch splits RmgrData into two; the names
> and rm_desc functions go into a new file which can be compiled easily by
> frontend.
>
> Proposed patch attached.

Not meaning to cause you extra work, but some kind of id as the first
field of each structure would allow a cross-check that there is no
misconfiguration.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: split rm_name and rm_desc out of rmgr.c

From
Alvaro Herrera
Date:
Simon Riggs wrote:
> On 4 February 2013 20:57, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> > pg_xlogdump needs access to the *_desc functions for each rmgr.  We
> > already moved forward quite a bit by splitting those functions out of
> > their containing files; so now they are compilable separately.  Good.
> > The remaining task is enabling the code to find those functions in the
> > first place; currently, the function pointers live in rmgr.c which is
> > not compilable by frontend code because it contains pointers to other
> > functions.  Hence the attached patch splits RmgrData into two; the names
> > and rm_desc functions go into a new file which can be compiled easily by
> > frontend.
> >
> > Proposed patch attached.
>
> Not meaning to cause you extra work, but some kind of id as the first
> field of each structure would allow a cross-check that there is no
> misconfiguration.

Well, we could add the rmgr_id as the first struct member to both
tables and add an Assert() somewhere in xlog.c.  However, do we really
need that?  These tables are almost immutable, and failure to edit both
simultaneously should be promptly detected.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: split rm_name and rm_desc out of rmgr.c

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> pg_xlogdump needs access to the *_desc functions for each rmgr.  We
> already moved forward quite a bit by splitting those functions out of
> their containing files; so now they are compilable separately.  Good.
> The remaining task is enabling the code to find those functions in the
> first place; currently, the function pointers live in rmgr.c which is
> not compilable by frontend code because it contains pointers to other
> functions.  Hence the attached patch splits RmgrData into two; the names
> and rm_desc functions go into a new file which can be compiled easily by
> frontend.

> Proposed patch attached.

This seems pretty ugly to me.

Couldn't we do something similar to the design for SQL keyword constants,
wherein the actual data is in macros in a header file (providing exactly
one source of truth for each RM) and then various .c files can #include
that after #defining the macro as they need?  See
src/include/parser/kwlist.h and the files that include that.
        regards, tom lane



Re: split rm_name and rm_desc out of rmgr.c

From
Andres Freund
Date:
On 2013-02-04 17:27:39 -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > pg_xlogdump needs access to the *_desc functions for each rmgr.  We
> > already moved forward quite a bit by splitting those functions out of
> > their containing files; so now they are compilable separately.  Good.
> > The remaining task is enabling the code to find those functions in the
> > first place; currently, the function pointers live in rmgr.c which is
> > not compilable by frontend code because it contains pointers to other
> > functions.  Hence the attached patch splits RmgrData into two; the names
> > and rm_desc functions go into a new file which can be compiled easily by
> > frontend.
> 
> > Proposed patch attached.
> 
> This seems pretty ugly to me.
> 
> Couldn't we do something similar to the design for SQL keyword constants,
> wherein the actual data is in macros in a header file (providing exactly
> one source of truth for each RM) and then various .c files can #include
> that after #defining the macro as they need?  See
> src/include/parser/kwlist.h and the files that include that.

My original patch just surrounded the backend-only functions by a macro
which was defined to NULL in #ifdef FRONTEND. I still think thats the
best way to go. People objected to that though...
Other than that I find duplicating the whole table by far the best
approach. Splitting parts off seems to be more complex than warranted
without actually getting rid of duplication.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: split rm_name and rm_desc out of rmgr.c

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > pg_xlogdump needs access to the *_desc functions for each rmgr.  We
> > already moved forward quite a bit by splitting those functions out of
> > their containing files; so now they are compilable separately.  Good.
> > The remaining task is enabling the code to find those functions in the
> > first place; currently, the function pointers live in rmgr.c which is
> > not compilable by frontend code because it contains pointers to other
> > functions.  Hence the attached patch splits RmgrData into two; the names
> > and rm_desc functions go into a new file which can be compiled easily by
> > frontend.
>
> > Proposed patch attached.
>
> This seems pretty ugly to me.
>
> Couldn't we do something similar to the design for SQL keyword constants,
> wherein the actual data is in macros in a header file (providing exactly
> one source of truth for each RM) and then various .c files can #include
> that after #defining the macro as they need?  See
> src/include/parser/kwlist.h and the files that include that.

Meh.  I proposed this months ago and was shot down.  I still like it
better than what I propose here, so I will resurrect it.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: split rm_name and rm_desc out of rmgr.c

From
Alvaro Herrera
Date:
Tom Lane wrote:

> Couldn't we do something similar to the design for SQL keyword constants,
> wherein the actual data is in macros in a header file (providing exactly
> one source of truth for each RM) and then various .c files can #include
> that after #defining the macro as they need?

Here are two patches implementing this idea.  The first one is simpler
and just replaces the table in rmgr.c with an appropriate PG_RMGR
define.

The second one touches rmgr.h as well.  That file currently has a list
of #defines with symbolic rmgr names and their numeric IDs.  The
approach in the second patch is to turn these into "extern const RmgrId"
instead, and use a second inclusion of rmgrlist.h in rmgr.c that assigns
them the values as consts.  This has the disadvantage that the array
size RM_MAX_ID cannot use the symbolic RM_SPGIST_ID value, but instead
it needs a literal "16".  Otherwise the compile complains:
error: variably modified ‘RmgrTable’ at file scope

I am not too sure about that second change.  However, I'm a bit uneasy
about leaving a second list of rmgrs around; since we're creating what
should be the canonical list of rmgrs, it makes sense to reduce the two
copies we have.  Better ideas for that second list are welcome.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: split rm_name and rm_desc out of rmgr.c

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> Couldn't we do something similar to the design for SQL keyword constants,
>> wherein the actual data is in macros in a header file (providing exactly
>> one source of truth for each RM) and then various .c files can #include
>> that after #defining the macro as they need?

> Here are two patches implementing this idea.  The first one is simpler
> and just replaces the table in rmgr.c with an appropriate PG_RMGR
> define.

> The second one touches rmgr.h as well.  That file currently has a list
> of #defines with symbolic rmgr names and their numeric IDs.

Unifying that with this one-source-of-truth seems attractive ...

> The
> approach in the second patch is to turn these into "extern const RmgrId"
> instead, and use a second inclusion of rmgrlist.h in rmgr.c that assigns
> them the values as consts.

... but I don't especially like that implementation, as it will result
in nonzero code bloat and runtime cost due to replacing all those
constants with global-variable references.  Couldn't you instead set it
up as an enum definition?  Something like

#define PG_RMGR(...)   sym = num,

typedef enum {
#include ...RM_NEXT_ID
} RmgrIds;

#define RM_LAST_ID (RM_NEXT_ID-1)

I'm not actually sure that we need the explicit numbers in the macros
if we do this.  That is, we could just have "#define PG_RMGR(...)   sym,"
and say that the order of the entries in rmgrlist.h is what defines
the manager IDs.  The original coding allowed for gaps in the ID list
but I don't see much value in that.
        regards, tom lane



Re: split rm_name and rm_desc out of rmgr.c

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:

> > The
> > approach in the second patch is to turn these into "extern const RmgrId"
> > instead, and use a second inclusion of rmgrlist.h in rmgr.c that assigns
> > them the values as consts.
>
> ... but I don't especially like that implementation, as it will result
> in nonzero code bloat and runtime cost due to replacing all those
> constants with global-variable references.  Couldn't you instead set it
> up as an enum definition?

That seems to work.  I would like to have some way of specifying that
the enum members should be of type RmgrId, but I don't think there's any
way to do that.

Patch attached.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: split rm_name and rm_desc out of rmgr.c

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Patch attached.

Seems like a couple of the comments could use updates:

>    * Note: RM_MAX_ID could be as much as 255 without breaking the XLOG file
>    * format, but we keep it small to minimize the size of RmgrTable[].

This is no longer particularly sensible, since we're no longer making
any provision for wasted RmgrIds.  Perhaps rephrase as "RM_MAX_ID must
fit in RmgrId; widening that type will affect the XLOG file format."

> +  * List of resource manager entries.  Note that order of entries defines the
> +  * numerical values of each rmgr's ID.
> +  *
> +  * Changes to this list possibly need a XLOG_PAGE_MAGIC bump.

Probably also a good idea to state explicitly that new entries should go
at the end to avoid moving the IDs of existing entries.

Works for me otherwise.
        regards, tom lane



Re: split rm_name and rm_desc out of rmgr.c

From
Peter Eisentraut
Date:
On 2/5/13 3:47 PM, Alvaro Herrera wrote:
> Tom Lane wrote:
>> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> 
>>> The
>>> approach in the second patch is to turn these into "extern const RmgrId"
>>> instead, and use a second inclusion of rmgrlist.h in rmgr.c that assigns
>>> them the values as consts.
>>
>> ... but I don't especially like that implementation, as it will result
>> in nonzero code bloat and runtime cost due to replacing all those
>> constants with global-variable references.  Couldn't you instead set it
>> up as an enum definition?
> 
> That seems to work.  I would like to have some way of specifying that
> the enum members should be of type RmgrId, but I don't think there's any
> way to do that.
> 
> Patch attached.

This has broken cpluspluscheck:

./src/include/access/rmgrlist.h:28:8: error: expected constructor, destructor, or type conversion before '(' token




Re: split rm_name and rm_desc out of rmgr.c

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> On 2/5/13 3:47 PM, Alvaro Herrera wrote:
>> Patch attached.

> This has broken cpluspluscheck:

> ./src/include/access/rmgrlist.h:28:8: error: expected constructor, destructor, or type conversion before '(' token

cpluspluscheck has an explicit exclusion for kwlist.h, looks like it
needs one for rmgrlist.h as well, since neither of them are meant to
compile standalone.
        regards, tom lane