Thread: [HACKERS] pg_waldump's inclusion of backend headers is a mess

[HACKERS] pg_waldump's inclusion of backend headers is a mess

From
Robert Haas
Date:
Dilip complained on another thread about the apparent difficulty of
getting the system to compile after adding a reference to dsa_pointer
to tidbitmap.c:

https://www.postgresql.org/message-id/CAFiTN-u%3DBH_b0QE9FG%2B%2B_Ht6Q%3DF84am%3DJ-rt7fNbQkq3%2BqX3VQ%40mail.gmail.com

I dug into the problem and discovered that pg_waldump is slurping up a
tremendous crapload of backend headers.  It includes gin.h,
gist_private.h, hash_xlog.h, nbtree.h, and spgist.h, which all end up
including amapi.h, which includes genam.h, which includes tidbitmap.h.
When you make tidbitmap.h include utils/dsa.h, it in turn includes
port/atomics.h, which has an #error preventing frontend includes.

There are a number of ways to fix this problem; probably the cheapest
available hack is to stick #ifndef FRONTEND around the additional
stuff getting added to tidbitmap.h.  But that seems to be attacking
the problem from the wrong end.  The problem isn't really that having
tidbitmap.h refer to backend-only stuff is unreasonable; it's that
pg_waldump is skating on thin ice by importing so many backend-only
headers.  It's only a matter of time before some other one of the many
files that it directly or indirectly includes develops a similar
problem.

Therefore, I proposed the attached patch, which splits spgxlog.h out
of spgist_private.h, nbtxlog.h out of nbtree.h, gistxlog.h out of
gist_private.h, and ginxlog.h and ginblock.h out of gin_private.h.
These new header files are included by pg_waldump in lieu of the
"private" versions.  This solves the immediate problem and I suspect
it will head off future problems as well.

Thoughts, comments, objections, better ideas?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess

From
Michael Paquier
Date:
On Tue, Feb 14, 2017 at 12:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Therefore, I proposed the attached patch, which splits spgxlog.h out
> of spgist_private.h, nbtxlog.h out of nbtree.h, gistxlog.h out of
> gist_private.h, and ginxlog.h and ginblock.h out of gin_private.h.
> These new header files are included by pg_waldump in lieu of the
> "private" versions.  This solves the immediate problem and I suspect
> it will head off future problems as well.
>
> Thoughts, comments, objections, better ideas?

+1 for the overall move. You may want to name the new headers
dedicated to WAL records with _xlog.h as suffix though, like
gin_xlog.h instead of ginxlog.h. All the existing RMGRs doing this
separation (heap, brin, hash and generic) use this format.

The patch looks rather sane to me.

--- /dev/null
+++ b/src/include/access/nbtxlog.h
@@ -0,0 +1,255 @@
+/*-------------------------------------------------------------------------
+ *
+ * nbtree.h
+ *      header file for postgres btree xlog routines
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/access/nbxlog.h
+ *
+ *-------------------------------------------------------------------------
+ */
This file name is incorrect, as well as the description.

+/*--------------------------------------------------------------------------
+ * ginxlog.h
+ *       header file for postgres inverted index xlog implementation.
+ *
+ *     Copyright (c) 2006-2017, PostgreSQL Global Development Group
+ *
+ *     src/include/access/gin_private.h
+ *--------------------------------------------------------------------------
+ */
Bip. Error.
-- 
Michael



Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess

From
Fabien COELHO
Date:
> You may want to name the new headers dedicated to WAL records with 
> _xlog.h as suffix though, like gin_xlog.h instead of ginxlog.h.

Should not it be more consistent to use "*_wal.h", after all these efforts 
to move "xlog" to "wal" everywhere?

-- 
Fabien.



Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess

From
Robert Haas
Date:
On Tue, Feb 14, 2017 at 12:23 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Tue, Feb 14, 2017 at 12:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Therefore, I proposed the attached patch, which splits spgxlog.h out
>> of spgist_private.h, nbtxlog.h out of nbtree.h, gistxlog.h out of
>> gist_private.h, and ginxlog.h and ginblock.h out of gin_private.h.
>> These new header files are included by pg_waldump in lieu of the
>> "private" versions.  This solves the immediate problem and I suspect
>> it will head off future problems as well.
>>
>> Thoughts, comments, objections, better ideas?
>
> +1 for the overall move. You may want to name the new headers
> dedicated to WAL records with _xlog.h as suffix though, like
> gin_xlog.h instead of ginxlog.h. All the existing RMGRs doing this
> separation (heap, brin, hash and generic) use this format.

I thought about that but it seemed more important to be consistent
with the .c files.  generic_xlog.c and brin_xlog.c have an underscore,
but none of the others do.  I thought it would be too strange to have
e.g. ginxlog.c and gin_xlog.h.

> The patch looks rather sane to me.

Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess

From
Robert Haas
Date:
On Tue, Feb 14, 2017 at 2:45 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> You may want to name the new headers dedicated to WAL records with _xlog.h
>> as suffix though, like gin_xlog.h instead of ginxlog.h.
>
> Should not it be more consistent to use "*_wal.h", after all these efforts
> to move "xlog" to "wal" everywhere?

I believe that what was agreed was to eliminate "xlog" from
user-facing parts of the system, not internal details.  If we're going
to eliminate it from the internals, we should do that in a systematic
way, not just in the parts that happen to be getting changed from by
some other patch.  But personally I think that would be more trouble
than it's worth.  It would severely complicate future back-patching --
even more than what we've done already -- for not a whole lot of gain.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess

From
Fabien COELHO
Date:
> I believe that what was agreed was to eliminate "xlog" from
> user-facing parts of the system, not internal details. [...]

Ok, I get it. So xlog stays xlog if it is hidden from the user, eg file 
names and probably unexposed functions names, structures or whatever, but 
everything else has been renamed. Fine.

-- 
Fabien.



Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess

From
Robert Haas
Date:
On Tue, Feb 14, 2017 at 8:15 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> I believe that what was agreed was to eliminate "xlog" from
>> user-facing parts of the system, not internal details. [...]
> Ok, I get it. So xlog stays xlog if it is hidden from the user, eg file
> names and probably unexposed functions names, structures or whatever, but
> everything else has been renamed. Fine.

Yeah, generally we didn't rename source files, functions, structures,
or anything like that.  It's still assumed that, if you are developing
core or extension code against PostgreSQL, you know that "xlog" means
"WAL".  But hopefully users won't have to know that any more, once we
get past the inevitable migration pain caused by the changes in v10.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess

From
Andres Freund
Date:
Hi,

On 2017-02-13 22:42:00 -0500, Robert Haas wrote:
> I dug into the problem and discovered that pg_waldump is slurping up a
> tremendous crapload of backend headers.  It includes gin.h,
> gist_private.h, hash_xlog.h, nbtree.h, and spgist.h, which all end up
> including amapi.h, which includes genam.h, which includes tidbitmap.h.

Right. Hard to avoid with the current organization - IIRC Alvaro, I (and
others?) had talked about doing more agressive reorganization first, but
the patch was already big enough...


> When you make tidbitmap.h include utils/dsa.h, it in turn includes
> port/atomics.h, which has an #error preventing frontend includes

Has to, because otherwise there's undefined variable/symbol references
on some, but not all, platforms.


> There are a number of ways to fix this problem; probably the cheapest
> available hack is to stick #ifndef FRONTEND around the additional
> stuff getting added to tidbitmap.h.  But that seems to be attacking
> the problem from the wrong end.

Agreed.


> Therefore, I proposed the attached patch, which splits spgxlog.h out
> of spgist_private.h, nbtxlog.h out of nbtree.h, gistxlog.h out of
> gist_private.h, and ginxlog.h and ginblock.h out of gin_private.h.
> These new header files are included by pg_waldump in lieu of the
> "private" versions.  This solves the immediate problem and I suspect
> it will head off future problems as well.
> 
> Thoughts, comments, objections, better ideas?

No better ideas.  I'm a bit concerned about declarations needed both by
normal and xlog related routines, but I guess that can be solved by a
third header as you did.


> +++ b/src/include/access/nbtxlog.h
> @@ -0,0 +1,255 @@
> +/*-------------------------------------------------------------------------
> + *
> + * nbtree.h
> + *      header file for postgres btree xlog routines

Wrong file name.


Greetings,

Andres Freund



Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess

From
Robert Haas
Date:
On Tue, Feb 14, 2017 at 2:54 PM, Andres Freund <andres@anarazel.de> wrote:
>> Thoughts, comments, objections, better ideas?
>
> No better ideas.  I'm a bit concerned about declarations needed both by
> normal and xlog related routines, but I guess that can be solved by a
> third header as you did.

Yeah, that doesn't seem bad to me.  I think it's actually fairly
unfortunate that we've just shoved declarations from 3 or 4 or 5
different source files into a single header in many of these cases.  I
think it leads to not thinking clearly about what the dependencies
between the different source files in the index AM stuff is, and
certainly there seems to be some room for improvement there,
especially with regard to gist and gin.  Sorting that out is a bigger
project than I'm prepared to undertake right now, but I think this is
probably a step in the right direction.

>> +++ b/src/include/access/nbtxlog.h
>> @@ -0,0 +1,255 @@
>> +/*-------------------------------------------------------------------------
>> + *
>> + * nbtree.h
>> + *     header file for postgres btree xlog routines
>
> Wrong file name.

Thanks to you and Michael for the reviews.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pg_waldump's inclusion of backend headers is a mess

From
Alvaro Herrera
Date:
Robert Haas wrote:
> On Tue, Feb 14, 2017 at 2:54 PM, Andres Freund <andres@anarazel.de> wrote:
> >> Thoughts, comments, objections, better ideas?
> >
> > No better ideas.  I'm a bit concerned about declarations needed both by
> > normal and xlog related routines, but I guess that can be solved by a
> > third header as you did.
> 
> Yeah, that doesn't seem bad to me.  I think it's actually fairly
> unfortunate that we've just shoved declarations from 3 or 4 or 5
> different source files into a single header in many of these cases.  I
> think it leads to not thinking clearly about what the dependencies
> between the different source files in the index AM stuff is, and
> certainly there seems to be some room for improvement there,
> especially with regard to gist and gin.

Agreed -- on a very quick read, I like the way you split the GIN
headers.

I don't think the concern is really the number of source files involved,
because I think in some places we're bad at splitting source files
sensibly.

> Sorting that out is a bigger
> project than I'm prepared to undertake right now, but I think this is
> probably a step in the right direction.

Yes, I think so too.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services