Re: autogenerating headers & bki stuff - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: autogenerating headers & bki stuff |
Date | |
Msg-id | 603c8f070907251317s20c929e3td4e574d38d6a9364@mail.gmail.com Whole thread Raw |
In response to | Re: autogenerating headers & bki stuff (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: autogenerating headers & bki stuff
Re: autogenerating headers & bki stuff |
List | pgsql-hackers |
On Sat, Jul 25, 2009 at 10:56 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Sat, Jul 25, 2009 at 3:21 AM, Peter Eisentraut<peter_e@gmx.net> wrote: >>> I think a less invasive change would be to include anum.h into all the >>> catalog/pg_*.h headers, so that the external interface stays the same. > >> Gah. I wish a toplevel make would build "contrib". > >> Anyway, yeah, we could do that. The downsides to that approach are: > > I didn't realize this change was intending to throw all the Anum_ > constants into a single header file. I am strongly against that > on namespace pollution grounds, I don't really understand this objection. The reason why namespace pollution is bad is because there's a risk that someone might be using one of the names used for some other purpose, but the chances that someone who is using a Anum_pg_* or Natts_pg_* constant also needs a similarly named constant for some purpose other than referencing the PostgreSQL catalogs seems so as nearly zero as makes no difference. The hypothetical scenario in which this is a problem goes something like this: someone is counting on the fact that if they include "catalog/pg_foo.h", then Anum_pg_foo_* and Natts_pg_foo will be defined appropriately for reference to PostgreSQL backend catalogs, but they are also counting on the fact that Anum_pg_bar_* and Natts_pg_bar (for some value of bar that collides with a system catalog name) are not defined and that they can use those constants for their own internal purposes. When they port their code to PG 8.5, they are forced into changing the naming of those constants, because it's no longer possible to just get the pg_foo constants without the pg_bar constants. If anyone is really doing this, I submit that it's a horribly bad idea and they ought to stop right away whether this patch gets committed or not. > quite aside from the massive #include > restructuring it'd require. This is all done in the patch (with the exception of a handful of loose ends that Peter found in his review) and I don't think it's all that massive. > And then there's the fact that any change > in such a file would force rebuild of just about the entire backend. It requires a rebuild of 56 of 547 files '*.c' files in src/backend, which is to say 10.2% of the backend. Also, the system is set up in such a way that the timestamp on catalog/anum.h changes only when its contents actually change, and dependencies are not rebuilt otherwise. So basically it'll happen when someone adds an attribute to, or removes one from, a system catalog: the fact that the .h file was updated in some other way is not sufficient. > I do not see any virtue in autogenerating the Anum_ constants anyway. > Yeah, manually updating them is a bit of a pain, but it's only a tiny > part of the work that's normally involved in changing a system catalog. Well, I'd like to work on fixing some of the other problems too, but this seems like a good place to start. Currently, if there are two uncommitted patches that make changes to the system catalog, whichever is committed first is 100% guaranteed to conflict with each other, and the resolution is typically painful. Of course, fixing the Anum and Natts declarations does not come close to fixing this problem: for catalogs that are initialized with any data at bootstrap time, the DATA() lines are a much bigger issue, but fixing that is going to require a bigger hammer than can be put in place with one patch. I do think this is a pretty good foundation on which to build, though. > In any case, practically all of the benefit involved could be gotten > by just not having to mess with the numerical values of the individual > constants. Which we could do by setting them up as enums instead of > macros, along the lines of > http://archives.postgresql.org/pgsql-committers/2008-05/msg00080.php I'd certainly be willing to concede that some of the benefit could be gotten that way, but I'm not sure I agree with "practically all". The benefits of this patch as I see them are: (1) to reduce the number of places where a catalog change creates a merge conflict, and (2) to eliminate the possibility of human error in setting up the Anum and Natts declarations. The fact that I found a case where this had been done inconsistently in pg_listener (and no one noticed for 10 years) provides that this is not an entirely hypothetical possibility even for committed code, and I've definitely screwed it up a few times in my own tree, too. Replacing the declarations with enums would make the merge conflicts involve fewer lines and maybe slightly simplify the manual updating process, but it won't completely solve either problem. ...Robert
pgsql-hackers by date: