Re: Large C files - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Large C files
Msg-id CAEYLb_WqdfDKYEYT4CVAx4Pss3QQWb+6BZkxtNDeAVZ7+GkkUA@mail.gmail.com
In response to Re: Large C files  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 23 September 2011 15:46, Robert Haas <robertmhaas@gmail.com> wrote:
> I'm not opposed to adding something like this, but I think it needs to
> either be tied into the actual running of the script, or have a lot
> more documentation than it does now, or both.  I am possibly stupid,
> but I can't understand from reading the script (or, honestly, the
> thread) exactly what kind of pgrminclude-induced errors this is
> protecting against;

The basic idea is simple. If removing a header silently alters the
behaviour of an object file, there is a high likelihood that it will
also alter the symbol table for the same object file: in particular,
the "value" of function symbols (their local offset in the object
file), which relates to the number of instructions in each function.
Yes, that is imperfect, but it's better than nothing, and intuitively
I think that the only things that it won't catch in practice are
completely inorganic bugs (i.e. things done for the express purpose of
proving that it can be broken). Thinking about it now though, I have
to wonder if I could have done a better job with "objdump -td".
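
A minimal sketch of the symbol-table comparison described above. The
nm-diff script itself is not shown in this message, so this is an
illustration only, not that script: the function names (parse_nm,
diff_symbols, nm_output) are hypothetical, and it assumes the default
`nm` output format of "<hex value> <type> <name>" per defined symbol.

```python
import subprocess


def parse_nm(text):
    """Map each defined symbol to (value, type) from default `nm` output."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # "<hex value> <type> <name>"
            value, kind, name = parts
            symbols[name] = (int(value, 16), kind)
        # Two-field lines such as "U palloc" are undefined symbols; they
        # carry no value, so this sketch ignores them.
    return symbols


def diff_symbols(old, new):
    """Return (name, old_entry, new_entry) for every symbol that differs."""
    changed = []
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changed.append((name, old.get(name), new.get(name)))
    return changed


def nm_output(path):
    """Run `nm` on an object file and return its text output."""
    result = subprocess.run(
        ["nm", path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

With this, diff_symbols(parse_nm(nm_output("xlog_old.o")),
parse_nm(nm_output("xlog.o"))) returning an empty list would mean no
symbol moved or changed type. A changed offset only says that something
in the generated code differs, not what.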

> but even if we clarify that, it seems like it
> would be a lot of work to run it manually on all the files that might
> be affected by a pgrminclude run, unless we can somehow automate that.

I would have automated it if anyone had expressed any interest in the
basic idea - it might be an over-reaction to the problem. I'm not
sure. I'd have had it detect which object files might have been
affected (through directly including the header, or indirectly
including it by proxy). It could rename them so that, for example,
xlog.o became xlog_old.o. Then, you make your changes,
rebuild, and run the program again in a different mode. It notices the
*_old.o files, and runs nm-diff on them.
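
The two-mode workflow above could be sketched roughly as follows. This
automation was never written, so everything here is an assumption: the
helper names (old_name, snapshot, find_pairs) are hypothetical, and the
caller is assumed to supply the list of affected object files, since
detecting them from header dependencies is left out.

```python
from pathlib import Path


def old_name(obj):
    """xlog.o -> xlog_old.o (naming taken from the example above)."""
    return obj.with_name(obj.stem + "_old" + obj.suffix)


def snapshot(objects):
    """First mode: set aside the current object files before the change."""
    saved = []
    for obj in map(Path, objects):
        target = old_name(obj)
        obj.rename(target)
        saved.append(target)
    return saved


def find_pairs(root):
    """Second mode: pair each *_old.o with its rebuilt counterpart."""
    pairs = []
    for old in sorted(Path(root).rglob("*_old.o")):
        new = old.with_name(old.stem[: -len("_old")] + old.suffix)
        if new.exists():  # skip files that have not been rebuilt yet
            pairs.append((old, new))
    return pairs
```

Each pair returned by find_pairs would then be handed to whatever does
the symbol-table comparison (nm-diff in this thread).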

> I'm also curious to see how much more fallout we're going to see from
> that run.  We had a few glitches when it was first done, but it didn't
> seem like they were really all that bad.  It might be that we'd be
> better off running pgrminclude a lot *more* often (like once a cycle,
> or even after every CommitFest), because the scope of the changes
> would then be far smaller and we wouldn't be dealing with 5 years of
> accumulated cruft all at once; we'd also get a lot more experience
> with what works or does not work with the script, which might lead to
> improvements in that script on a less-than-geologic time scale.

Fair point. I'm a little busy with other things right now, but I'll
revisit it soon.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

