Re: Woo hoo ... a whole new set of compiler headaches!! - Mailing list pgsql-hackers

From Dave Held
Subject Re: Woo hoo ... a whole new set of compiler headaches!!
Date
Msg-id 49E94D0CFCD4DB43AFBA928DDD20C8F9026184D0@asg002.asg.local
Responses Re: Woo hoo ... a whole new set of compiler headaches!!  (Neil Conway <neilc@samurai.com>)
List pgsql-hackers
> -----Original Message-----
> From: Dann Corbit [mailto:DCorbit@connx.com]
> Sent: Friday, April 22, 2005 1:08 PM
> To: Andrew Dunstan; Dave Held
> Cc: pgsql-hackers@postgresql.org
> Subject: RE: [HACKERS] Woo hoo ... a whole new set of compiler
> headaches!!
>
> > From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-
> > owner@postgresql.org] On Behalf Of Andrew Dunstan
> >
> > Dave Held wrote:
> >
> > >I see the smiley, but moving to C++ isn't just about switching
> > >to the latest fad language.
> >
> > No, it's about moving to the fad language about 2
> > generations back ...

Except that C++ is hardly a "fad language".  Most estimates place
the top 3 languages by number of programmers as C++, Java, and C,
with C++ and Java switching positions, and C lagging behind by a
decent margin.  And it's been this way for quite a while.

> Language wars are about as much fun as operating system wars.
> C and C++ are both nice languages.

My intent was not to start a language jihad.  I don't think C is a
bad language.  I just think C++ is better.

> > [...]
> > Unless you did a major rewrite it's hard to see any great
> > advantages.

There don't need to be great advantages.  Just enough to justify
the effort.  Take casts, for instance: C uses a single syntax for
all types of cast.  C++ breaks casting down into static_cast<>
(for converting to a type for which the opposite cast would be an
implicit conversion), dynamic_cast<> (for polymorphic types, which
don't exist in the C++ sense in the Postgres codebase, by definition),
const_cast<> (for casting away constness, or adding it), and
reinterpret_cast<> (for dangerous bit twiddling where you'd better
know the exact layout for each platform).  Probably most of the
casts in the Postgres codebase would be converted to static_cast<>.
And if the types were upgraded over time, many of those casts could
probably go away.  The dangerous casts would remain marked as
reinterpret_cast<>, and that would serve to highlight the portions
of the code that are probably platform-dependent and that probably
need inspection when porting to a new platform.

While the benefits of using the C++-style casts would only be
maximized if they were used everywhere, you still get incremental
benefit from converting them a few at a time.
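
A minimal sketch of the cast categories described above, assuming a
made-up Node struct and helper names (none of them come from the
Postgres source).  dynamic_cast<> is left out because it needs
polymorphic types:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical node type in the style of a C struct; illustrative only.
struct Node {
    int tag;
    int value;
};

// static_cast: void* -> Node* is the reverse of an implicit conversion.
// The C-style equivalent would be (Node *) raw.
inline Node *as_node(void *raw) {
    return static_cast<Node *>(raw);
}

// Hypothetical legacy API that takes a non-const pointer but does not
// modify its argument.
inline std::size_t legacy_strlen(char *s) { return std::strlen(s); }

// const_cast: the only cast that may remove constness, so it is easy
// to search for.
inline std::size_t length_of(const char *s) {
    return legacy_strlen(const_cast<char *>(s));
}

// reinterpret_cast: view an object as raw bytes.  The result depends on
// the platform's byte order, which is exactly why the cast is worth
// flagging for review when porting.
inline unsigned char first_byte(const unsigned int &word) {
    return *reinterpret_cast<const unsigned char *>(&word);
}
```

Searching the source for reinterpret_cast then gives a rough list of
the spots that deserve a second look on a new platform.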

Consider inline functions.  In C89, you have to implement them as
macros, which eliminates your type safety.  In C++, you can get
both type safety and performance.  Take a concrete example: qsort().
In C, you must pass a function pointer to use this function, and
that function pointer gets dereferenced every time qsort() needs to
do a comparison.  That's a lot of overhead that is eliminated in
C++'s sort() function, which accepts a comparison functor that
can and often does get inlined.
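
A rough side-by-side of the two approaches, as a sketch only.  The
names compare_ints, IntLess, sort_with_qsort and sort_with_functor are
invented for illustration, and whether the functor actually gets
inlined depends on the compiler and optimization level:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// C style: qsort() only sees a function pointer, so each comparison is
// normally an indirect call, and the arguments lose their type.
int compare_ints(const void *a, const void *b) {
    int x = *static_cast<const int *>(a);
    int y = *static_cast<const int *>(b);
    return (x > y) - (x < y);
}

inline void sort_with_qsort(std::vector<int> &v) {
    if (!v.empty())
        std::qsort(&v[0], v.size(), sizeof(int), compare_ints);
}

// C++ style: the comparator is a type, so the body of operator() is
// visible where std::sort is instantiated and can be inlined there.
struct IntLess {
    bool operator()(int a, int b) const { return a < b; }
};

inline void sort_with_functor(std::vector<int> &v) {
    std::sort(v.begin(), v.end(), IntLess());
}
```

Both functions produce the same ordering; the difference is that the
second one keeps the element type and gives the optimizer more to
work with.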

> > There are over 600,000 lines of code in Postgres by my rough
> > count. The potential rewrite effort is enormous. A thorough
> > job would probably consume a release cycle just on its own.
>
> You could use C++ as "a better C" with very little effort
> (but there are C++ keywords sprinkled here and there, so it
> would be a good month of work for somebody).

My point exactly.  It's *not* a task that has to be tackled all
at once.  Once the whole codebase compiles as C++, converting it
into a C++ style can be done quite incrementally, with benefits
accruing with each update.

> [...]
> > On the downside, some of us (including me) have much more
> > experience in and ease with writing C than C++. I could
> > certainly do it - I write plenty of Java, so OO isn't a closed
> > book to me, far from it - but ramping up in it would take me
> > at least some effort. I bet I'm not alone in that.
>
> This is the crux of the matter.  You will certainly not be alone
> here.  I (personally) prefer C++ to C, but I am comfortable in
> either language.  However, if you have a team of 100 C programmers
> and a huge C project, it is a terrible mistake to ask them to use
> C++.

I disagree.  The C programmers could learn C++ rules one at a time.
The first rule would simply be to not use C++ keywords as
identifiers.  That is really the minimum necessary to write C style
code in a C++ program.  The next might be to replace macro constants
with const ints.  The next might be to replace C-style casts with
C++-style.  There is really no need to throw the whole book at
the developer community all at once.  It might take a year or two
to get the codebase into idiomatic C++, but the developers would
have learned C++ quite easily without really noticing it.  I would
certainly not suggest something radical like replacing hand-rolled
containers with standard library equivalents.  *That's* the kind of
rewrite that should give any coder nightmares.
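
To make the first two rules concrete, here is a small sketch.
MAX_ITEMS, ItemBuffer and append_item are hypothetical names chosen
for the example, not anything in the Postgres tree:

```cpp
#include <cassert>

// Before (C style): the preprocessor substitutes text, so the name has
// no type and no scope.
// #define MAX_ITEMS 64

// After: a typed constant that obeys normal scoping rules.
const int MAX_ITEMS = 64;

// A const int with a constant initializer can still be an array bound.
struct ItemBuffer {
    int items[MAX_ITEMS];
    int count;
};

// Rule one in practice: a parameter that was called "new" in the C
// code has to be renamed, because "new" is a keyword in C++.
inline int append_item(ItemBuffer *buf, int new_item) {
    if (buf->count >= MAX_ITEMS)
        return -1;                       // buffer is full
    buf->items[buf->count++] = new_item;
    return buf->count;                   // number of items now stored
}
```

Nothing else about the code has to change for these two steps; it
still reads like C.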

Even OOP-style encapsulation could be done incrementally.  You take
a few fields of some struct, make them private, add accessor
functions, and update the references.  You don't have to hide all
the data all at once.  I know, because I've upgraded lots of C
code to C++, and it's not nearly as hard as the typical C
programmer thinks it is.  To make it all idiomatic C++ would
certainly require a large effort.  But you don't need purely
idiomatic C++ to gain most of the benefits of the language.
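
One intermediate step of that kind of encapsulation might look like
the sketch below.  Buffer and its members are made up for
illustration; the point is only that one field can be hidden while
the rest stay public:

```cpp
#include <cassert>

// Hypothetical struct, partway through the conversion: "length" has
// been made private, while "flags" is still public so existing code
// that touches it compiles unchanged.
struct Buffer {
    int flags;                           // not yet encapsulated

    Buffer() : flags(0), length_(0) {}

    int length() const { return length_; }

    // The setter can enforce an invariant that direct field access
    // could not: the length is never negative.
    bool set_length(int n) {
        if (n < 0)
            return false;
        length_ = n;
        return true;
    }

private:
    int length_;
};
```

After this step the compiler reports every remaining direct use of the
hidden field, which gives a to-do list for updating the references.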

> > So this would not be cost-free - very far from it.
>
> Here, we clearly agree.  As a project scales larger and larger the
> benefits of C++ loom greater and greater.  When your project
> consists of millions of lines of code, then C++ is a much better
> choice.  But I don't think PostgreSQL is going to scale to titanic
> size any time soon. If it were, I would suggest the conversion, no
> matter how painful.

And this is where we disagree.  I think C++ offers tangible benefits
at all scales.  The conversion is not free, but contrary to what most
people think, its cost doesn't have to be paid all at once.  It can be
paid in very small installments, and further change can even be more
or less abandoned if the team felt that it wasn't returning on the
investment.
There really is no other language path that offers that kind of
gradual transition.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129

