On the right tool (was Re: Proper relational database?) - Mailing list pgsql-general

From Andrew Sullivan
Subject On the right tool (was Re: Proper relational database?)
Msg-id 20160424045816.GF32370@crankycanuck.ca
In response to Re: Proper relational database?  (<david@andl.org>)
Responses Re: On the right tool (was Re: Proper relational database?)
List pgsql-general

Hi,

On Sun, Apr 24, 2016 at 12:55:48PM +1000, david@andl.org wrote:
> But there is goodness there, and NoSQL is now just as hard to replace.

Indeed, I wasn't trying to make some point-and-laugh argument about
NoSQL.  I was just observing that, as with many new techniques, some
of the uses haven't really been thought out carefully.  (It doesn't
help that at least one of the early "successes" worked way better in
theory than in practice, and that whole businesses have been taken
down by the failure modes.  "Oooh, can't resolve conflict!  Oh well,
throw it all away!" is not a great way to store critical business
data.)

New technologies are hard.  Some regard Brooks's _The Mythical
Man-Month_ and Spolsky's "Things You Should Never Do, Part I" as
saying different things.  But I think they're in deep agreement on a
key point: understanding why the old approach is there is way harder
than figuring out that old approach; so there's a natural tendency to
replace rather than to understand and build further.

In Brooks, this leads to the communications death, which is one of the
ways that adding more people to a late project makes it later.  In
Spolsky, it yields the straightforward observation that reading code
is harder than writing it.  In both cases, though, the point is that
careful software development management is considerably harder than it
seems.  I think that those two works -- along with _Soul of a New
Machine_ -- impart certain basic things you really need to internalise
to see why so many large software projects are more about people's
egos than about actually making stuff better.  None of them says,
"Don't do new things."  But all militate towards understanding what
you're throwing away before you start work.

In, I think, 2003 or 2004, I read an article in _CACM_[1] that said (in
my reading) that Google proved CAP was true and that we had to get
over ourselves (I'm exaggerating for effect).  As a data guy, I found
this both troubling and influential, and I've thought about it a lot
since.  The thing I found compelling about it was the observation that
Google's approach to consistency was way better than good enough, so
one shouldn't care too much about durability or consistency.  The
thing that bothered me was the obvious counter-examples.  I came to
believe that the point I understood was obviously true in its domain,
and dangerously false in other cases.

In retrospect, it is obviously true that, if you understand your
domain well enough, many data handling techniques could be
appropriate.  But that's also _only_ true if you understand your
domain well enough: applying the wrong techniques to your data can be
seriously harmful, too.  This explains why various NoSQL techniques
are so powerful in some ways and yet often so frustrating to data
people.  It explains why the most successful distributed database ever
is the DNS, which is the wrong tool for nearly every job yet
fabulously successful in its job.  And it's an excellent way to
organise thinking about how to pick the right technology for a given
data situation.  For if you pick the wrong one, you might find you've
left a lot of the value in a data set practically inaccessible.  You
don't need perfect foresight.  But attending a little to what value
there is in your data can yield great dividends.

We shape our tools and then our tools shape us [2].  But in the
software world, we must be more mindful than ever that we understand
our tools -- the shapes that they take and that they make.
Historicism in software is no vice.  It is the path by which we learn
to make new mistakes, as opposed to the same mistake over again.

[1] Darned if I can find the article, but I confess some scepticism that
my original reading was what the authors intended.  Doesn't matter for
these purposes! :)

[2] Apparently, Marshall McLuhan didn't say this; instead, his tribune
John Culkin, SJ said it.  It's still an excellent point, whoever made it.

Best regards,

A

--
Andrew Sullivan
ajs@crankycanuck.ca

