Re: Research and EAV models - Mailing list pgsql-general

From Peter Hunsberger
Subject Re: Research and EAV models
Date
Msg-id cc159a4a0910231524x405fb3o2d3df7c8f53d9834@mail.gmail.com
Whole thread Raw
In response to Re: Research and EAV models  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Research and EAV models
List pgsql-general
On Fri, Oct 23, 2009 at 5:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Leif B. Kristensen" <leif@solumslekt.org> writes:
>> It seems like there are two camps considering EAV models. On the one
>> hand, there are researchers who think that EAV is a great way to meet
>> their objectives. On the other hand, there are the "business" guys who
>> thnk that EAV is crap.
>
> Well, no, it's not that EAV is crap.  It's that EAV is strong evidence
> that you're using the wrong tool for the job.  If a SQL database is
> actually a good fit for your application, then it should be possible to
> extract a stronger schema for your data.  If you cannot, then you
> probably should be storing your data in something else.  Otherwise
> you'll spend untold man-hours beating your head against assorted walls
> while you try to persuade the SQL database to do things it was never
> meant for, and coping with performance issues because the cases you need
> are not optimized nor indeed optimizable.  (I can just see that other
> guy trying to search on one of his "polymorphic" columns :-(.)

I can certainly see where Tom is coming from on this and I tend to
agree in general.  However, the reality of the research world is never
that clear cut. In particular, what you find is that you have some
split where some percentage of the domain is well understood and can
be placed in a conventional schema and some portion is not so well
understood and requires something more flexible.  You could try and
split the data between two databases but that itself is very
problematic.

My take on this, for the research world, is to not go pure EAV, but
rather normalize by some more generic concepts within the domain.  Eg.
"measurement", or "evaluation", etc. You might ultimately end up with
a sort of EAV model, but the "V" portion is strongly typed within the
database and you're not trying to cast a string into 20 conventional
data types. This still requires rigorous metadata management on the EA
side of the EAV model, but you can tackle that in many ways.

> SQL isn't the be-all and end-all of data storage.  It does relational
> stuff well, and other stuff poorly.

You can build variations on EAV that are closer to a regular
relational schema.  These don't necessarily work well or poorly but
often, at least in the research world, the middle ground is good
enough.  You are after all, talking about people who spit out MySQL
databases at the drop of a hat....

--
Peter Hunsberger

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Research and EAV models
Next
From: Karsten Hilbert
Date:
Subject: Re: Research and EAV models