Re: Need sql to pull data from terribly architected table - Mailing list pgsql-general

From Steve Litt
Subject Re: Need sql to pull data from terribly architected table
Date
Msg-id 20121024114221.3f8e5392@mydesk
Whole thread Raw
In response to Need sql to pull data from terribly architected table  ("Gauthier, Dave" <dave.gauthier@intel.com>)
Responses Re: Need sql to pull data from terribly architected table  (Chris Angelico <rosuav@gmail.com>)
List pgsql-general
On Tue, 23 Oct 2012 18:52:52 +0000, Gauthier, Dave said:
> Here's the deal...
>
> Instead of architecting and loading a table like...
> create teble foo (col1 text, col2 text, col3 text, col4 text, col5
> text); insert into foo (col1,col2,col3,col4,col5) values
> ('c1',null,'c3','c4',null);
>
> They did this instead...
>
> create table foo (property text, value text);
> insert into foo (property, value) values ('col1','c1'),
> ('col3','c3'), ('col4','c4');
>
> Notice how "col2" and "col5" were simply left out of the table in the
> 2nd model to indicate null.
>
> The question is how to do this model 1 query for model 2...
>
> select col1,col2 from foo where col4='c4' and col5 <> 'xxx';
>
> I know I have to use outer joins to deal with the potential of
> nulls.  But I don't know how to construct this.  I won't list my
> failed attempts (so as not to embarass myself :-))
>
> Thanks in Advance !


Hi Deve,

If it were me, I'd use a computer language like Perl, Python, Ruby, Lua
or Java to retrieve the data, rather than trying to do the whole thing
in SQL. Looking at the way they constructed their table, I'd guess the
intent was to use a language to do the logic, rather than pure SQL.

The way they constructed the table looks to me like they were trying to
comply with the first normal group -- no repeating columns (no arrays).
This would make sense if col1 through col5 were all the same type of
entity -- for instance, each is the name of a child of the couple. By
doing it as key-value pairs, each couple can have as many or as few
children as necessary, rather than reserving five columns for children
and then running into trouble when a six child family comes along.

Also, with the organization they're using, one can make new "columns"
on the fly. Years ago I created a litigation support database
structured partially as key-value pairs (along with a "row number" -- I
don't know how your database got along without a key to show which row
each key-value pair belonged to). Anyway, the keypuncher is punching
data, comes across a brand new type of data (let's say "artist"), so
for this row the keypuncher puts in a key-value pair of "artist=Lady
Gaga". From a practical point of view, data structure could be change
at key entry time, and needn't have been anticipated by the programmer
nor recompiled or reorganized when a new type of data element entered
the requirements.

I'll bet you dollars to donuts if you could speak to the original
programmer, he'd show you a good reason for his data organization, and
he'd also tell you he in no way anticipated that the data would ever be
handled purely by SQL.

Anyway, bottom line, a simple, procedural language with an interface to
Postgres would be a quick and easy way to convert this data to the type
you prefer.

HTH

SteveT

Steve Litt                *  http://www.troubleshooters.com/
                          *  http://twitter.com/stevelitt
Troubleshooting Training  *  Human Performance



pgsql-general by date:

Previous
From: Chris Angelico
Date:
Subject: Re: Plug-pull testing worked, diskchecker.pl failed
Next
From: salah jubeh
Date:
Subject: Re: Postgresql high available solution