Thread: Need sql to pull data from terribly architected table

Need sql to pull data from terribly architected table

From
"Gauthier, Dave"
Date:

Here's the deal...

 

Instead of architecting and loading a table like...  

create teble foo (col1 text, col2 text, col3 text, col4 text, col5 text);  

insert into foo (col1,col2,col3,col4,col5) values ('c1',null,'c3','c4',null);

 

They did this instead...

 

create table foo (property text, value text);

insert into foo (property, value) values ('col1','c1'), ('col3','c3'), ('col4','c4');

 

Notice how "col2" and "col5" were simply left out of the table in the 2nd model to indicate null.

 

The question is how to do this model 1 query for model 2...

 

select col1,col2 from foo where col4='c4' and col5 <> 'xxx';

 

I know I have to use outer joins to deal with the potential of nulls.  But I don't know how to construct this.  I won't list my failed attempts (so as not to embarass myself :-))

 

Thanks in Advance !

 

 

Re: Need sql to pull data from terribly architected table

From
Chris Curvey
Date:

On Tue, Oct 23, 2012 at 2:52 PM, Gauthier, Dave <dave.gauthier@intel.com> wrote:

Here's the deal...

 

Instead of architecting and loading a table like...  

create teble foo (col1 text, col2 text, col3 text, col4 text, col5 text);  

insert into foo (col1,col2,col3,col4,col5) values ('c1',null,'c3','c4',null);

 

They did this instead...

 

create table foo (property text, value text);

insert into foo (property, value) values ('col1','c1'), ('col3','c3'), ('col4','c4');

 

Notice how "col2" and "col5" were simply left out of the table in the 2nd model to indicate null.

 

The question is how to do this model 1 query for model 2...

 

select col1,col2 from foo where col4='c4' and col5 <> 'xxx';

 

I know I have to use outer joins to deal with the potential of nulls.  But I don't know how to construct this.  I won't list my failed attempts (so as not to embarass myself :-))

 

Thanks in Advance !

 

 


assuming there is some other column (I'll call it "id") that tells you which rows in foo go together, then this is a two-stepper:

1) Get the list of all the ids

create temporary table my_ids (id integer);

insert into my_ids (id)
select distinct id from foo;

2) Now go back and get the values

select my_ids.id 
, c1.value as col1
, c2.value as col2
, c3.value as col3
, c4.value as col4
, c5.value as col5
from my_ids
left join foo c1 on my_ids.id = c1.id
left join foo c2 on my_ids.id = c2.id
left join foo c3 on my_ids.id = c3.id
left join foo c4 on my_ids.id = c4.id
left join foo c5 on my_ids.id = c5.id


--
e-Mail is the equivalent of a postcard written in pencil.  This message may not have been sent by me, or intended for you.  It may have been read or even modified while in transit.  e-Mail disclaimers have the same force in law as a note passed in study hall.  If your corporate attorney says that you need an disclaimer in your signature, you need a new corporate attorney.

Re: Need sql to pull data from terribly architected table

From
"David Johnston"
Date:

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Gauthier, Dave
Sent: Tuesday, October 23, 2012 2:53 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Need sql to pull data from terribly architected table

 

Here's the deal...

 

Instead of architecting and loading a table like...  

create teble foo (col1 text, col2 text, col3 text, col4 text, col5 text);  

insert into foo (col1,col2,col3,col4,col5) values ('c1',null,'c3','c4',null);

 

They did this instead...

 

create table foo (property text, value text);

insert into foo (property, value) values ('col1','c1'), ('col3','c3'), ('col4','c4');

 

Notice how "col2" and "col5" were simply left out of the table in the 2nd model to indicate null.

 

The question is how to do this model 1 query for model 2...

 

select col1,col2 from foo where col4='c4' and col5 <> 'xxx';

 

I know I have to use outer joins to deal with the potential of nulls.  But I don't know how to construct this.  I won't list my failed attempts (so as not to embarass myself :-))

 

Thanks in Advance !

 

 

WITH remap (

SELECT id_field_not_shown_above, v1, v2, v3, v4, v5

FROM (SELECT id_field_not_shown_above FROM foo) master

LEFT JOIN (SELECT id_field_not_shown_above, v1 FROM foo WHERE property = ‘col1’) r1 USING (id_field_not_shown_above)

LEFT JOIN (SELECT id_field_not_shown_above, v2 FROM foo WHERE property = ‘col2’) r2 USING (id_field_not_shown_above)

LEFT JOIN (SELECT id_field_not_shown_above, v3 FROM foo WHERE property = ‘col3’) r3 USING (id_field_not_shown_above)

LEFT JOIN (SELECT id_field_not_shown_above, v4 FROM foo WHERE property = ‘col4’) r4 USING (id_field_not_shown_above)

LEFT JOIN (SELECT id_field_not_shown_above, v5 FROM foo WHERE property = ‘col5’) r5 USING (id_field_not_shown_above)

)

SELCET * FROM remap

 

David J.

 

Re: Need sql to pull data from terribly architected table

From
Richard Broersma
Date:
On Tue, Oct 23, 2012 at 12:06 PM, Chris Curvey <chris@chriscurvey.com> wrote:
select my_ids.id 
, c1.value as col1
, c2.value as col2
, c3.value as col3
, c4.value as col4
, c5.value as col5
from my_ids
left join foo c1 on my_ids.id = c1.id
left join foo c2 on my_ids.id = c2.id
left join foo c3 on my_ids.id = c3.id
left join foo c4 on my_ids.id = c4.id
left join foo c5 on my_ids.id = c5.id

How about:

SELECT my_ids.id, ARRAY_AGG( ( property, valve ) order by property)
  FROM my_ids
ORDER BY id;


--
Regards,
Richard Broersma Jr.

Re: Need sql to pull data from terribly architected table

From
Richard Broersma
Date:


On Tue, Oct 23, 2012 at 12:21 PM, Richard Broersma <richard.broersma@gmail.com> wrote:
On Tue, Oct 23, 2012 at 12:06 PM, Chris Curvey <chris@chriscurvey.com> wrote:
select my_ids.id 
, c1.value as col1
, c2.value as col2
, c3.value as col3
, c4.value as col4
, c5.value as col5
from my_ids
left join foo c1 on my_ids.id = c1.id
left join foo c2 on my_ids.id = c2.id
left join foo c3 on my_ids.id = c3.id
left join foo c4 on my_ids.id = c4.id
left join foo c5 on my_ids.id = c5.id

How about:

oops - I had some malformed air code
 
SELECT my_ids.id, ARRAY_AGG( ( property, value ) order by (property, value) )
  FROM my_ids
   GROUP BY id
ORDER BY id;


--
Regards,
Richard Broersma Jr.



--
Regards,
Richard Broersma Jr.

Re: Need sql to pull data from terribly architected table

From
"Gauthier, Dave"
Date:

Thanks for the answers.  But I also have a predicate...

 

select col1,col2 from foo where col4='c4' and col5 <> 'xxx';

 

How is that done?

 

From: Richard Broersma [mailto:richard.broersma@gmail.com]
Sent: Tuesday, October 23, 2012 3:24 PM
To: chris@chriscurvey.com
Cc: Gauthier, Dave; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Need sql to pull data from terribly architected table

 

 

On Tue, Oct 23, 2012 at 12:21 PM, Richard Broersma <richard.broersma@gmail.com> wrote:

On Tue, Oct 23, 2012 at 12:06 PM, Chris Curvey <chris@chriscurvey.com> wrote:

select my_ids.id 

, c1.value as col1

, c2.value as col2

, c3.value as col3

, c4.value as col4

, c5.value as col5

from my_ids

left join foo c1 on my_ids.id = c1.id

left join foo c2 on my_ids.id = c2.id

left join foo c3 on my_ids.id = c3.id

left join foo c4 on my_ids.id = c4.id

left join foo c5 on my_ids.id = c5.id


How about:

oops - I had some malformed air code
 

SELECT my_ids.id, ARRAY_AGG( ( property, value ) order by (property, value) )
  FROM my_ids

   GROUP BY id

ORDER BY id;


--
Regards,
Richard Broersma Jr.




--
Regards,
Richard Broersma Jr.

Re: Need sql to pull data from terribly architected table

From
"David Johnston"
Date:

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Richard Broersma
Sent: Tuesday, October 23, 2012 3:24 PM
To: chris@chriscurvey.com
Cc: Gauthier, Dave; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Need sql to pull data from terribly architected table

 

 

On Tue, Oct 23, 2012 at 12:21 PM, Richard Broersma <richard.broersma@gmail.com> wrote:

On Tue, Oct 23, 2012 at 12:06 PM, Chris Curvey <chris@chriscurvey.com> wrote:

select my_ids.id 

, c1.value as col1

, c2.value as col2

, c3.value as col3

, c4.value as col4

, c5.value as col5

from my_ids

left join foo c1 on my_ids.id = c1.id

left join foo c2 on my_ids.id = c2.id

left join foo c3 on my_ids.id = c3.id

left join foo c4 on my_ids.id = c4.id

left join foo c5 on my_ids.id = c5.id


How about:

oops - I had some malformed air code
 

SELECT my_ids.id, ARRAY_AGG( ( property, value ) order by (property, value) )
  FROM my_ids

   GROUP BY id

ORDER BY id;


--
Regards,
Richard Broersma Jr.




--
Regards,
Richard Broersma Jr.

 

 

The main problem with this code is that the length of the array varies between rows depending on which attributes are missing entries.  You would want to generate dummy records for any missing attributes and then apply the ARRAY_AGG.

 

David J.

 

Re: Need sql to pull data from terribly architected table

From
"David Johnston"
Date:

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Gauthier, Dave
Sent: Tuesday, October 23, 2012 3:31 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Need sql to pull data from terribly architected table

 

Thanks for the answers.  But I also have a predicate...

 

select col1,col2 from foo where col4='c4' and col5 <> 'xxx';

 

How is that done?

 

From: Richard Broersma [mailto:richard.broersma@gmail.com]
Sent: Tuesday, October 23, 2012 3:24 PM
To: chris@chriscurvey.com
Cc: Gauthier, Dave; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Need sql to pull data from terribly architected table

 

 

On Tue, Oct 23, 2012 at 12:21 PM, Richard Broersma <richard.broersma@gmail.com> wrote:

On Tue, Oct 23, 2012 at 12:06 PM, Chris Curvey <chris@chriscurvey.com> wrote:

select my_ids.id 

, c1.value as col1

, c2.value as col2

, c3.value as col3

, c4.value as col4

, c5.value as col5

from my_ids

left join foo c1 on my_ids.id = c1.id

left join foo c2 on my_ids.id = c2.id

left join foo c3 on my_ids.id = c3.id

left join foo c4 on my_ids.id = c4.id

left join foo c5 on my_ids.id = c5.id


How about:

oops - I had some malformed air code
 

SELECT my_ids.id, ARRAY_AGG( ( property, value ) order by (property, value) )
  FROM my_ids

   GROUP BY id

ORDER BY id;


--
Regards,
Richard Broersma Jr.




--
Regards,
Richard Broersma Jr.

 

 

You put the above query into a sub-select or CTE (WITH) and then in the outer query you apply whatever where clause you want.

 

If you want to try and help the planner you could do:

 

SELECT some_id FROM foo WHERE property = ‘…’ AND value = ‘…’

UNION ALL

SELECT some_id FROM foo WHERE property = ‘…’ AND value = ‘…’

 

To pre-define which IDs are candidates and then use that information later on in the query.

 

Whether this is a worthwhile effort I have no idea and it may not matter anyway depending on how well the brute-force approach works given your data.

 

David J.

 

Re: Need sql to pull data from terribly architected table

From
Steve Litt
Date:
On Tue, 23 Oct 2012 18:52:52 +0000, Gauthier, Dave said:
> Here's the deal...
>
> Instead of architecting and loading a table like...
> create teble foo (col1 text, col2 text, col3 text, col4 text, col5
> text); insert into foo (col1,col2,col3,col4,col5) values
> ('c1',null,'c3','c4',null);
>
> They did this instead...
>
> create table foo (property text, value text);
> insert into foo (property, value) values ('col1','c1'),
> ('col3','c3'), ('col4','c4');
>
> Notice how "col2" and "col5" were simply left out of the table in the
> 2nd model to indicate null.
>
> The question is how to do this model 1 query for model 2...
>
> select col1,col2 from foo where col4='c4' and col5 <> 'xxx';
>
> I know I have to use outer joins to deal with the potential of
> nulls.  But I don't know how to construct this.  I won't list my
> failed attempts (so as not to embarass myself :-))
>
> Thanks in Advance !


Hi Deve,

If it were me, I'd use a computer language like Perl, Python, Ruby, Lua
or Java to retrieve the data, rather than trying to do the whole thing
in SQL. Looking at the way they constructed their table, I'd guess the
intent was to use a language to do the logic, rather than pure SQL.

The way they constructed the table looks to me like they were trying to
comply with the first normal group -- no repeating columns (no arrays).
This would make sense if col1 through col5 were all the same type of
entity -- for instance, each is the name of a child of the couple. By
doing it as key-value pairs, each couple can have as many or as few
children as necessary, rather than reserving five columns for children
and then running into trouble when a six child family comes along.

Also, with the organization they're using, one can make new "columns"
on the fly. Years ago I created a litigation support database
structured partially as key-value pairs (along with a "row number" -- I
don't know how your database got along without a key to show which row
each key-value pair belonged to). Anyway, the keypuncher is punching
data, comes across a brand new type of data (let's say "artist"), so
for this row the keypuncher puts in a key-value pair of "artist=Lady
Gaga". From a practical point of view, data structure could be change
at key entry time, and needn't have been anticipated by the programmer
nor recompiled or reorganized when a new type of data element entered
the requirements.

I'll bet you dollars to donuts if you could speak to the original
programmer, he'd show you a good reason for his data organization, and
he'd also tell you he in no way anticipated that the data would ever be
handled purely by SQL.

Anyway, bottom line, a simple, procedural language with an interface to
Postgres would be a quick and easy way to convert this data to the type
you prefer.

HTH

SteveT

Steve Litt                *  http://www.troubleshooters.com/
                          *  http://twitter.com/stevelitt
Troubleshooting Training  *  Human Performance



Re: Need sql to pull data from terribly architected table

From
Chris Angelico
Date:
On Thu, Oct 25, 2012 at 2:42 AM, Steve Litt <slitt@troubleshooters.com> wrote:
> Also, with the organization they're using, one can make new "columns"
> on the fly. ... Anyway, the keypuncher is punching
> data, comes across a brand new type of data (let's say "artist"), so
> for this row the keypuncher puts in a key-value pair of "artist=Lady
> Gaga". From a practical point of view, data structure could be change
> at key entry time, and needn't have been anticipated by the programmer
> nor recompiled or reorganized when a new type of data element entered
> the requirements.

That's wonderfully flexible, but it forfeits the protection that a
well-designed schema gives. A system like that is likely to end up
with different records storing the same data under slightly different
names, and you'll have a massive proliferation of "columns" that have
only a single row's value in them. That's fine if that's what you
want, but from a data entry standpoint, I think it's _too_ flexible
for most purposes.

ChrisA


Re: Need sql to pull data from terribly architected table

From
Steve Litt
Date:
On Thu, 25 Oct 2012 03:56:39 +1100, Chris Angelico said:
> On Thu, Oct 25, 2012 at 2:42 AM, Steve Litt
> <slitt@troubleshooters.com> wrote:
> > Also, with the organization they're using, one can make new
> > "columns" on the fly. ... Anyway, the keypuncher is punching
> > data, comes across a brand new type of data (let's say "artist"), so
> > for this row the keypuncher puts in a key-value pair of "artist=Lady
> > Gaga". From a practical point of view, data structure could be
> > change at key entry time, and needn't have been anticipated by the
> > programmer nor recompiled or reorganized when a new type of data
> > element entered the requirements.
>
> That's wonderfully flexible, but it forfeits the protection that a
> well-designed schema gives. A system like that is likely to end up
> with different records storing the same data under slightly different
> names, and you'll have a massive proliferation of "columns" that have
> only a single row's value in them. That's fine if that's what you
> want, but from a data entry standpoint, I think it's _too_ flexible
> for most purposes.

True enough.

In my particular case, my program was used by litigation support
professionals (I probably shouldn't have called them "keypunchers") who
knew what they were doing. Secondly, IIRC, I had a separate table for
field names, so that before creating a new field name, they'd be
presented with the current ones.

One possible way of limiting the possible damage you pointed out would
be to have only a select few be permitted to add new field names. I
think my program might have done that -- I seem to remember those
entering data having to call the supervisor to put in a new category of
information.

But yeah, the data organization I mentioned requires use by halfway
intelligent and competent users, and certainly isn't universally
appropriate.

Thanks

SteveT

Steve Litt                *  http://www.troubleshooters.com/
                          *  http://twitter.com/stevelitt
Troubleshooting Training  *  Human Performance