Thread: SQL Help

SQL Help

From

C F

Date:

30 May 2003, 14:47:09

Hello,

I already tried this same basic question with no response.... maybe I was too wordy. So here it is simplified.... what's the best way to write this query? I'm open to using stored procedures, but even then I don't know how I would conditionally populate a resultset (refcursor). Notice that in the first three cases, the expression is the exact same, only the return value is different. This seems inefficient....

select
(case when column1 = column2 then column3 end) as alias1,
(case when column1 = column2 then column4 end) as alias2,
(case when column1 = column2 then column5 end) as alias3,
(case when column6 = column7 then column8 end) as alias4
from
mytable
;

Any ideas?

Thanks!

Do you Yahoo!?
Free online calendar with sync to Outlook(TM).

Re: SQL Help

From

Bruno Wolff III

Date:

30 May 2003, 15:28:29

On Fri, May 30, 2003 at 08:47:03 -0700, C F <tacnaboyz@yahoo.com> wrote:
> Hello,
> I already tried this same basic question with no response....  maybe I was too wordy.  So here it is simplified....
what'sthe best way to write this query?  I'm open to using stored procedures, but even then I don't know how I would
conditionallypopulate a resultset (refcursor).  Notice that in the first three cases, the expression is the exact same,
onlythe return value is different.  This seems inefficient....
 
>  
> 
> select 
>   (case when column1 = column2 then column3 end) as alias1,
>   (case when column1 = column2 then column4 end) as alias2,
>   (case when column1 = column2 then column5 end) as alias3,
>   (case when column6 = column7 then column8 end) as alias4
> from
>   mytable
> ;

That seems pretty efficient. I wouldn't expect the repeated column1 = column2
tests to add much overhead. Unless there is more to the story this looks
to be what you want to do.

Re: SQL Help

From

Josh Berkus

Date:

30 May 2003, 15:37:44

CF,

> select
>   (case when column1 = column2 then column3 end) as alias1,
>   (case when column1 = column2 then column4 end) as alias2,
>   (case when column1 = column2 then column5 end) as alias3,
>   (case when column6 = column7 then column8 end) as alias4
> from
>   mytable

Given the information you've given us (including your first e-mail), I can't 
think of a better way to get the output you want than the above.  You could 
work through subselects, etc, but I think that would be even less efficient.

One thing you might want to think about is your whole approach to designing 
this application; it may be that you can simplify your queries by changing 
your table design.  

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

Re: CASE returning multiple values (was SQL Help)

From

Richard Huxton

Date:

30 May 2003, 15:40:16

On Friday 30 May 2003 4:47 pm, C F wrote:
> Hello,
> I already tried this same basic question with no response....  maybe I was
> too wordy.

I think it's more a case of nobody seeing a better way.

> select
>   (case when column1 = column2 then column3 end) as alias1,
>   (case when column1 = column2 then column4 end) as alias2,
>   (case when column1 = column2 then column5 end) as alias3,
>   (case when column6 = column7 then column8 end) as alias4

You could write a set returning function, but you'd just end up doing the same
thing. Can you explain what it is you're trying to acheive - real
fields/schemas etc?

--  Richard Huxton

Re: SQL Help

From

Franco Bruno Borghesi

Date:

30 May 2003, 16:05:56

If your concern is speed, the thing here is that you will have as many records
as there are in "mytable", most of them (I think) with NULLs for alias1, alias2,
alias3 and alias4. This way, there is no condition to filter any record, so
postgreSQL will do a sequential scan over the whole table.

If you are ok fetching the records that match and assumming that the all the others
don't match, then the following will work:

--column1=column2
SELECT  column3 AS alias1, column4 AS alias2, column5 AS alias3, NULL AS alias4
FROM  myTable
WHERE  column1=column2
UNION
--column6=column7
SELECT  NULL AS alias1, NULL AS alias2, NULL AS alias3, column8 AS alias4
FROM  myTable
WHERE  column6=column7

Of course, you will need the necesary indexes.

If this didn't give you a hint, please post a message with a link to your original
message, so I can get a better idea of what you need.

On Friday 30 May 2003 12:47, C F wrote:
> Hello,
> I already tried this same basic question with no response....  maybe I was
> too wordy.  So here it is simplified.... what's the best way to write this
> query?  I'm open to using stored procedures, but even then I don't know how
> I would conditionally populate a resultset (refcursor).  Notice that in the
> first three cases, the expression is the exact same, only the return value
> is different.  This seems inefficient....
>
>
> select
>   (case when column1 = column2 then column3 end) as alias1,
>   (case when column1 = column2 then column4 end) as alias2,
>   (case when column1 = column2 then column5 end) as alias3,
>   (case when column6 = column7 then column8 end) as alias4
> from
>   mytable
> ;
>
> Any ideas?
> Thanks!
>
>
> ---------------------------------
> Do you Yahoo!?
> Free online calendar with sync to Outlook(TM).

Re: CASE returning multiple values (was SQL Help)

From

Tom Lane

Date:

30 May 2003, 16:51:55

Richard Huxton <dev@archonet.com> writes:
> On Friday 30 May 2003 4:47 pm, C F wrote:
>> select
>> (case when column1 = column2 then column3 end) as alias1,
>> (case when column1 = column2 then column4 end) as alias2,
>> (case when column1 = column2 then column5 end) as alias3,
>> (case when column6 = column7 then column8 end) as alias4

> Can you explain what it is you're trying to acheive - real 
> fields/schemas etc?

Yeah.  I can't help feeling that this problem really means you've chosen
a bad database schema.

Given the problem as posed, I can only offer one suggestion: you could
avoid writing out the case conditions N times by using a sub-select to
factor out common subexpressions:

select
(case when cond1 then column3 end) as alias1,
(case when cond1 then column4 end) as alias2,
(case when cond1 then column5 end) as alias3,
(case when cond2 then column8 end) as alias4
from
(select column1 = column2 as cond1, column6 = column7 as cond2, column3, column4, ...from ... rest of query as in
original...

) ss;

One should not mistake this for an efficiency improvement, since more
than likely the planner will flatten it into the same querytree as the
original form.  But it might be more readable or easier to code this
way.

But give us some details about your table setup and why you find
yourself needing to do this in the first place.  Maybe there's a 
better design.
        regards, tom lane

Re: CASE returning multiple values (was SQL Help)

From

C F

Date:

30 May 2003, 17:01:22

I was afraid someone was going to ask that :)

Okay, I'll do my best at explaining where I'm coming from....

I'm working on a mapping application it is user-configurable. What this means (as it pertains to this disucssion) is that the through a configuration file, the user is able to define the rules that the application will use to determine which geometries *and* attributes to pull from PG at various scales. These 'rules' that the user defines also contain other specifics such as how to symbolize the geometry, how to label the geometry etc. All of these parameters can either be hard coded by the user into the configuration file, or they can define an expression that will be used to dynamically pull it from the database. On top of all of this, we have two more levels of queries.

So... at the very top level, the user can define an expression that will determine that everything queried from this table will match these criteria... this is my WHERE clause.

Then below this level, various rules can be defined... each rule can have another definition that evaluates into a SQL expression. Now, I could make each rule an entirely separate query, but for one, they all share the exact same top level WHERE clause, and two, there could potentially be many many rules which I would think would cause severe performance issues. Let me give you an example...

Let's say we're mapping cities of the United States based on population... In other words, I want to symbolize the cities on the map based on population (larger symbol for larger populations, smaller symbol for smaller populations, etc). I also want to show the city names of the larger cities *only*. So, what the client application (client to PostgreSQL) needs is; the city location, which rules evaluate to true, and the city names of those larger cities (defined by a rule).

We have a table of cities of the world. So the top level filter (that all rules will share) is, "COUNTRY = 'USA'".

Rule 1 says that cities with a population over 1,000,000 will have a large symbol and be labeled with the city name. So the sql could look like this...

select longitude, latitude, city_name from city where country = 'USA' and population > 1000000;

... seems easy enough, but remember we can have an infinite number of rules (not really inifinite, but you get the point). So....

Rule2 says that cities with a population under 1,000,000 will have a small symbol (note, we do not care about the city name here). So, by itself, the SQL could look like this...

select longitude, latitude from city where country = 'USA' and population < 1000000;

Okay, for this simple example, I would have no problem doing two different queries (this example is extremely simplified compared to what is possible/likely). But what if the user wanted to give a different symbol for every population in 100,000 increments? If our range of populations was 100,000 to 5,000,000 that would be 50 queries! Not only would it be 50 queries, but it would be 50 queries using a nearly identical WHERE clause. So I thought it would be more efficient to combine the queries into something like the following...

select

longitude,

latitutde,

(case when population > 1000000 then true else false end) as rule1,

(case when population > 1000000 then city_name end) as label1,

(case when population < 1000000 then true else false end) as rule2

where

country = 'USA'

;

I could just only concern the SQL with the boolean values for the rules, and return all city names, and let the application simply discard them, but that seems like not a good thing to do for very large resultsets (and again, this is overly simplified, we could have many such columns full of uncessary data being returned). And by the way, that query cannot be written as something like...

(case when population > 1000000 then 'rule1' when population < 1000000 then 'rule2' end) as rules

... because the rules are NOT mutually exclusive, there can many positives.

Anyway, hopefully I didn't leave anything important out. It sounds like there's no obvious solution to avoiding multiple evaluations of the test expressions. The rules are relatively static once the config files are read in, so I could conceivably create stored procedures with a bunch of IF statements at that time. However, I'm not sure if in PG there is a way to dynamically populate the resulting recordset on the fly. I can think of 10 different ways accomplish what I'm trying to do, but hopefully someone has some ideas on what would be the best performing.

Sorry if it's information overload, but you tried to answer my questions, so I thought I should at least try to answer yours :)

Any thoughts much appreciated.

You could write a set returning function, but you'd just end up doing the same
thing. Can you explain what it is you're trying to acheive - real
fields/schemas etc?

--
Richard Huxton

Do you Yahoo!?
Free online calendar with sync to Outlook(TM).

Re: CASE returning multiple values (was SQL Help)

From

Richard Huxton

Date:

02 June 2003, 06:52:04

On Friday 30 May 2003 7:01 pm, C F wrote:
> I was afraid someone was going to ask that :)
> Okay, I'll do my best at explaining where I'm coming from....
[snip long but detailed description I asked for -thanks]

Right - I've done something similar to this before, and I ended up building my
SQL on the fly with something like:

rule_where_cond:rs | rulename | tblname | colname | test | test_value
----+----------|---------+---------+------+----------- 1 |  us-only |    city | country |    = |       USA

Note the "rs" is "rule-set" where all criteria for a specific query have the
same rs value.

rule_sizing_cond:rs | rulename | tblname |    colname |  minval |  maxval |     resultval
----+----------+---------+------------+---------+---------+--------------- 1 | cityicon |    city | population |
0| 1000000 | small_dot.jpg 1 | cityicon |    city | population | 1000001 |99999999 | large_dot.jpg 1 | showname |
city| population |       0 | 1000000 |         false 1 | showname |    city | population | 1000001 |99999999 |
true

This one gains you a lot because you just join to it rather than using CASE
clauses.

rule_select_cols:rs | rulename | tblname |   colname | aliasname
----+----------+---------+-----------+----------- 1 | showname |    city | city_name |    label1

So you're query will now return the city_name as "label1" but you can get a
flag from the "rule_sizing_cond" to say whether to show it or not.

Now, it depends how complicated your conditions can be as to how complicated
the setup of these tables is and also how complicated the query-builder can
be. However, I have used something similar to build queries myself and it
does provide you with a lot of flexibility.

HTH

--  Richard Huxton