Thread: problem with a conditional statement

problem with a conditional statement

From
Kirk Wythers
Date:
I am struggling to get a CASE WHEN statement to work within another
CASE WHEN. Here is my original code:

SELECT CASE WHEN w.station_id = site_near.station_id THEN w.obs_id ELSE
s.obs_id END AS obs_id,  site_near.station_id, site_near.longname,
w.year, w.doy, w.precip, w.tmin, w.tmax,

--replace missing solar values (-999) with the average of all solar
values from that month (s.month)

--CASE  s.par WHEN -999 THEN AVG( s.par) ELSE s.par END
--FROM solar s
--GROUP BY s.month;

FROM site_near INNER JOIN solar s ON
site_near.ref_solar_station_id = s.station_id  AND site_near.obs_year
= s.year
INNER JOIN weather w ON site_near.ref_weather_station_id =
w.station_id AND site_near.obs_year = w.year AND s.date = w.date
WHERE w.station_id = 211630;

I have commented out the troublesome bits in the middle of the code.
All I am trying to do here is to replace missing values with averages
from the same day of the year for all years. Does anyone see what I
am buggering up here?

Thanks in advance

Re: problem with a conditional statement

From
"Albe Laurenz"
Date:
Kirk Wythers wrote:

> I am struggling to get a CASE WHEN statement to work within another
> CASE WHEN. Here is my original code:
>
> SELECT CASE WHEN w.station_id = site_near.station_id THEN w.obs_id
ELSE
> s.obs_id END AS obs_id,  site_near.station_id, site_near.longname,
> w.year, w.doy, w.precip, w.tmin, w.tmax,
>
> --replace missing solar values (-999) with the average of all solar
> --values from that month (s.month)
>
> --CASE  s.par WHEN -999 THEN AVG( s.par) ELSE s.par END
> --FROM solar s
> --GROUP BY s.month;
>
> FROM site_near INNER JOIN solar s ON
> site_near.ref_solar_station_id = s.station_id  AND
> site_near.obs_year = s.year
> INNER JOIN weather w ON site_near.ref_weather_station_id =
> w.station_id AND site_near.obs_year = w.year AND s.date = w.date
> WHERE w.station_id = 211630;
>
> I have commented out the troublesome bits in the middle of the code.
> All I am trying to do here is to replace missing values with averages

> from the same day of the year for all years. Does anyone see what I
> am buggering up here?

The problem here is the AVG().
All columns that appear outside of group functions in the SELECT list
must be in the GROUP BY clause.

Maybe something like this could help you:

SELECT ..., w.tmax,
  CASE s.par WHEN -999 THEN avgsol.par ELSE s.par END
  ...
  FROM solar s INNER JOIN ...,
    (SELECT month, AVG(par) FROM solar GROUP BY month) AS avgsol
  WHERE s.month = avgsol.month
    AND ...

In this statement I create a subselect "avgsol" that I use like
a table.

Be warned that there will probably be a sequential scan of the whole
table "solar" whenever you run the statement, because the averages have
to be calculated first!

Yours,
Laurenz Albe

Re: problem with a conditional statement

From
Kirk Wythers
Date:
On May 8, 2007, at 2:02 AM, Albe Laurenz wrote:

> Kirk Wythers wrote:
>
>> I am struggling to get a CASE WHEN statement to work within another
>> CASE WHEN. Here is my original code:
>>
>> SELECT CASE WHEN w.station_id = site_near.station_id THEN w.obs_id
> ELSE
>> s.obs_id END AS obs_id,  site_near.station_id, site_near.longname,
>> w.year, w.doy, w.precip, w.tmin, w.tmax,
>>
>> --replace missing solar values (-999) with the average of all solar
>> --values from that month (s.month)
>>
>> --CASE  s.par WHEN -999 THEN AVG( s.par) ELSE s.par END
>> --FROM solar s
>> --GROUP BY s.month;
>>
>> FROM site_near INNER JOIN solar s ON
>> site_near.ref_solar_station_id = s.station_id  AND
>> site_near.obs_year = s.year
>> INNER JOIN weather w ON site_near.ref_weather_station_id =
>> w.station_id AND site_near.obs_year = w.year AND s.date = w.date
>> WHERE w.station_id = 211630;
>>
>> I have commented out the troublesome bits in the middle of the code.
>> All I am trying to do here is to replace missing values with averages
>
>> from the same day of the year for all years. Does anyone see what I
>> am buggering up here?
>

Thank you for the reply. I see what you are doing in the creating of
avgsol. That should work perfectly. However, I am unsure how you are
working it into the existing code.

> The problem here is the AVG().
> All columns that appear outside of group functions in the SELECT list
> must be in the GROUP BY clause.
>
> Maybe something like this could help you:
>
> SELECT ..., w.tmax,

I think you adding "CASE s.par WHEN -999 THEN avgsol.par ELSE s.par
END" after

"SELECT CASE WHEN w.station_id = site_near.station_id THEN w.obs_id ELSE
s.obs_id END AS obs_id,  site_near.station_id, site_near.longname,
w.year, w.doy, w.precip, w.tmin, w.tmax,"

to look this like this:

SELECT CASE WHEN w.station_id = site_near.station_id THEN w.obs_id ELSE
s.obs_id END AS obs_id,  site_near.station_id, site_near.longname,
w.year, w.doy, w.precip, w.tmin, w.tmax,
--replace missing values (-999) with the monthly average
CASE s.par WHEN -999 THEN avgsol.par ELSE s.par END

Correct?

>   CASE s.par WHEN -999 THEN avgsol.par ELSE s.par END
>   ...
>   FROM solar s INNER JOIN ...,

I can't quite figure out what you are suggesting here?

>     (SELECT month, AVG(par) FROM solar GROUP BY month) AS avgsol
>   WHERE s.month = avgsol.month
>     AND ...

Do you mean:

FROM site_near INNER JOIN solar s ON
(SELECT month, AVG(par) FROM solar GROUP BY month) AS avgsol WHERE
s.month = avgsol.month
AND site_near.ref_solar_station_id = s.station_id  AND
site_near.obs_year = s.year
INNER JOIN weather w ON site_near.ref_weather_station_id =
w.station_id AND site_near.obs_year = w.year AND s.date = w.date
WHERE w.station_id = 211630;

I think my trouble is figuring how to place the code snipit:

    (SELECT month, AVG(par) FROM solar GROUP BY month) AS avgsol
   WHERE s.month = avgsol.month
     AND ...

Sorry for being so dull



>
> In this statement I create a subselect "avgsol" that I use like
> a table.
>
> Be warned that there will probably be a sequential scan of the whole
> table "solar" whenever you run the statement, because the averages
> have
> to be calculated first!

That is ok, I won't be running this query so often that the
performance will be an issue.




Re: problem with a conditional statement

From
"Albe Laurenz"
Date:
Kirk Wythers wrote:
>>> I am struggling to get a CASE WHEN statement to work within another
>>> CASE WHEN. Here is my original code:
>>>
>>> SELECT CASE WHEN w.station_id = site_near.station_id THEN w.obs_id
ELSE
>>> s.obs_id END AS obs_id,  site_near.station_id, site_near.longname,
>>> w.year, w.doy, w.precip, w.tmin, w.tmax,
>>>
>>> --replace missing solar values (-999) with the average of all solar
>>> --values from that month (s.month)
>>>
>>> --CASE  s.par WHEN -999 THEN AVG( s.par) ELSE s.par END
>>> --FROM solar s
>>> --GROUP BY s.month;
>>>
>>> FROM site_near INNER JOIN solar s ON
>>> site_near.ref_solar_station_id = s.station_id  AND
>>> site_near.obs_year = s.year
>>> INNER JOIN weather w ON site_near.ref_weather_station_id =
>>> w.station_id AND site_near.obs_year = w.year AND s.date = w.date
>>> WHERE w.station_id = 211630;
>>
> Thank you for the reply. I see what you are doing in the creating of
> avgsol. That should work perfectly. However, I am unsure how you are
> working it into the existing code.

I did not provide the complete statement because
a) I am lazy and
b) I didn't want to create the impression that it was bulletproof
   tested SQL :^)

> to look this like this:
>
> SELECT CASE WHEN w.station_id = site_near.station_id THEN
> w.obs_id ELSE
> s.obs_id END AS obs_id,  site_near.station_id, site_near.longname,
> w.year, w.doy, w.precip, w.tmin, w.tmax,
> --replace missing values (-999) with the monthly average
> CASE s.par WHEN -999 THEN avgsol.par ELSE s.par END
>
> Correct?

Yes!

>>   FROM solar s INNER JOIN ...,
>
> I can't quite figure out what you are suggesting here?
>
>>     (SELECT month, AVG(par) FROM solar GROUP BY month) AS avgsol
>>   WHERE s.month = avgsol.month
>>     AND ...
>
> Do you mean:
>
> FROM site_near INNER JOIN solar s ON
> (SELECT month, AVG(par) FROM solar GROUP BY month) AS avgsol WHERE
> s.month = avgsol.month
> AND site_near.ref_solar_station_id = s.station_id  AND
> site_near.obs_year = s.year
> INNER JOIN weather w ON site_near.ref_weather_station_id =
> w.station_id AND site_near.obs_year = w.year AND s.date = w.date
> WHERE w.station_id = 211630;
>
> I think my trouble is figuring how to place the code snipit:
>
>     (SELECT month, AVG(par) FROM solar GROUP BY month) AS avgsol
>    WHERE s.month = avgsol.month
>      AND ...
>
> Sorry for being so dull

Sorry for being so lazy :^)

Here is a more elaborate version, I'm trying to add 'avgsol' to
your original FROM clause:

FROM site_near
  INNER JOIN solar s ON
    (site_near.ref_solar_station_id = s.station_id
     AND site_near.obs_year = s.year)
  INNER JOIN weather w ON
    (site_near.ref_weather_station_id = w.station_id
     AND site_near.obs_year = w.year
     AND s.date = w.date)
  INNER JOIN (SELECT month, AVG(par) FROM solar GROUP BY month) AS
avgsol ON
    (s.month = avgsol.month)
WHERE ...

Still no claim for correctness.

Does it make more sense now?

Yours,
Laurenz Albe

Re: problem with a conditional statement

From
Kirk Wythers
Date:
>
> Here is a more elaborate version, I'm trying to add 'avgsol' to
> your original FROM clause:

> SELECT CASE WHEN w.station_id = site_near.station_id THEN
> w.obs_id ELSE
> s.obs_id END AS obs_id,  site_near.station_id, site_near.longname,
> w.year, w.doy, w.precip, w.tmin, w.tmax,
> --replace missing values (-999) with the monthly average
> CASE s.par WHEN -999 THEN avgsol.par ELSE s.par END

>
> FROM site_near
>   INNER JOIN solar s ON
>     (site_near.ref_solar_station_id = s.station_id
>      AND site_near.obs_year = s.year)
>   INNER JOIN weather w ON
>     (site_near.ref_weather_station_id = w.station_id
>      AND site_near.obs_year = w.year
>      AND s.date = w.date)
>   INNER JOIN (SELECT month, AVG(par) FROM solar GROUP BY month) AS
> avgsol ON
>     (s.month = avgsol.month)
> WHERE ...
>
> Still no claim for correctness.
>
> Does it make more sense now?

Thanks again Laurenz. Hopefully I have nearly figured this out. I
have a question that indicates to me that I am a little fuzzy on one
more point.

In the line: "CASE s.par WHEN -999 THEN avgsol.par ELSE s.par END"

I am getting an error that says, "column avgsol.par does not exist".
I understand that you are creating avgsol as a subselect, but I also
see the point of the error message that the column .par does not
exist. If I change avgsol.par to the simple form avgsol (to match the
subselect " INNER JOIN (SELECT month, AVG(par) FROM solar GROUP BY
month) AS avgsol ON (s.month = avgsol.month)". Then I get an error
about CASE types real and record cannot be matched. Any final ideas?

>
> Yours,
> Laurenz Albe


Re: problem with a conditional statement

From
Erik Jones
Date:
On May 9, 2007, at 10:32 AM, Kirk Wythers wrote:

>
>>
>> Here is a more elaborate version, I'm trying to add 'avgsol' to
>> your original FROM clause:
>
>> SELECT CASE WHEN w.station_id = site_near.station_id THEN
>> w.obs_id ELSE
>> s.obs_id END AS obs_id,  site_near.station_id, site_near.longname,
>> w.year, w.doy, w.precip, w.tmin, w.tmax,
>> --replace missing values (-999) with the monthly average
>> CASE s.par WHEN -999 THEN avgsol.par ELSE s.par END
>
>>
>> FROM site_near
>>   INNER JOIN solar s ON
>>     (site_near.ref_solar_station_id = s.station_id
>>      AND site_near.obs_year = s.year)
>>   INNER JOIN weather w ON
>>     (site_near.ref_weather_station_id = w.station_id
>>      AND site_near.obs_year = w.year
>>      AND s.date = w.date)
>>   INNER JOIN (SELECT month, AVG(par) FROM solar GROUP BY month) AS
>> avgsol ON
>>     (s.month = avgsol.month)
>> WHERE ...
>>
>> Still no claim for correctness.
>>
>> Does it make more sense now?
>
> Thanks again Laurenz. Hopefully I have nearly figured this out. I
> have a question that indicates to me that I am a little fuzzy on
> one more point.
>
> In the line: "CASE s.par WHEN -999 THEN avgsol.par ELSE s.par END"
>
> I am getting an error that says, "column avgsol.par does not
> exist". I understand that you are creating avgsol as a subselect,
> but I also see the point of the error message that the column .par
> does not exist. If I change avgsol.par to the simple form avgsol
> (to match the subselect " INNER JOIN (SELECT month, AVG(par) FROM
> solar GROUP BY month) AS avgsol ON (s.month = avgsol.month)". Then
> I get an error about CASE types real and record cannot be matched.
> Any final ideas?

The "AVG(par)" should've been aliased.  "AVG(par) as par" would
work.  As is, the column name returned is just "avg".

erik jones <erik@myemma.com>
software developer
615-296-0838
emma(r)




Re: problem with a conditional statement

From
Kirk Wythers
Date:
On May 9, 2007, at 10:41 AM, Erik Jones wrote:

>
> On May 9, 2007, at 10:32 AM, Kirk Wythers wrote:
>
>>
>>>
>>> Here is a more elaborate version, I'm trying to add 'avgsol' to
>>> your original FROM clause:
>>
>>> SELECT CASE WHEN w.station_id = site_near.station_id THEN
>>> w.obs_id ELSE
>>> s.obs_id END AS obs_id,  site_near.station_id, site_near.longname,
>>> w.year, w.doy, w.precip, w.tmin, w.tmax,
>>> --replace missing values (-999) with the monthly average
>>> CASE s.par WHEN -999 THEN avgsol.par ELSE s.par END
>>
>>>
>>> FROM site_near
>>>   INNER JOIN solar s ON
>>>     (site_near.ref_solar_station_id = s.station_id
>>>      AND site_near.obs_year = s.year)
>>>   INNER JOIN weather w ON
>>>     (site_near.ref_weather_station_id = w.station_id
>>>      AND site_near.obs_year = w.year
>>>      AND s.date = w.date)
>>>   INNER JOIN (SELECT month, AVG(par) FROM solar GROUP BY month) AS
>>> avgsol ON
>>>     (s.month = avgsol.month)
>>> WHERE ...
>>>
>>> Still no claim for correctness.
>>>
>>> Does it make more sense now?
>>
>> Thanks again Laurenz. Hopefully I have nearly figured this out. I
>> have a question that indicates to me that I am a little fuzzy on
>> one more point.
>>
>> In the line: "CASE s.par WHEN -999 THEN avgsol.par ELSE s.par END"
>>
>> I am getting an error that says, "column avgsol.par does not
>> exist". I understand that you are creating avgsol as a subselect,
>> but I also see the point of the error message that the column .par
>> does not exist. If I change avgsol.par to the simple form avgsol
>> (to match the subselect " INNER JOIN (SELECT month, AVG(par) FROM
>> solar GROUP BY month) AS avgsol ON (s.month = avgsol.month)". Then
>> I get an error about CASE types real and record cannot be matched.
>> Any final ideas?
>
> The "AVG(par)" should've been aliased.  "AVG(par) as par" would
> work.  As is, the column name returned is just "avg".

Got it! Thankyou!


>
> erik jones <erik@myemma.com>
> software developer
> 615-296-0838
> emma(r)
>
>
>