Hi,
when dealing with a bug in the postgresql-plr interface I think
I found a suboptimal method to process CASE statements. First
to the problem:
I'm using the Debian packaged version of PLR version 0.6.2-2 (Debian
testing) and found a problem calculating median from a set of
values that contain only NULL values. The problem becomes clear
if you look at the following example:
-------------------------------------------------------------------------
$ psql -t test
create table plrtest( nonnull numeric not null, mightbenull numeric, flag int);
insert into plrtest values(42.0, 42.0, 1);
insert into plrtest values(17.0, 17.0, 1);
insert into plrtest values(23.0, NULL, 0 );
insert into plrtest values(4711.0, 4711.0, 1);
insert into plrtest values(174.0, NULL, 0);
CREATE OR REPLACE FUNCTION plr_call_handler()
RETURNS LANGUAGE_HANDLER
AS '\$libdir/plr' LANGUAGE C;
CREATE LANGUAGE plr HANDLER plr_call_handler;
create or replace function r_median(_numeric) returns numeric as 'median(arg1)'
language 'plr';
CREATE OR REPLACE FUNCTION r_median(_numeric) returns numeric as ' median(arg1)
' language 'plr';
CREATE OR REPLACE FUNCTION plr_array_accum (_numeric, numeric)
RETURNS numeric[]
AS '\$libdir/plr','plr_array_accum'
LANGUAGE 'C';
CREATE AGGREGATE median ( sfunc = plr_array_accum, basetype = numeric, stype = _numeric, finalfunc = r_median
);
SELECT median(nonnull) from plrtest; 42
SELECT median(mightbenull) from plrtest; 42
SELECT median(nonnull), median(mightbenull) from plrtest where flag = 0;
ERROR: R interpreter expression evaluation error
DETAIL: Error in median(arg1) : need numeric data
CONTEXT: In PL/R function r_median
-------------------------------------------------------------------------
I would expect NULL as result of the last query.
So I thought I will verify in a CASE statement whether there
are only NULL values in the column by max(mightbenull) like this:
# SELECT CASE WHEN max(mightbenull) IS NULL THEN 0 ELSE median(mightbenull) END from plrtest where flag = 0;
ERROR: R interpreter expression evaluation error
DETAIL: Error in median(arg1) : need numeric data
CONTEXT: In PL/R function r_median
The problem I want to discuss here is the following: Usually in
programming languages only one branch of the IF-THEN-ELSE statement
will be calculated. But here *both* branches are calculated
(obviousely because of the error that occures). If we just forget
that my goal was to circumvent the error by some hack, I think
if there is some kind of complex query in the ELSE branche that
calculation would just cost extra processing time with no need.
I would regard this as a bug.
Kind regards
Andreas.
PS: Please CC me. I'm not subscribed.
--
http://fam-tille.de