Suboptimal evaluation of CASE expressions - Mailing list pgsql-hackers

From Andreas Tille
Subject Suboptimal evaluation of CASE expressions
Date
Msg-id Pine.LNX.4.62.0604111635430.13764@wr-linux02
Whole thread Raw
Responses Re: Suboptimal evaluation of CASE expressions
List pgsql-hackers
Hi,

when dealing with a bug in the postgresql-plr interface I think
I found a suboptimal method to process CASE statements.  First
to the problem:

I'm using the Debian packaged version of PLR version 0.6.2-2 (Debian
testing) and found a problem calculating median from a set of
values that contain only NULL values.  The problem becomes clear
if you look at the following example:

-------------------------------------------------------------------------
$ psql -t test

create table plrtest( nonnull numeric not null, mightbenull numeric, flag int);
insert into plrtest values(42.0, 42.0, 1);
insert into plrtest values(17.0, 17.0, 1);
insert into plrtest values(23.0, NULL, 0 );
insert into plrtest values(4711.0, 4711.0, 1);
insert into plrtest values(174.0, NULL, 0);

CREATE OR REPLACE FUNCTION plr_call_handler()
RETURNS LANGUAGE_HANDLER
AS '\$libdir/plr' LANGUAGE C;

CREATE LANGUAGE plr HANDLER plr_call_handler;

create or replace function r_median(_numeric) returns numeric as 'median(arg1)' 
language 'plr';

CREATE OR REPLACE FUNCTION r_median(_numeric) returns numeric as '  median(arg1)
' language 'plr';

CREATE OR REPLACE FUNCTION plr_array_accum (_numeric, numeric)
RETURNS numeric[]
AS '\$libdir/plr','plr_array_accum'
LANGUAGE 'C';


CREATE AGGREGATE median (  sfunc = plr_array_accum,  basetype = numeric,  stype = _numeric,  finalfunc = r_median
);

SELECT median(nonnull) from plrtest;     42

SELECT median(mightbenull) from plrtest;     42

SELECT median(nonnull), median(mightbenull) from plrtest where flag = 0;
ERROR:  R interpreter expression evaluation error
DETAIL:  Error in median(arg1) : need numeric data
CONTEXT:  In PL/R function r_median

-------------------------------------------------------------------------

I would expect NULL as result of the last query.

So I thought I will verify in a CASE statement whether there
are only NULL values in the column by max(mightbenull) like this:

# SELECT CASE WHEN max(mightbenull) IS NULL THEN 0 ELSE median(mightbenull) END from plrtest where flag = 0;
ERROR:  R interpreter expression evaluation error
DETAIL:  Error in median(arg1) : need numeric data
CONTEXT:  In PL/R function r_median

The problem I want to discuss here is the following:  Usually in
programming languages only one branch of the IF-THEN-ELSE statement
will be calculated.  But here *both* branches are calculated
(obviousely because of the error that occures).  If we just forget
that my goal was to circumvent the error by some hack, I think
if there is some kind of complex query in the ELSE branche that
calculation would just cost extra processing time with no need.
I would regard this as a bug.

Kind regards
      Andreas.

PS: Please CC me. I'm not subscribed.
-- 
http://fam-tille.de


pgsql-hackers by date:

Previous
From: Myron Scott
Date:
Subject: Re: Support Parallel Query Execution in Executor
Next
From: Martijn van Oosterhout
Date:
Subject: Re: Suboptimal evaluation of CASE expressions