Thread: Creating a non-strict custom aggregate that initializes to the first value
Creating a non-strict custom aggregate that initializes to the first value
From
Timothy Garnett
Date:
Hi all,
I'm trying to create a custom aggregate function that returns the value from the first row (possibly null).
For ex.
Table t
a | b
-----
1 | A
2 | NULL
SELECT my_first(b order by a) => A
SELECT my_first(b order by a DESC) => NULL
The straightforward way would seem to be something like
CREATE OR REPLACE FUNCTION public.first_agg ( anyelement, anyelement )
RETURNS anyelement LANGUAGE sql IMMUTABLE STRICT AS $$
SELECT $1;
$$;
but if that is declared strict then it would take the first non-null value and return A in my second example, if declared non-strict then the initial state would be fed as null rather then the first value. Is there a way to declare the function non-strict (so that null values are passed) but still have it initialize to the first value like it would if it was strict?
Thanks for the help,
Tim
Re: Creating a non-strict custom aggregate that initializes to the first value
From
"David G. Johnston"
Date:
but if that is declared strict then it would take the first non-null value and return A in my second example, if declared non-strict then the initial state would be fed as null rather then the first value. Is there a way to declare the function non-strict (so that null values are passed) but still have it initialize to the first value like it would if it was strict?
Late night pondering here but...
Because of the way SQL null works, and your desired to handle "anyelement", you are stuck determining whether you are currently evaluating the first row of your input - or not. For the first row you always take the "new" value while for all subsequent rows you take the "state" value. So, your state needs to encompass both "prior row number" and "active value", which suggests you need to create a custom type for your state variable.
You want NULL to both mean "not initialized" and "unknown value" which is impossible and SQL does not provide any other universal literal that means one or the other.
I'm not sure how you deal with "anyelement" in a custom type that could be used as a state variable...
David J.
Re: Creating a non-strict custom aggregate that initializes to the first value
From
Francisco Olarte
Date:
Hi Timothy: On Thu, Mar 26, 2015 at 9:49 PM, Timothy Garnett <tgarnett@panjiva.com> wrote: > I'm trying to create a custom aggregate function that returns the value from > the first row (possibly null). ... > The straightforward way would seem to be something like > > CREATE OR REPLACE FUNCTION public.first_agg ( anyelement, anyelement ) > RETURNS anyelement LANGUAGE sql IMMUTABLE STRICT AS $$ > SELECT $1; > $$; > but if that is declared strict then it would take the first non-null value > and return A in my second example, if declared non-strict then the initial > state would be fed as null rather then the first value. Is there a way to > declare the function non-strict (so that null values are passed) but still > have it initialize to the first value like it would if it was strict? First, your pasted code is incomplete, you should post all relevant code ( I suspect a highly relevant create aggregate statement is missing ). Next, it's already been pointed you are trying to use null for two meanings, which is not gonna work. I think your problem is you are using the one function form of aggregates. Aggregates, in postgres and lots of other places, are done with two functions, an state transition function called for each row ( more or less, strictness asside ) and a final function. Think on average, you need an state consisting of sum, count, and state transition function adding to sum and incremementing count and a final function to do the division. Generally, you start with STATE_TYPE state=initial_state; foreach value state=transition(state, current_val); AGG_TYPE agg_value = final(state). STATE_TYPE and AGG_TYPE need not be the same. When STATE_TYPE and AGG_TYPE are the same many systems, postgres included, have a shortcut for using just an state transition function , using indentity as the final function, as this is adequate for things like count, min/max, sum, and this seems to be the variant you are using. In your case you should probably use the two form agregate, use a composite type for the state ( first_time, val ), an state transition function ( if ft true store val, else skip ) and a final function ( return val from state ) ( there are posibly other methods, like using an array for state, initializing to empty, storing val if empty or skipping, just play around ). Also, you may be having an http://xyproblem.info/ , do you want an aggregate to return the first value, or a function?. ( I suspect it is the first case, as you are pasting a create function for first_agg and a select for my_first, but not having posted a complete example makes it very hard ). Regards. Francisco Olarte.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Thu, Mar 26, 2015 at 1:49 PM, Timothy Garnett <tgarnett@panjiva.com> > wrote: >> but if that is declared strict then it would take the first non-null value >> and return A in my second example, if declared non-strict then the initial >> state would be fed as null rather then the first value. Is there a way to >> declare the function non-strict (so that null values are passed) but still >> have it initialize to the first value like it would if it was strict? > You want NULL to both mean "not initialized" and "unknown value" which is > impossible and SQL does not provide any other universal literal that means > one or the other. Yeah. You need distinct representations for "nothing seen yet" and "saw a NULL"; the built-in behavior doesn't suffice for this. One idea is for the state value to be of anyarray type: initially null, and a one-element array containing the first input value once you've seen that. It strikes me though that this aggregate is ill-defined by nature. In particular, if you're going to treat NULL as being a real data value, then what're you gonna return when there were no input rows? You won't be able to distinguish "no input rows" from "first input row had a NULL". Maybe you should rethink whatever activity you were wanting it for. regards, tom lane