Re: obtaining ARRAY position for a given match - Mailing list pgsql-general
From | Pavel Stehule |
---|---|
Subject | Re: obtaining ARRAY position for a given match |
Date | |
Msg-id | 162867790911191033w4516150i130bba5bdf0fcc08@mail.gmail.com Whole thread Raw |
In response to | Re: obtaining ARRAY position for a given match (Sam Mason <sam@samason.me.uk>) |
Responses |
Re: obtaining ARRAY position for a given match
|
List | pgsql-general |
2009/11/19 Sam Mason <sam@samason.me.uk>: > On Thu, Nov 19, 2009 at 05:24:33PM +0100, Pavel Stehule wrote: >> it should be little bit more effective: > > I'm not sure if it will be much more; when you put a set returning > function into a FROM clause PG will always run the function to > completion---as far as I know, but I've only got 8.3 for testing at the > moment. yes, but generate_series is very cheap, this is protection before not necessary string equation. I'm also not sure why you want to return zero when you don't > find the element. The code also exploits an implementation artifact of > PG that the zero (i.e. the RHS of your UNION ALL) will be "after" the > real index. > this is only convention. Somebody like 0 or -1 as result. Somebody can live with NULL. > This raises a small and interesting optimization for PG, when it does > the plan it could notice that a UNION ALL followed by a LIMIT won't need > to return all rows and hence it may be better to run the "quicker" one > first. Or would this end up breaking more code than it helps? > >> CREATE OR REPLACE FUNCTION idx(anyarray, anyelement) >> RETURNS int AS $$ >> SELECT i >> FROM generate_series(array_lover($1,1),array_upper($1,1)) g(i) > > Quality typo :) ^^^ > >> WHERE $1[i] = $2 >> UNION ALL >> SELECT 0 -- return 0 as not found >> LIMIT 1; -- stop after first match >> $$ LANGUAGE sql; > > I'd do something like: > > CREATE OR REPLACE FUNCTION firstidx(anyarray, anyelement) > RETURNS int AS $$ > SELECT i FROM ( > SELECT generate_series(array_lower($1,1),array_upper($1,1))) g(i) > WHERE $1[i] = $2 > LIMIT 1; > $$ LANGUAGE sql IMMUTABLE; > > You can replace the call to array_upper with some large number to check > either function's behavior with large arrays. > your code is very very exactly same as my code. First - there are flattening stage. So if you don't use offset 0, then subselect is transformed to select. I am not sure, if offset 0 should help here - it have to do a materialisation (5ms for 10000 items) more. This function is relative fast: postgres=# select idx(array(select generate_series(1,10000)),10000); idx ------- 10000 (1 row) Time: 40.070 ms maybe - I cannot test it - there could be code CREATE OR REPLACE FUNCTION idx(anyarray, anyelement) RETURNS int AS $$ SELECT i FROM (SELECT generate_subscripts($1) as i, unnest($1) as v) s WHERE v = $2 LIMIT 1; $$ LANGUAGE sql; but I am sure so C code should be faster Regards Pavel Stehule > -- > Sam http://samason.me.uk/ > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
pgsql-general by date: