Thread: Converting each item in array to a query result row
Hi, I'd writing a query against a function (pg_proc) that contains 2 fields of an array type. Ideally, I'd like to select 1 row from the table, but return a query row for each item in the array. For example, if one row contains the array {"a", "b", "c"} I'd like the query to return 3 rows, one for each of these elements. Any idea if this is possible? Thanks.
----- Original Message -----
From: "Postgres User" <postgres.developer@gmail.com>
To: "pgsql-general" <pgsql-general@postgresql.org>
Sent: Friday, May 29, 2009 12:21:11 AM GMT -08:00 Tijuana / Baja California
Subject: [GENERAL] Converting each item in array to a query result row
Hi,
I'd writing a query against a function (pg_proc) that contains 2
fields of an array type. Ideally, I'd like to select 1 row from the
table, but return a query row for each item in the array.
For example, if one row contains the array {"a", "b", "c"}
I'd like the query to return 3 rows, one for each of these elements.
Any idea if this is possible?
Not sure exactly what you want, but this should get you in the ball park
SELECT f.my_field[i] AS value,
(array['I am one','I am two','I am three'])[i] AS description
FROM my_func(foo) f
JOIN generate_series(1,3) i ON 1=1
Postgres 8.4 will has an unpack() function to convert an array to a set. Pavel has a write up about doing this in 8.3 and lower here:
http://www.postgres.cz/index.php/PostgreSQL_SQL_Tricks#Array_to_table
I've got some examples of pulling data ouf of arrays here:
http://scottrbailey.wordpress.com/2009/05/20/etl-with-postgres-arrays/
Postgres User wrote: > Hi, > > I'd writing a query against a function (pg_proc) that contains 2 > fields of an array type. Ideally, I'd like to select 1 row from the > table, but return a query row for each item in the array. > > For example, if one row contains the array {"a", "b", "c"} > I'd like the query to return 3 rows, one for each of these elements. > > Any idea if this is possible? > > Thanks. > No matter how you create your sub query results, you still have to create a sub result record by record. Perhaps the following helps: Note that the input parameter is not an array but a string that looks like an array ------------------------------------------------------------------ create or replace function convert_to_query(p_array varchar) returns setof record as $$ declare result record; begin return query select data.idx[enumerator.counter]::varchar from generate_series(1,array_upper(string_to_array(p_array,','),1)) as enumerator(counter), string_to_array(p_array,',') as data(idx); end; $$ language plpgsql; select result.field1 from convert_to_query('a,c,b,d,e,f') as result(field1 varchar); ----------------------------------------------------------- -- Regards, Gevik
I needed to do this just the other day, here's what I did:
create or replace function explode(_a anyarray) returns setof anyelement as $$
begin
for i in array_lower(_a,1) .. array_upper(_a,1) loop
return next _a[i];
end loop;
return;
end;
$$ language plpgsql strict immutable;
select * from explode((select col from table where id = whatever)) as Exploded;
On 29/05/2009, at 6:52 PM, artacus@comcast.net wrote:
----- Original Message -----
From: "Postgres User" <postgres.developer@gmail.com>
To: "pgsql-general" <pgsql-general@postgresql.org>
Sent: Friday, May 29, 2009 12:21:11 AM GMT -08:00 Tijuana / Baja California
Subject: [GENERAL] Converting each item in array to a query result row
Hi,
I'd writing a query against a function (pg_proc) that contains 2
fields of an array type. Ideally, I'd like to select 1 row from the
table, but return a query row for each item in the array.
For example, if one row contains the array {"a", "b", "c"}
I'd like the query to return 3 rows, one for each of these elements.
Any idea if this is possible?
Not sure exactly what you want, but this should get you in the ball parkSELECT f.my_field[i] AS value,
(array['I am one','I am two','I am three'])[i] AS description
FROM my_func(foo) f
JOIN generate_series(1,3) i ON 1=1Postgres 8.4 will has an unpack() function to convert an array to a set. Pavel has a write up about doing this in 8.3 and lower here:I've got some examples of pulling data ouf of arrays here:
why complicate so much ? this is a single sql query. It really shocks me, how people easily lean on plpgsql and for/loops - which are times slower than simple sql query: CREATE FUNCTION unnest(anyarray) RETURNS SETOF anyelement AS $_$ SELECT ($1)[i] FROM generate_series(array_lower($1,1),array_upper($1,1)) i; $_$ LANGUAGE sql IMMUTABLE;
Always test your performance assumptions. The plpgsql function is faster than the sql function, a lot faster on smaller arrays. unnest - 10 element array - 100,000 times: 6701.746 ms unnest - 100 element array - 100,000 times: 11847.933 ms unnest - 1000 element array - 100,000 times: 59472.691 ms explode - 10 element array - 100,000 times: 1941.942 ms explode - 100 element array - 100,000 times: 8521.289 ms explode - 1000 element array - 100,000 times: 44980.048 ms On 29/05/2009, at 8:55 PM, Grzegorz Jaśkiewicz wrote: > why complicate so much ? this is a single sql query. It really shocks > me, how people easily lean on plpgsql and for/loops - which are times > slower than simple sql query: > > CREATE FUNCTION unnest(anyarray) RETURNS SETOF anyelement AS > $_$ > SELECT ($1)[i] FROM > generate_series(array_lower($1,1),array_upper($1,1)) i; > $_$ > LANGUAGE sql IMMUTABLE;
2009/5/29 Adam Ruth <adamruth@mac.com>: > Always test your performance assumptions. The plpgsql function is faster > than the sql function, a lot faster on smaller arrays. > > unnest - 10 element array - 100,000 times: 6701.746 ms > unnest - 100 element array - 100,000 times: 11847.933 ms > unnest - 1000 element array - 100,000 times: 59472.691 ms > > explode - 10 element array - 100,000 times: 1941.942 ms > explode - 100 element array - 100,000 times: 8521.289 ms > explode - 1000 element array - 100,000 times: 44980.048 ms > o.O -- GJ
2009/5/29 Grzegorz Jaśkiewicz <gryzman@gmail.com>: > 2009/5/29 Adam Ruth <adamruth@mac.com>: >> Always test your performance assumptions. The plpgsql function is faster >> than the sql function, a lot faster on smaller arrays. >> >> unnest - 10 element array - 100,000 times: 6701.746 ms >> unnest - 100 element array - 100,000 times: 11847.933 ms >> unnest - 1000 element array - 100,000 times: 59472.691 ms >> >> explode - 10 element array - 100,000 times: 1941.942 ms >> explode - 100 element array - 100,000 times: 8521.289 ms >> explode - 1000 element array - 100,000 times: 44980.048 ms >> > Version 8.4 has an internal function unnest(anyarray): http://developer.postgresql.org/pgdocs/postgres/functions-array.html Osvaldo
Adam Ruth <adamruth@mac.com> writes: > I needed to do this just the other day, here's what I did: > create or replace function explode(_a anyarray) returns setof > anyelement as $$ You'd be well advised to call this unnest(), because that's what the equivalent built-in function in 8.4 is called ;-) regards, tom lane
CREATE OR REPLACE FUNCTION explode(_a anyarray) returns setof anyelement AS $_$ BEGIN RAISE NOTICE 'poof... just kidding... kabooom'; RETURN unnest(_a); END; $_$ LANGUAGE 'plpgsql' ;] seriously, the 8.4 version is written in C, so I will be really surprised if it under performs plpgsql one .) -- GJ
Adam Ruth <adamruth@mac.com> writes: > Always test your performance assumptions. The plpgsql function is > faster than the sql function, a lot faster on smaller arrays. And, of course, it also pays to be precise about what you're testing and on what. Set-returning SQL functions got a lot faster in 8.4. Using CVS HEAD on a not-very-fast machine, I get these timings for the attached script (10000 loop iterations in all cases) 10 elements 100 elements 1000 elements built-in unnest 2.44 6.52 47.96 SQL function 2.52 6.50 46.71 plpgsql function 3.63 12.47 101.68 So at least in this specific test condition, there's not much perceptible difference between the SQL function and the builtin, while plpgsql lags behind. regards, tom lane create or replace function testit(n int, l int) returns float8 as $$ declare arr int[]; st timestamptz; et timestamptz; begin arr := '{}'; for i in 1 .. n loop arr[i] = i; end loop; st := clock_timestamp(); for i in 1 .. l loop perform count(*) from unnest(arr); -- or unnest_sql or unnest_plpgsql end loop; et := clock_timestamp(); return extract(epoch from et - st); end $$ language plpgsql; CREATE or replace FUNCTION unnest_sql(anyarray) RETURNS SETOF anyelement AS $_$ SELECT ($1)[i] FROM generate_series(array_lower($1,1),array_upper($1,1)) i; $_$ LANGUAGE sql IMMUTABLE; create or replace function unnest_plpgsql(_a anyarray) returns setof anyelement as $$ begin for i in array_lower(_a,1) .. array_upper(_a,1) loop return next _a[i]; end loop; return; end; $$ language plpgsql strict immutable;
Good point, I should have specified 8.3.7. Just one more reason to anxiously anticipate upgrading to 8.4. On 30/05/2009, at 2:56 AM, Tom Lane wrote: > Adam Ruth <adamruth@mac.com> writes: >> Always test your performance assumptions. The plpgsql function is >> faster than the sql function, a lot faster on smaller arrays. > > And, of course, it also pays to be precise about what you're testing > and on what. Set-returning SQL functions got a lot faster in 8.4. > Using CVS HEAD on a not-very-fast machine, I get these timings for > the attached script (10000 loop iterations in all cases) > > 10 elements 100 elements 1000 elements > > built-in unnest 2.44 6.52 47.96 > SQL function 2.52 6.50 46.71 > plpgsql function 3.63 12.47 101.68 > > So at least in this specific test condition, there's not much > perceptible difference between the SQL function and the builtin, > while plpgsql lags behind. > > regards, tom lane > > > create or replace function testit(n int, l int) returns float8 as $$ > declare arr int[]; > st timestamptz; > et timestamptz; > begin > arr := '{}'; > for i in 1 .. n loop > arr[i] = i; > end loop; > st := clock_timestamp(); > for i in 1 .. l loop > perform count(*) from unnest(arr); -- or unnest_sql or > unnest_plpgsql > end loop; > et := clock_timestamp(); > return extract(epoch from et - st); > end $$ language plpgsql; > > CREATE or replace FUNCTION unnest_sql(anyarray) RETURNS SETOF > anyelement AS > $_$ > SELECT ($1)[i] FROM > generate_series(array_lower($1,1),array_upper($1,1)) i; > $_$ > LANGUAGE sql IMMUTABLE; > > create or replace function unnest_plpgsql(_a anyarray) returns setof > anyelement as $$ > begin > for i in array_lower(_a,1) .. array_upper(_a,1) loop > return next _a[i]; > end loop; > return; > end; > $$ language plpgsql strict immutable;
Thanks for all the replies. I'm going to post the results of using the recommended approach in another thread. On Fri, May 29, 2009 at 1:18 PM, Adam Ruth <adamruth@mac.com> wrote: > Good point, I should have specified 8.3.7. > > Just one more reason to anxiously anticipate upgrading to 8.4. > > > > On 30/05/2009, at 2:56 AM, Tom Lane wrote: > >> Adam Ruth <adamruth@mac.com> writes: >>> >>> Always test your performance assumptions. The plpgsql function is >>> faster than the sql function, a lot faster on smaller arrays. >> >> And, of course, it also pays to be precise about what you're testing >> and on what. Set-returning SQL functions got a lot faster in 8.4. >> Using CVS HEAD on a not-very-fast machine, I get these timings for >> the attached script (10000 loop iterations in all cases) >> >> 10 elements 100 elements 1000 elements >> >> built-in unnest 2.44 6.52 47.96 >> SQL function 2.52 6.50 46.71 >> plpgsql function 3.63 12.47 101.68 >> >> So at least in this specific test condition, there's not much >> perceptible difference between the SQL function and the builtin, >> while plpgsql lags behind. >> >> regards, tom lane >> >> >> create or replace function testit(n int, l int) returns float8 as $$ >> declare arr int[]; >> st timestamptz; >> et timestamptz; >> begin >> arr := '{}'; >> for i in 1 .. n loop >> arr[i] = i; >> end loop; >> st := clock_timestamp(); >> for i in 1 .. l loop >> perform count(*) from unnest(arr); -- or unnest_sql or unnest_plpgsql >> end loop; >> et := clock_timestamp(); >> return extract(epoch from et - st); >> end $$ language plpgsql; >> >> CREATE or replace FUNCTION unnest_sql(anyarray) RETURNS SETOF anyelement >> AS >> $_$ >> SELECT ($1)[i] FROM generate_series(array_lower($1,1),array_upper($1,1)) >> i; >> $_$ >> LANGUAGE sql IMMUTABLE; >> >> create or replace function unnest_plpgsql(_a anyarray) returns setof >> anyelement as $$ >> begin >> for i in array_lower(_a,1) .. array_upper(_a,1) loop >> return next _a[i]; >> end loop; >> return; >> end; >> $$ language plpgsql strict immutable; > >