Thread: New array functions
So where are the new array functions and syntaces documented? Specifically I want to know how to replace my int_array_aggregate(int) and int_array_enum(_int) calls. And how to replace my "arr *= n" calls too. I think these are supposed be "ALL my_array" and "n = ANY myarray" or something like that? -- greg
Greg Stark wrote: > So where are the new array functions and syntaces documented? Mainly here: http://developer.postgresql.org/docs/postgres/sql-expressions.html#SQL-SYNTAX-ARRAY-CONSTRUCTORS http://developer.postgresql.org/docs/postgres/arrays.html http://developer.postgresql.org/docs/postgres/functions-array.html http://developer.postgresql.org/docs/postgres/functions-comparisons.html#AEN12154 > Specifically I want to know how to replace my int_array_aggregate(int) and > int_array_enum(_int) calls. I have no idea what those are -- are they from contrib? You can create an aggregate to turn arbitrary datatype elements into arrays like this: CREATE AGGREGATE array_aggregate ( BASETYPE = anyelement, SFUNC = array_append, STYPE = anyarray, INITCOND = '{}' ); -- silly example, but what the heck ;-) regression=# select attrelid, array_aggregate(attnum) from pg_attribute where attnum > 0 and attnum < 5 group by attrelid limit 3; attrelid | array_aggregate ----------+----------------- 16639 | {1} 16638 | {1} 17022 | {1,2,3,4} (3 rows) If int_array_enum() is supposed to take '{1,2,3}' and produce three rows, that function was proposed but rejected. Subsequently Peter Eisentraut pointed out a SQL99 syntax that does this, but I did not get it done for 7.4. Perhaps for 7.5. > And how to replace my "arr *= n" calls too. See: http://developer.postgresql.org/docs/postgres/functions-comparisons.html#AEN12154 regression=# SELECT g.grosysid, g.groname, s.usesysid, s.usename FROM pg_shadow s, pg_group g WHERE s.usesysid = any (g.grolist); grosysid | groname | usesysid | usename ----------+---------+----------+---------- 102 | admins | 1 | postgres 100 | grp1 | 100 | user1 101 | grp2 | 100 | user1 100 | grp1 | 101 | user2 100 | grp1 | 102 | user3 101 | grp2 | 102 | user3 102 | admins | 103 | john (7 rows) HTH, Joe
Joe Conway <mail@joeconway.com> writes: > Greg Stark wrote: > > So where are the new array functions and syntaces documented? > > Mainly here: > http://developer.postgresql.org/docs/postgres/arrays.html excellent. thank you. > > Specifically I want to know how to replace my int_array_aggregate(int) and > > int_array_enum(_int) calls. > > I have no idea what those are -- are they from contrib? indeed in contrib/intagg > You can create an aggregate to turn arbitrary datatype elements into arrays > like this: > > CREATE AGGREGATE array_aggregate > ( > BASETYPE = anyelement, > SFUNC = array_append, > STYPE = anyarray, > INITCOND = '{}' > ); Hm, perhaps there should be a standard name for this, rather than have everyone's code do their own thing. > If int_array_enum() is supposed to take '{1,2,3}' and produce three rows, that > function was proposed but rejected. Subsequently Peter Eisentraut pointed out a > SQL99 syntax that does this, but I did not get it done for 7.4. Perhaps for 7.5. That's exactly what it does. Hm, I guess I misinterpreted that post. Hm I have some work to do. Thanks. And thanks a LOT for doing the work, it'll makes a big difference and make arrays much more practical to use. -- greg
Joe Conway <mail@joeconway.com> writes: > Greg Stark wrote: >> And how to replace my "arr *= n" calls too. > See: > http://developer.postgresql.org/docs/postgres/functions-comparisons.html#AEN12154 That reminds me --- contrib/array is definitely obsolete now, and there may be parts of the other contrib array- and aggregate-related modules that are obsoleted by Joe's recent work. I would like to kill contrib/array for 7.4, because it's one of the GPL'd contrib modules that I was tasked to get rid of some time ago. What I'm thinking of doing is removing the code, and replacing the README with a note explaining how to convert contrib/array queries to use the new mainstream syntaxes. That will give contrib/array users a clue what they're supposed to do. In a release or three the README could go away too. Comments, objections? Also, does anyone want to look for possible dead code in intagg and so on? regards, tom lane
Tom Lane wrote: > What I'm thinking of doing is removing the code, and replacing the > README with a note explaining how to convert contrib/array queries to > use the new mainstream syntaxes. That will give contrib/array users > a clue what they're supposed to do. In a release or three the README > could go away too. I have no objection to removing it now, but previously I think you agreed with Bruce's comment that we should leave it intact (but deprecated) for 7.4, and remove in 7.5. > Also, does anyone want to look for possible dead code in intagg and > so on? I did a quick review back in July. IIRC, intagg could be functionally replaced with the aggregate definition that I posted, except that intagg is probably a fair bit better performance (I didn't actually test), in that it accumulates the array in backend memory and just pushes pointers around as int4's. I've thought that a safer implementation would be needed to fold it into the backend (maybe using hashes keyed with the pointer?), but in any case that's a 7.5 thing. Also IIRC there were some functions in intarray that overlap the new backend functionality, but much of it (i.e. using GIST to index into large arrays) is not. I'll try to review them again and make a recommendation in the next couple of days, but it might be a stretch because I'm trying to tie up lots of loose ends in preparation for a trip next week. Joe
Joe Conway <mail@joeconway.com> writes: > I have no objection to removing it now, but previously I think you > agreed with Bruce's comment that we should leave it intact (but > deprecated) for 7.4, and remove in 7.5. Did we discuss this already? I'd forgotten. In any case, the module isn't visibly deprecated at the moment. If the idea is to avoid blindsiding its users, then we definitely must mark it as slated for removal, and provide some docs about how to replace it. regards, tom lane
Hannu Krosing wrote: > Any idea of performance - is this array_aggregate(anyelement) faster, > slower or about same than int_array_aggregate(int) ? I haven't tested, but I'd guess for an array of any significant length int_array_aggregate() is faster (see my other post). That's one of the reasons I haven't advocated deprecating intagg yet. Joe
Joe Conway kirjutas N, 28.08.2003 kell 21:15: > Greg Stark wrote: > > Specifically I want to know how to replace my int_array_aggregate(int) and > > int_array_enum(_int) calls. > > I have no idea what those are -- are they from contrib? > > You can create an aggregate to turn arbitrary datatype elements into > arrays like this: > > CREATE AGGREGATE array_aggregate > ( > BASETYPE = anyelement, > SFUNC = array_append, > STYPE = anyarray, > INITCOND = '{}' > ); Any idea of performance - is this array_aggregate(anyelement) faster, slower or about same than int_array_aggregate(int) ? > If int_array_enum() is supposed to take '{1,2,3}' and produce three > rows, that function was proposed but rejected. Subsequently Peter > Eisentraut pointed out a SQL99 syntax that does this, but I did not get > it done for 7.4. Perhaps for 7.5. So we got to keep intagg at least until 7.5 ... ----------- Hannu
> See: > http://developer.postgresql.org/docs/postgres/functions-comparisons.html#AEN12154 > > regression=# SELECT g.grosysid, g.groname, s.usesysid, s.usename FROM pg_shadow > s, pg_group g WHERE s.usesysid = any (g.grolist); These forms below are all equivalent, right? If so ideally they would all be converted to an equivalent form and therefore produce the same plan. I guess I'm wishing for a pony though. But I think currently I'm stuck with the worst of these and I don't see any way of escaping to the better plans. Incidentally, "HashAggregate"?! Based on the earlier discussion on this I would have expected that line to read "Materialize" slo=> explain select * from store_location where store_location_id in (1,2,3); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------Index Scanusing store_location_pkey, store_location_pkey, store_location_pkey on store_location (cost=0.00..17.74 rows=3 width=523) Index Cond: ((store_location_id = 1) OR (store_location_id = 2) OR (store_location_id = 3)) (2 rows) slo=> explain select * from store_location where store_location_id in (select 1 union all select 2 union all select 3); QUERY PLAN --------------------------------------------------------------------------------------------------Nested Loop (cost=0.10..17.86rows=3 width=523) -> HashAggregate (cost=0.10..0.10 rows=3 width=4) -> Subquery Scan "IN_subquery" (cost=0.00..0.09 rows=3 width=4) -> Append (cost=0.00..0.06 rows=3 width=0) -> Subquery Scan "*SELECT* 1" (cost=0.00..0.02 rows=1 width=0) -> Result (cost=0.00..0.01rows=1 width=0) -> Subquery Scan "*SELECT* 2" (cost=0.00..0.02 rows=1 width=0) -> Result (cost=0.00..0.01 rows=1 width=0) -> Subquery Scan "*SELECT* 3" (cost=0.00..0.02rows=1 width=0) -> Result (cost=0.00..0.01 rows=1 width=0) -> Index Scan usingstore_location_pkey on store_location (cost=0.00..5.91 rows=1 width=523) Index Cond: (store_location.store_location_id= "outer"."?column?") (12 rows) slo=> explain select * from store_location where store_location_id = any (array[1,2,3]); QUERYPLAN ---------------------------------------------------------------------Seq Scan on store_location (cost=0.00..825.75 rows=5954width=523) Filter: (store_location_id = ANY ('{1,2,3}'::integer[])) (2 rows) -- greg
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > >>I have no objection to removing it now, but previously I think you >>agreed with Bruce's comment that we should leave it intact (but >>deprecated) for 7.4, and remove in 7.5. > > Did we discuss this already? I'd forgotten. > > In any case, the module isn't visibly deprecated at the moment. > If the idea is to avoid blindsiding its users, then we definitely > must mark it as slated for removal, and provide some docs about > how to replace it. > I can't find it in the archives for some reason, but here was the exchange: Tom Lane wrote:> Bruce Momjian <pgman@candle.pha.pa.us> writes:>>>Joe Conway wrote:>>>>>I do agree that it makes contrib/arrayunnecessary. I was going to>>>suggest we remove that if this was committed.>>>Good idea.>> We could do that,but it might be more friendly to just mark it as> deprecated for one release cycle before zapping it. That'd give>people who use it some time to convert over. So I guess since it was actually you who objected, you have the right to change your mind ;-) Joe
Greg Stark <gsstark@mit.edu> writes: > Incidentally, "HashAggregate"?! Based on the earlier discussion on this I > would have expected that line to read "Materialize" It's using a grouped aggregation node to implement a UNIQUE filter, so that it can replace the "WHERE foo IN (subselect)" by a straight join. Of course in this case the uniqueness filter is a waste of time, but in general the planner can't be expected to know that. regards, tom lane