index on custom function; explain - Mailing list pgsql-performance
From | jan.aerts@bbsrc.ac.uk |
---|---|
Subject | index on custom function; explain |
Date | |
Msg-id | 1128352451.312090.37640@g44g2000cwa.googlegroups.com Whole thread Raw |
Responses |
Re: index on custom function; explain
|
List | pgsql-performance |
Hi, I'm trying to include a custom function in my SQL-queries, which unfortunately leaves the server hanging... I basically search through two tables: * TABLE_MAPPING: lists that 'abc' is mapped to 'def' id1 | name1 | id2 | name2 ------------------------- 1 | abc | 2 | def 3 | uvw | 4 | xyz This data means that 'abc' is mapped_to 'def', and 'uvw' is mapped_to 'xyz'. About 1,500,000 records in total. * TABLE ALIASES: lists different aliases of the same thing id1 | name1 | id2 | name2 ------------------------- 3 | uvw | 2 | def This data means that 'uvw' and 'def' are essentially the same thing. About 820,000 records in total. I have indexes on all columns of the above tables. Based on the two tables above, 'abc' is indirectly mapped_to 'xyz' as well (through 'def' also-known-as 'uvw'). I wrote this little function: aliases_of CREATE FUNCTION aliases_of(INTEGER) RETURNS SETOF integer AS 'SELECT $1 UNION SELECT id1 FROM aliases WHERE id2 = $1 UNION SELECT id2 FROM aliases WHERE id1 = $1 ' LANGUAGE SQL STABLE; A simple SELECT aliases_of(2) shows: aliases_of ---------- 2 3 Now, when I want to traverse the aliases, I would write a query as follows: SELECT m1.name1, m1.name2, m2.name1, m2.name2 FROM mappings m1, mappings m2 WHERE m1.name1 = 'abc' AND m2.name1 IN (SELECT aliases_of(m1.name2)); Unfortunately, this query seems to keep running and to never stop... An EXPLAIN of the above query shows: QUERY PLAN ----------------------------------------------------------------------------- Nested Loop (cost=0.00..118379.45 rows=1384837 width=80) Join Filter: (subplan) -> Index Scan using ind_cmappings_object1_id on c_mappings m1 (cost=0.00..7.08 rows=2 width=40) Index Cond: (name1 = 'abc') -> Seq Scan on c_mappings m2 (cost=0.00..35935.05 rows=1423805 width=40) SubPlan -> Result (cost=0.00..0.01 rows=1 width=0) (7 rows) Strangely enough, I _do_ get output when I type the following query: SELECT m1.name1, m1.name2, m2.name1, m2.name2 FROM mappings m1, mappings m2 WHERE m1.name1 = 'abc' AND m2.name1 IN ( SELECT m1.name2 UNION SELECT name2 FROM aliases WHERE name1 = m1.name2 UNION SELECT name1 FROM aliases WHERE name2 = m2.name1 ); The EXPLAIN for this query is: QUERY PLAN ------------------------------------------------------------------------------- Nested Loop (cost=0.00..36712030.90 rows=1384837 width=80) Join Filter: (subplan) -> Index Scan using ind_cmappings_object1_id on c_mappings m1 (cost=0.00..7.08 rows=2 width=40) Index Cond: (object1_id = 16575564) -> Seq Scan on c_mappings m2 (cost=0.00..35935.05 rows=1423805 width=40) SubPlan -> Unique (cost=13.21..13.23 rows=1 width=4) -> Sort (cost=13.21..13.22 rows=3 width=4) Sort Key: object2_id -> Append (cost=0.00..13.18 rows=3 width=4) -> Subquery Scan "*SELECT* 1" (cost=0.00..0.01 rows=1 width=0) -> Result (cost=0.00..0.01 rows=1 width=0) -> Subquery Scan "*SELECT* 2" (cost=0.00..5.92 rows=1 width=4) -> Index Scan using ind_caliases_object2_id on c_aliases (cost=0.00..5.92 rows=1 width=4) Index Cond: (object2_id = $0) -> Subquery Scan "*SELECT* 3" (cost=0.00..7.25 rows=1 width=4) -> Index Scan using ind_caliases_object1_id on c_aliases (cost=0.00..7.25 rows=1 width=4) Index Cond: (object1_id = $0) (18 rows) So my questions are: * Does anyone have any idea how I can integrate a function that lists all aliases for a given name into such a mapping query? * Does the STABLE keyword in the function definition make the function to read all its data into memory? * Is there a way to let postgres use an "Index scan" on that function instead of a "seq scan"? Any help very much appreciated, Jan Aerts
pgsql-performance by date: