Query becomes slow when written as view - Mailing list pgsql-general
From | Jan Strube |
---|---|
Subject | Query becomes slow when written as view |
Date | |
Msg-id | 511CE547.4040500@deriva.de Whole thread Raw |
Responses |
Re: Query becomes slow when written as view
|
List | pgsql-general |
Hi, I have the following query which runs reasonably fast under PostgreSQL=20 9.1.8: SELECT b."ISIN", CASE WHEN b."COMMENT" IS NOT NULL THEN b."COMMENT" WHEN cc."ISIN" IS NOT NULL THEN cc.comment ELSE get_comment(b."ISIN") END AS "COMMENT" FROM dtng."Z_BASE" b LEFT JOIN dtng.cached_comments cc on b."ISIN" =3D cc."ISIN" AND=20 cc.cache_time >=3D b._last_modified WHERE b."ISIN" IN (SELECT "ISIN" from dtng."Z_BASE" LIMIT 1) Here is the query plan: Nested Loop Left Join (cost=3D0.08..16.65 rows=3D1 width=3D1053) Join Filter: (cc.cache_time >=3D b._last_modified) -> Nested Loop (cost=3D0.08..8.67 rows=3D1 width=3D644) -> HashAggregate (cost=3D0.08..0.09 rows=3D1 width=3D13) -> Subquery Scan on "ANY_subquery" (cost=3D0.00..0.08 rows=3D1 width=3D13= ) -> Limit (cost=3D0.00..0.07 rows=3D1 width=3D13) -> Seq Scan on "Z_BASE" (cost=3D0.00..106515.68 rows=3D1637368 width=3D13= ) -> Index Scan using "Z_BASE_pkey" on "Z_BASE" b (cost=3D0.00..8.57 rows=3D= 1=20 width=3D644) Index Cond: (("ISIN")::bpchar =3D ("ANY_subquery"."ISIN")::bpchar) -> Index Scan using cached_comments_pkey on cached_comments cc=20 (cost=3D0.00..7.71 rows=3D1 width=3D425) Index Cond: ((b."ISIN")::bpchar =3D ("ISIN")::bpchar) When I=C2=B4m trying to put this into a view, it becomes extremely slow: CREATE VIEW export_comments AS SELECT b."ISIN", CASE WHEN b."COMMENT" IS NOT NULL THEN b."COMMENT" WHEN cc."ISIN" IS NOT NULL THEN cc.comment ELSE get_comment(b."ISIN") END AS "COMMENT" FROM dtng."Z_BASE" b LEFT JOIN dtng.cached_comments cc on b."ISIN" =3D cc."ISIN" AND=20 cc.cache_time >=3D b._last_modified SELECT * FROM export_comments WHERE "ISIN" IN (SELECT "ISIN" FROM dtng."Z_BASE" LIMIT 1) The query plan now is: Hash Join (cost=3D79926.52..906644.87 rows=3D818684 width=3D45) Hash Cond: ((b."ISIN")::bpchar =3D ("ANY_subquery"."ISIN")::bpchar) -> Hash Left Join (cost=3D79926.42..884049.08 rows=3D1637368 width=3D1053= ) Hash Cond: ((b."ISIN")::bpchar =3D (cc."ISIN")::bpchar) Join Filter: (cc.cache_time >=3D b._last_modified) -> Seq Scan on "Z_BASE" b (cost=3D0.00..106515.68 rows=3D1637368 width=3D= 644) -> Hash (cost=3D74620.41..74620.41 rows=3D77841 width=3D425) -> Seq Scan on cached_comments cc (cost=3D0.00..74620.41 rows=3D77841 wid= th=3D425) -> Hash (cost=3D0.09..0.09 rows=3D1 width=3D13) -> HashAggregate (cost=3D0.08..0.09 rows=3D1 width=3D13) -> Subquery Scan on "ANY_subquery" (cost=3D0.00..0.08 rows=3D1 width=3D13= ) -> Limit (cost=3D0.00..0.07 rows=3D1 width=3D13) -> Seq Scan on "Z_BASE" (cost=3D0.00..106515.68 rows=3D1637368 width=3D13= ) By the way I get the same behaviour and query plan when I try this: SELECT * FROM ( -- above view definition ) x WHERE x."ISIN" IN (SELECT "ISIN" FROM dtng."Z_BASE" LIMIT 1) We already found out that the problem is the Perl function "get_comment"=20 which is very expensive. In the first case the function is called at=20 most once, but in the second case it is called many times. I believe=20 this is because of the hash join which causes the view to fetch=20 everything from dtng."Z_BASE" first? The question is, how to avoid this? We tried to set the functions cost=20 from 100 to 10000000 but that did not help. (Because of the architecture=20 of the software that uses this query, we have the constraint that=20 structure of the final WHERE clause (WHERE "ISIN" IN (...)) must not be=20 altered.) Thanks a lot for any idea, Jan
pgsql-general by date: