3-table query optimization - Mailing list pgsql-performance
From | Michal Taborsky - Internet Mall |
---|---|
Subject | 3-table query optimization |
Date | |
Msg-id | 44D9EBEB.9010704@mall.cz Whole thread Raw |
Responses |
Re: 3-table query optimization
|
List | pgsql-performance |
Hello everyone. My (simplified) database structure is: a) table product (150000 rows) product_id BIGINT PRIMARY KEY title TEXT ... b) table action (5000 rows) action_id BIGINT PRIMARY KEY product_id BIGINT, FK to product shop_group_id INTEGER (there are about 5 groups, distributed about evenly) c) table product_program (500000 rows) program_id BIGINT (there are about 50 unique) product_id BIGINT, FK to product I need to query products, which are in action table of specific group and in product_program for a specific program_id. The query is taking too long to my liking My query is: SELECT product.product_id FROM action JOIN product ON (product.product_id=action.product_id) WHERE action.shop_group_id=1 AND EXISTS (SELECT 1 FROM catalog.product_program WHERE product_id=product.product_id AND product_program.program_id =1104322 ) QUERY PLAN Nested Loop (cost=0.00..18073.81 rows=1220 width=8) (actual time=10.153..2705.891 rows=636 loops=1) -> Seq Scan on "action" (cost=0.00..135.74 rows=2439 width=8) (actual time=8.108..36.684 rows=2406 loops=1) Filter: (shop_group_id = 1) -> Index Scan using product_pkey on product (cost=0.00..7.34 rows=1 width=8) (actual time=1.031..1.097 rows=0 loops=2406) Index Cond: ((product.product_id)::bigint = ("outer".product_id)::bigint) Filter: (subplan) SubPlan -> Index Scan using product_program_pkey on product_program (cost=0.00..4.33 rows=1 width=0) (actual time=0.455..0.455 rows=0 loops=2406) Index Cond: (((program_id)::bigint = 1104322) AND ((product_id)::bigint = ($0)::bigint)) Total runtime: 2708.575 ms I also tried this: SELECT product.product_id FROM action JOIN product ON (product.product_id=action.product_id) JOIN catalog.product_program ON ( product_program.product_id=product.product_id AND product_program.program_id =1104322) WHERE action.shop_group_id=1 With about the same results (a bit better, but for different groups it was vice versa): QUERY PLAN Nested Loop (cost=141.84..3494.91 rows=139 width=8) (actual time=118.584..1295.303 rows=636 loops=1) -> Hash Join (cost=141.84..2729.11 rows=253 width=16) (actual time=118.483..231.103 rows=636 loops=1) Hash Cond: (("outer".product_id)::bigint = ("inner".product_id)::bigint) -> Index Scan using product_program_pkey on product_program (cost=0.00..2470.04 rows=7647 width=8) (actual time=0.047..73.514 rows=7468 loops=1) Index Cond: ((program_id)::bigint = 1104322) -> Hash (cost=135.74..135.74 rows=2439 width=8) (actual time=118.114..118.114 rows=0 loops=1) -> Seq Scan on "action" (cost=0.00..135.74 rows=2439 width=8) (actual time=0.019..106.864 rows=2406 loops=1) Filter: (shop_group_id = 1) -> Index Scan using product_pkey on product (cost=0.00..3.01 rows=1 width=8) (actual time=1.300..1.655 rows=1 loops=636) Index Cond: (("outer".product_id)::bigint = (product.product_id)::bigint) Any ideas if this is really the best I can expect, or is there something amiss there and my query is wrong for this type of task? My gut feeling tells me, that this kind of query should be a lot faster. The hardware is Dual Xeon with enough of RAM and other operations run just fine. Thank you. -- Michal Táborský
pgsql-performance by date: