multiple joins + Order by + LIMIT query performance issue - Mailing list pgsql-performance
From | Antoine Baudoux |
---|---|
Subject | multiple joins + Order by + LIMIT query performance issue |
Date | |
Msg-id | 938EC908-FB9A-4BFA-950A-053BE3C87085@taktik.be Whole thread Raw |
Responses |
Re: multiple joins + Order by + LIMIT query performance issue
Re: multiple joins + Order by + LIMIT query performance issue |
List | pgsql-performance |
Hello, I have a query that runs for hours when joining 4 tables but takes milliseconds when joining one MORE table to the query. I have One big table, t_event (8 million rows) and 4 small tables (t_network,t_system,t_service, t_interface, all < 1000 rows). This query takes a few milliseconds : [code] select * from t_Event event inner join t_Service service on event.service_id=service.id inner join t_System system on service.system_id=system.id inner join t_Interface interface on system.id=interface.system_id inner join t_Network network on interface.network_id=network.id where (network.customer_id=1) order by event.c_date desc limit 25 "Limit (cost=23981.18..23981.18 rows=1 width=977)" " -> Sort (cost=23981.18..23981.18 rows=1 width=977)" " Sort Key: this_.c_date" " -> Nested Loop (cost=0.00..23981.17 rows=1 width=977)" " -> Nested Loop (cost=0.00..23974.89 rows=1 width=961)" " -> Nested Loop (cost=0.00..191.42 rows=1 width=616)" " Join Filter: (service_s3_.system_id = service1_.system_id)" " -> Nested Loop (cost=0.00..9.29 rows=1 width=576)" " -> Seq Scan on t_network service_s4_ (cost=0.00..1.01 rows=1 width=18)" " Filter: (customer_id = 1)" " -> Index Scan using interface_network_id_idx on t_interface service_s3_ (cost=0.00..8.27 rows=1 width=558)" " Index Cond: (service_s3_.network_id = service_s4_.id)" " -> Seq Scan on t_service service1_ (cost=0.00..109.28 rows=5828 width=40)" " -> Index Scan using event_svc_id_idx on t_event this_ (cost=0.00..23681.12 rows=8188 width=345)" " Index Cond: (this_.service_id = service1_.id)" " -> Index Scan using t_system_pkey on t_system service_s2_ (cost=0.00..6.27 rows=1 width=16)" " Index Cond: (service_s2_.id = service1_.system_id)" [/code] This one takes HOURS, but I'm joining one table LESS : [code] select * from t_Event event inner join t_Service service on event.service_id=service.id inner join t_System system on service.system_id=system.id inner join t_Interface interface on system.id=interface.system_id where (interface.network_id=1) order by event.c_date desc limit 25 "Limit (cost=147.79..2123.66 rows=10 width=959)" " -> Nested Loop (cost=147.79..2601774.46 rows=13167 width=959)" " Join Filter: (service1_.id = this_.service_id)" " -> Index Scan Backward using event_date_idx on t_event this_ (cost=0.00..887080.22 rows=8466896 width=345)" " -> Materialize (cost=147.79..147.88 rows=9 width=614)" " -> Hash Join (cost=16.56..147.79 rows=9 width=614)" " Hash Cond: (service1_.system_id = service_s2_.id)" " -> Seq Scan on t_service service1_ (cost=0.00..109.28 rows=5828 width=40)" " -> Hash (cost=16.55..16.55 rows=1 width=574)" " -> Nested Loop (cost=0.00..16.55 rows=1 width=574)" " -> Index Scan using interface_network_id_idx on t_interface service_s3_ (cost=0.00..8.27 rows=1 width=558)" " Index Cond: (network_id = 1)" " -> Index Scan using t_system_pkey on t_system service_s2_ (cost=0.00..8.27 rows=1 width=16)" " Index Cond: (service_s2_.id = service_s3_.system_id)" [/code] My understanding is that in the first case the sort is done after all the table joins and filtering, but in the second case ALL the rows in t_event are scanned and sorted before the join. There is an index on the sorting column. If I remove this index, the query runs very fast. But I still need this index for other queries.So I must force the planner to do the sort after the join, in the second case. How can i do that? Thanks a lot for your help, Antoine
pgsql-performance by date: