Re: Very specialised query - Mailing list pgsql-performance
From | Matthew Wakeling |
---|---|
Subject | Re: Very specialised query |
Date | |
Msg-id | alpine.DEB.2.00.0903311757030.21772@aragorn.flymine.org Whole thread Raw |
In response to | Re: Very specialised query (Віталій Тимчишин <tivv00@gmail.com>) |
Responses |
Re: Very specialised query
|
List | pgsql-performance |
On Mon, 30 Mar 2009, Віталій Тимчишин wrote: > select > case when n == 1 then id1 else id2 end, > case when n == 2 then id1 else id2 end > > from ( > SELECT > l1.id AS id1, > l2.id AS id2 > FROM > location l1, > location l2 > WHERE > l1.objectid = 228000093 > AND l2.objectid = 228000093 > AND l1.id <> l2.id > AND l1.start < l2.end > AND l1.end > l2.start > AND l1.start < l2.start) a, (values (1),(2)) b(n) It is a nice idea. However, the planner gets the join the wrong way round: select distinct case when n = 1 then id1 else id2 end, case when n = 1 then id2 else id1 end FROM ( select l1.id AS id1, l2.id AS id2 FROM location l1, location l2 WHERE l1.id <> l2.id AND l1.objectid = l2.objectid AND l1.start <= l2.end AND l2.start <= l1.end AND l1.start <= l2.start ) AS a, (values (1), (2)) b(n); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Unique (cost=7366497963.75..7637346831.94 rows=36113182426 width=12) (actual time=1642178.623..2206678.691 rows=139606782 loops=1) -> Sort (cost=7366497963.75..7456780919.81 rows=36113182426 width=12) (actual time=1642178.619..1899057.147 rows=166377424 loops=1) Sort Key: (CASE WHEN ("*VALUES*".column1 = 1) THEN l1.subjectid ELSE l2.subjectid END), (CASE WHEN ("*VALUES*".column1= 1) THEN l2.subjectid ELSE l1.subjectid END) Sort Method: external merge Disk: 3903272kB -> Nested Loop (cost=0.00..592890483.66 rows=36113182426 width=12) (actual time=85.333..984211.011 rows=166377424 loops=1) -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) (actual time=0.002..0.008 rows=2 loops=1) -> Nested Loop (cost=0.00..25596373.62 rows=18056591213 width=8) (actual time=42.684..322743.335 rows=83188712 loops=2) Join Filter: ((l1.subjectid <> l2.subjectid) AND (l1.intermine_start <= l2.intermine_end)) -> Seq Scan on location l1 (cost=0.00..78076.79 rows=3490079 width=16) (actual time=0.008..3629.672 rows=3490079 loops=2) -> Index Scan using location_test_obj_start on location l2 (cost=0.00..3.89 rows=152 width=16) (actual time=0.005..0.038 rows=25 loops=6980158) Index Cond: ((l2.objectid = l1.objectid) AND (l2.intermine_start <= l1.intermine_end) AND (l1.intermine_start<= l2.intermine_start)) Total runtime: 2339619.383 ms The outer nested join has the VALUES as the main loop, and the complicated join as the leaf. So, the complicated overlap-finding join gets run twice. Oh, there's also the great big sort and unique, but I think I can get rid of that. Matthew -- Contrary to popular belief, Unix is user friendly. It just happens to be very selective about who its friends are. -- Kyle Hearn
pgsql-performance by date: