Thread: subselect instead of a view...
Create view WLE_pathnames as SELECT E.name, EP.pathname FROM element E, element_pathnames EP, watch_list_element WLE WHEREWLE.watch_list_id = 3724 and WLE.element_id = E.id and E.id = EP.id; name | pathname -------------+-----------------------------euchre | /ports/games/euchrereallyslick | /ports/graphics/reallyslick The above query is pretty well optimized: Nested Loop (cost=0.00..647.08 rows=62 width=61) (actual time=0.99..1.19 rows=2 loops=1) -> Nested Loop (cost=0.00..437.06 rows=62 width=20) (actual time=0.66..0.78 rows=2 loops=1) -> Index Scan using watch_list_element_pkey on watch_list_element wle (cost=0.00..229.64 rows=62 width=4) (actual time=0.34..0.36 rows=2 loops=1) -> Index Scan using element_pkey on element e (cost=0.00..3.34 rows=1 width=16) (actual time=0.16..0.17 rows=1 loops=2) -> Index Scan using element_pathnames_pkey on element_pathnames ep (cost=0.00..3.38 rows=1 width=41) (actual time=0.16..0.17rows=1 loops=2) Total runtime: 1.44 msec Now I want to get all the stuff from element_pathnames like pathname || '/%'. Does that make sense? Essentially, I want this using the above view: explain analyzeSELECT E.id, CLE.commit_log_id, E.name, EP.pathname FROM element E, element_pathnames EP, commit_log_elements CLE, WLE_pathnames WLEP WHERE E.id = EP.id AND EP.pathname like WLEP.pathname || '/%' AND CLE.element_id = E.id ORDER BY EP.pathname; I know this can be done better, I just can't figure out how. I keep thinking of a subselect but I'm totally blocked. It must be bed time. Sort (cost=285579.85..285579.85 rows=67012 width=114) (actual time=9463.95..9464.01 rows=11 loops=1) -> Hash Join (cost=264060.42..272748.13 rows=67012 width=114) (actual time=9154.69..9463.55 rows=11 loops=1) -> Seq Scan on commit_log_elements cle (cost=0.00..3936.75 rows=216575 width=8) (actual time=0.18..1762.38 rows=216575 loops=1) -> Hash (cost=263370.92..263370.92 rows=36997width=106) (actual time=5716.62..5716.62 rows=0 loops=1) -> Hash Join (cost=258032.99..263370.92 rows=36997 width=106) (actual time=5524.78..5695.47 rows=10 loops=1) -> Seq Scan on element e (cost=0.00..2286.70 rows=119570 width=16) (actual time=0.15..892.40 rows=119570 loops=1) -> Hash (cost=257416.50..257416.50rows=36997 width=90) (actual time=3481.05..3481.05 rows=0 loops=1) -> Nested Loop (cost=0.00..257416.50 rows=36997 width=90) (actual time=1847.01..3465.54 rows=10 loops=1) -> Nested Loop (cost=0.00..647.08 rows=62 width=49) (actual time=0.96..1.50 rows=2 loops=1) -> Nested Loop (cost=0.00..437.06 rows=62 width=8) (actual time=0.64..0.94 rows=2 loops=1) -> Index Scan using watch_list_element_pkey on watch_list_element wle (cost=0.00..229.64 rows=62 width=4) (actual time=0.34..0.37 rows=2 loops=1) -> Index Scan using element_pkey on element e (cost=0.00..3.34 rows=1 width=4) (actual time=0.21..0.22 rows=1 loops=2) -> Index Scan using element_pathnames_pkey on element_pathnames ep (cost=0.00..3.38 rows=1 width=41) (actual time=0.21..0.23 rows=1 loops=2) -> Seq Scan on element_pathnamesep (cost=0.00..2355.70 rows=119570 width=41) (actual time=0.08..858.74 rows=119570 loops=2) Total runtime: 9464.51 msec Clues please? -- Dan Langille : http://www.langille.org/
On 25 Nov 2002 at 22:15, Dan Langille wrote: > I know this can be done better, I just can't figure out how. I keep > thinking of a subselect but I'm totally blocked. It must be bed time. It's odd what reading the paper, relaxing with a book, and then trying to sleep can generate. There I was, almost dropping off, when I realised I needed this: SELECT E.name, EP.pathname, E.id, EP2.pathname, CL.id FROM element E, element_pathnames EP, watch_list_element WLE, element_pathnames EP2, element E2, commit_log_elements CLE, commit_log CL WHERE WLE.watch_list_id = 3724 and WLE.element_id = E.id and E.id = EP.id andEP2.pathname like EP.pathname || '/%' AND EP2.id = E2.id AND E2.id = CLE.element_id ANDCLE.commit_log_id = CL.id; I am still suspicous of that like. It seems to be the performance killer here. There is an index which can be used: # explain select * from element_pathnames WHERE pathname like 'abc%'; NOTICE: QUERY PLAN: Index Scan using element_pathnames_pathname on element_pathnames (cost=0.00..5.80 rows=1 width=41) But in the main query, it doesn't get picked up. The explain appears below (and at http://www.freshports.org/tmp/explain.txt which will be easier to read than this text-wrapped version). There are quite a few sequential scans there. I'm confused as to why the indexes are not being used. A "vacuum analyze" has been run. Thanks. Hash Join (cost=266574.28..279596.82 rows=67012 width=118) -> Hash Join (cost=263685.03..272372.74 rows=67012 width=114) -> Seq Scan on commit_log_elements cle (cost=0.00..3936.75 rows=216575 width=8) -> Hash (cost=262995.54..262995.54 rows=36997 width=106) -> Hash Join (cost=2994.62..262995.54rows=36997 width=106) -> Nested Loop (cost=0.00..257416.50 rows=36997 width=102) -> Nested Loop (cost=0.00..647.08 rows=62 width=61) -> Nested Loop (cost=0.00..437.06 rows=62 width=20) -> Index Scan using watch_list_element_pkey on watch_list_element wle (cost=0.00..229.64 rows=62 width=4) -> Index Scan using element_pkey on element e (cost=0.00..3.34 rows=1 width=16) -> Index Scan using element_pathnames_pkey on element_pathnames ep (cost=0.00..3.38 rows=1 width=41) -> Seq Scan on element_pathnames ep2 (cost=0.00..2355.70 rows=119570 width=41) -> Hash (cost=2286.70..2286.70 rows=119570 width=4) -> Seq Scan on element e2 (cost=0.00..2286.70 rows=119570 width=4) -> Hash (cost=2543.20..2543.20 rows=58420 width=4) -> Seq Scan on commit_logcl (cost=0.00..2543.20 rows=58420 width=4) -- Dan Langille : http://www.langille.org/
"Dan Langille" <dan@langille.org> writes: > SELECT E.name, EP.pathname, E.id, EP2.pathname, CL.id > ... > and EP2.pathname like EP.pathname || '/%' > I am still suspicous of that like. It seems to be the performance > killer here. There is an index which can be used: It won't be, though. The LIKE-to-indexscan transformation happens at plan time, and that means it can only happen if the pattern is a constant. Which it surely will not be in your example. regards, tom lane
On 26 Nov 2002 at 0:29, Tom Lane wrote: > "Dan Langille" <dan@langille.org> writes: > > SELECT E.name, EP.pathname, E.id, EP2.pathname, CL.id > > ... > > and EP2.pathname like EP.pathname || '/%' > > > I am still suspicous of that like. It seems to be the performance > > killer here. There is an index which can be used: > > It won't be, though. The LIKE-to-indexscan transformation happens at > plan time, and that means it can only happen if the pattern is a > constant. Which it surely will not be in your example. Thanks. I'll see if I can come up with something else to do this. -- Dan Langille : http://www.langille.org/