directory tree query with big planner variation - Mailing list pgsql-performance
From | Axel Rau |
---|---|
Subject | directory tree query with big planner variation |
Date | |
Msg-id | 79BB9B3D-BCE5-46AA-82DF-BB6B1D0FE2A7@Chaos1.DE Whole thread Raw |
Responses |
Re: directory tree query with big planner variation
|
List | pgsql-performance |
Hi group, this is a directory tree query for a backup system (http:// sourceforge.net/projects/bacula). You provide a path and get back the names of the children plus a boolean telling if the child has itself children. The "%@" stands for the initial path: --------------------------------------------------------------- explain analyze SELECT X.name AS name, COUNT(CH) > 1 AS children FROM ( SELECT RTRIM( REPLACE( NLPC.path, '%@/', ''),'/') AS name, FN.name AS CH FROM ( SELECT P.path,P.pathid FROM bacula.path P WHERE P.path ~ '^%@/[^/]*/$' ) AS NLPC LEFT OUTER JOIN bacula.file F ON NLPC.pathid = F.pathid LEFT OUTER JOIN bacula.filename FN ON F.filenameid = FN.filenameid GROUP BY NLPC.path, FN.name UNION SELECT FN.name AS name, FN.name AS CH FROM bacula.path P, bacula.file F, bacula.filename FN WHERE P.path = '%@/' AND P.pathid = F.pathid AND F.filenameid = FN.filenameid ) AS X WHERE X.name <> '' GROUP BY X.name --------------------------------------------------------------- The 1st part of the union takes care of directories, the 2nd one of flat files. Application allows user navigation in a browser (clicking on a name in one column triggers the query and fills the next browser column). Initial path of "/Users/axel/Library/Preferences" results in: --------------------------------------------------------------- Sort (cost=1295.24..1295.47 rows=92 width=64) (actual time=818.987..819.871 rows=527 loops=1) Sort Key: upper(x.name) -> GroupAggregate (cost=1288.56..1292.24 rows=92 width=64) (actual time=784.069..814.059 rows=527 loops=1) -> Unique (cost=1288.56..1289.25 rows=92 width=112) (actual time=783.931..809.708 rows=684 loops=1) -> Sort (cost=1288.56..1288.79 rows=92 width=112) (actual time=783.921..793.150 rows=5350 loops=1) Sort Key: name, ch -> Append (cost=642.03..1285.55 rows=92 width=112) (actual time=335.134..723.917 rows=5350 loops=1) -> Subquery Scan "*SELECT* 1" (cost=642.03..643.18 rows=46 width=112) (actual time=335.130..338.564 rows=184 loops=1) -> HashAggregate (cost=642.03..642.72 rows=46 width=112) (actual time=335.121..337.843 rows=184 loops=1) -> Nested Loop Left Join (cost=2.00..641.80 rows=46 width=112) (actual time=39.293..326.831 rows=1685 loops=1) -> Nested Loop Left Join (cost=0.00..502.63 rows=46 width=97) (actual time=21.026..202.206 rows=1685 loops=1) -> Index Scan using path_name_idx on path p (cost=0.00..3.02 rows=1 width=97) (actual time=15.480..56.935 rows=27 loops=1) Index Cond: ((path >= '/Users/axel/Library/Preferences/'::text) AND (path < '/ Users/axel/Library/Preferences0'::text)) Filter: ((path ~ '^/Users/axel/Library/Preferences/[^/]*/$'::text) AND (rtrim ("replace"(path, '/Users/axel/Library/Preferences/'::text, ''::text), '/'::text) <> ''::text)) -> Index Scan using file_path_idx on file f (cost=0.00..493.28 rows=506 width=8) (actual time=0.473..5.119 rows=62 loops=27) Index Cond: ("outer".pathid = f.pathid) -> Bitmap Heap Scan on filename fn (cost=2.00..3.01 rows=1 width=23) (actual time=0.058..0.061 rows=1 loops=1685) Recheck Cond: ("outer".filenameid = fn.filenameid) -> Bitmap Index Scan on filename_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1685) Index Cond: ("outer".filenameid = fn.filenameid) -> Nested Loop (cost=2.00..641.91 rows=46 width=19) (actual time=3.349..377.758 rows=5166 loops=1) -> Nested Loop (cost=0.00..502.62 rows=46 width=4) (actual time=3.118..97.375 rows=5200 loops=1) -> Index Scan using path_name_idx on path p (cost=0.00..3.01 rows=1 width=4) (actual time=0.045..0.052 rows=1 loops=1) Index Cond: (path = '/ Users/axel/Library/Preferences/'::text) -> Index Scan using file_path_idx on file f (cost=0.00..493.28 rows=506 width=8) (actual time=3.058..76.014 rows=5200 loops=1) Index Cond: ("outer".pathid = f.pathid) -> Bitmap Heap Scan on filename fn (cost=2.00..3.02 rows=1 width=23) (actual time=0.037..0.039 rows=1 loops=5200) Recheck Cond: ("outer".filenameid = fn.filenameid) Filter: (name <> ''::text) -> Bitmap Index Scan on filename_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.018..0.018 rows=1 loops=5200) Index Cond: ("outer".filenameid = fn.filenameid) Total runtime: 832.458 ms --------------------------------------------------------------- which is ok, but initial path of "/Users/axel" results in (which is not ok): --------------------------------------------------------------- Sort (cost=125533.67..125534.17 rows=200 width=64) (actual time=84273.963..84274.260 rows=181 loops=1) Sort Key: upper(x.name) -> GroupAggregate (cost=123493.01..125526.03 rows=200 width=64) (actual time=84263.411..84272.427 rows=181 loops=1) -> Unique (cost=123493.01..124169.51 rows=90201 width=112) (actual time=84263.215..84270.129 rows=522 loops=1) -> Sort (cost=123493.01..123718.51 rows=90201 width=112) (actual time=84263.208..84265.632 rows=1432 loops=1) Sort Key: name, ch -> Append (cost=113172.83..116069.08 rows=90201 width=112) (actual time=83795.274..84251.660 rows=1432 loops=1) -> Subquery Scan "*SELECT* 1" (cost=113172.83..115426.71 rows=90155 width=112) (actual time=83795.270..83803.996 rows=410 loops=1) -> HashAggregate (cost=113172.83..114525.16 rows=90155 width=112) (actual time=83795.258..83802.369 rows=410 loops=1) -> Hash Left Join (cost=3124.38..112722.06 rows=90155 width=112) (actual time=56254.547..83779.903 rows=3648 loops=1) Hash Cond: ("outer".filenameid = "inner".filenameid) -> Merge Left Join (cost=0.00..106216.87 rows=90155 width=97) (actual time=54926.198..82430.621 rows=3648 loops=1) Merge Cond: ("outer".pathid = "inner".pathid) -> Index Scan using path_pkey on path p (cost=0.00..2567.57 rows=1941 width=97) (actual time=527.805..1521.911 rows=69 loops=1) Filter: ((path ~ '^/Users/axel/[^/]*/$'::text) AND (rtrim("replace"(path, '/ Users/axel/'::text, ''::text), '/'::text) <> ''::text)) -> Index Scan using file_path_idx on file f (cost=0.00..95191.99 rows=3020363 width=8) (actual time=17.561..74392.318 rows=2941790 loops=1) -> Hash (cost=2723.30..2723.30 rows=160430 width=23) (actual time=1299.103..1299.103 rows=160430 loops=1) -> Seq Scan on filename fn (cost=0.00..2723.30 rows=160430 width=23) (actual time=3.884..684.918 rows=160430 loops=1) -> Nested Loop (cost=2.00..641.91 rows=46 width=19) (actual time=93.252..442.196 rows=1022 loops=1) -> Nested Loop (cost=0.00..502.62 rows=46 width=4) (actual time=49.375..209.694 rows=1050 loops=1) -> Index Scan using path_name_idx on path p (cost=0.00..3.01 rows=1 width=4) (actual time=29.455..29.462 rows=1 loops=1) Index Cond: (path = '/ Users/axel/'::text) -> Index Scan using file_path_idx on file f (cost=0.00..493.28 rows=506 width=8) (actual time=19.898..175.869 rows=1050 loops=1) Index Cond: ("outer".pathid = f.pathid) -> Bitmap Heap Scan on filename fn (cost=2.00..3.02 rows=1 width=23) (actual time=0.206..0.208 rows=1 loops=1050) Recheck Cond: ("outer".filenameid = fn.filenameid) Filter: (name <> ''::text) -> Bitmap Index Scan on filename_pkey (cost=0.00..2.00 rows=1 width=0) (actual time=0.087..0.087 rows=1 loops=1050) Index Cond: ("outer".filenameid = fn.filenameid) Total runtime: 84295.927 ms --------------------------------------------------------------- It happened once that the planner resolved the 2nd query with the 1st plan, but this is not reproducible. How can I avoid the 2nd plan? This is 8.1.4 on OpenBSD 3.9 with 2x1GHz PIII and 2GB. Axel Axel Rau, ☀Frankfurt , Germany +49-69-951418-0
pgsql-performance by date: