Hello list,
We have a database keeping track of old files on different computers.
We have now added some search functionality to this system.
The problem is that on some searches it is really really slow and the
problem lies in the planner are using seq scans on tables with over
20 million rows.
We added appropriate indexes and raised the statistics on some
columns (really having hard time tracking down which column I should
raise it for and how much) but still the problems occurs.
Below is the query and the explain analyze. Any suggestions will be
greatly appreciated!
Thanks,
Henrik
explain analyze (SELECT max(pk_file_structure_id) as pk_object_id, max
(fk_archive_id) AS fk_archive_id, file_name AS
object_name,
structure_path AS object_path, computer_name,
file_ctime
AS object_ctime, pk_computer_id, filetype_icon AS object_img, 'file'
AS object_type, share_name, share_path FROM tbl_file_structure
JOIN tbl_file ON pk_file_id = fk_file_id
JOIN tbl_structure ON pk_structure_id = fk_structure_id
JOIN tbl_archive ON pk_archive_id = fk_archive_id
JOIN tbl_share ON pk_share_id = fk_share_id
JOIN tbl_computer ON pk_computer_id = fk_computer_id
JOIN tbl_filetype ON pk_filetype_id = fk_filetype_id
JOIN tbl_acl ON fk_file_structure_id = pk_file_structure_id
LEFT OUTER JOIN tbl_job ON tbl_archive.fk_job_id = pk_job_id
LEFT OUTER JOIN tbl_job_group ON tbl_job.fk_job_group_id =
pk_job_group_id
WHERE LOWER(file_name) LIKE LOWER('awstats%') AND
archive_complete = true AND job_group_type != 'R' GROUP BY
file_name, file_ctime, structure_path,
pk_computer_id, filetype_icon, computer_name,
share_name, share_path)UNION ALL(SELECT max(pk_file_structure_id) AS
pk_object_id, max(fk_archive_id) AS fk_archive_id,
structure_path_name AS
object_name,
structure_path AS object_path, computer_name,
structure_ctime AS
object_ctime, pk_computer_id, 'dir-open.gif' AS object_img, 'folder'
AS object_type, share_name, share_path FROM tbl_file_structure
JOIN tbl_acl ON
fk_file_structure_id = pk_file_structure_id
JOIN tbl_structure
ON pk_structure_id = fk_structure_id
JOIN tbl_archive ON
pk_archive_id = fk_archive_id
JOIN tbl_share ON
pk_share_id = fk_share_id
JOIN tbl_computer ON
pk_computer_id = fk_computer_id
LEFT OUTER JOIN
tbl_job ON tbl_archive.fk_job_id = pk_job_id
LEFT OUTER JOIN
tbl_job_group ON tbl_job.fk_job_group_id = pk_job_group_id
WHERE LOWER
(structure_path_name) LIKE LOWER('awstats%') AND archive_complete =
true AND fk_file_id IS NULL AND job_group_type != 'R' GROUP BY
structure_path_name, structure_ctime, structure_path,
pk_computer_id, computer_name, share_name, share_path) ORDER BY
object_name LIMIT 20 OFFSET 0
Limit (cost=2266221.90..2266221.95 rows=20 width=140) (actual
time=368409.873..368409.966 rows=20 loops=1)
-> Sort (cost=2266221.90..2270129.66 rows=1563107 width=140)
(actual time=368409.850..368409.892 rows=20 loops=1)
Sort Key: object_name
-> Append (cost=1734540.55..1816905.56 rows=1563107
width=140) (actual time=349422.914..368072.151 rows=14536 loops=1)
-> Subquery Scan *SELECT* 1
(cost=1734540.55..1816603.40 rows=1563102 width=140) (actual
time=349422.910..360586.872 rows=14532 loops=1)
-> GroupAggregate (cost=1734540.55..1800972.38
rows=1563102 width=140) (actual time=349422.892..360524.575
rows=14532 loops=1)
-> Sort (cost=1734540.55..1738448.30
rows=1563102 width=140) (actual time=349421.873..357658.685
rows=486179 loops=1)
Sort Key: tbl_file.file_name,
tbl_file.file_ctime, tbl_structure.structure_path,
tbl_computer.pk_computer_id, tbl_filetype.filetype_icon,
tbl_computer.computer_name, tbl_share.share_name, tbl_share.share_path
-> Hash Join
(cost=318514.39..1285224.77 rows=1563102 width=140) (actual
time=73276.810..318013.218 rows=486179 loops=1)
Hash Cond:
(tbl_archive.fk_job_id = tbl_job.pk_job_id)
-> Hash Join
(cost=318504.37..1263355.75 rows=1660796 width=148) (actual
time=73205.950..315847.893 rows=486179 loops=1)
Hash Cond:
(tbl_acl.fk_file_structure_id = tbl_file_structure.pk_file_structure_id)
-> Seq Scan on tbl_acl
(cost=0.00..563241.21 rows=28612321 width=8) (actual
time=9.164..128288.879 rows=26759522 loops=1)
-> Hash
(cost=308650.10..308650.10 rows=289942 width=148) (actual
time=68345.766..68345.766 rows=87777 loops=1)
-> Hash Join
(cost=63392.76..308650.10 rows=289942 width=148) (actual
time=32384.694..67853.749 rows=87777 loops=1)
Hash Cond:
(tbl_file.fk_filetype_id = tbl_filetype.pk_filetype_id)
-> Hash
Join (cost=63391.26..304661.90 rows=289942 width=145) (actual
time=32378.444..67472.130 rows=87777 loops=1)
Hash
Cond: (tbl_structure.fk_archive_id = tbl_archive.pk_archive_id)
->
Hash Join (cost=62832.48..300084.90 rows=298346 width=105) (actual
time=32106.191..66811.853 rows=87896 loops=1)
Hash Cond: (tbl_file_structure.fk_structure_id =
tbl_structure.pk_structure_id)
-
> Hash Join (cost=26628.01..248903.31 rows=298346 width=47)
(actual time=4149.775..56510.415 rows=87896 loops=1)
Hash Cond: (tbl_file_structure.fk_file_id = tbl_file.pk_file_id)
-> Seq Scan on tbl_file_structure (cost=0.00..105507.42
rows=4995142 width=24) (actual time=0.368..21066.207 rows=4648014
loops=1)
-> Hash (cost=25583.99..25583.99 rows=50161 width=39) (actual
time=4148.337..4148.337 rows=48870 loops=1)
-> Bitmap Heap Scan on tbl_file (cost=1935.72..25583.99
rows=50161 width=39) (actual time=1271.867..3905.037 rows=48870 loops=1)
Filter: (lower((file_name)::text) ~~ 'awstats%'::text)
-> Bitmap Index Scan on tbl_file_idx
(cost=0.00..1923.18 rows=42565 width=0) (actual
time=1254.230..1254.230 rows=89837 loops=1)
Index Cond: ((lower((file_name)::text) ~>=~
'awstats'::character varying) AND (lower((file_name)::text) ~<~
'awstatt'::character varying))
-
> Hash (cost=27090.10..27090.10 rows=361710 width=74) (actual
time=5599.792..5599.792 rows=318631 loops=1)
-> Seq Scan on tbl_structure (cost=0.00..27090.10 rows=361710
width=74) (actual time=15.301..4125.041 rows=318631 loops=1)
->
Hash (cost=557.92..557.92 rows=69 width=48) (actual
time=272.208..272.208 rows=54 loops=1)
-
> Hash Join (cost=6.47..557.92 rows=69 width=48) (actual
time=52.081..271.961 rows=54 loops=1)
Hash Cond: (tbl_share.fk_computer_id = tbl_computer.pk_computer_id)
-> Hash Join (cost=1.18..551.68 rows=69 width=37) (actual
time=47.672..267.298 rows=54 loops=1)
Hash Cond: (tbl_archive.fk_share_id = tbl_share.pk_share_id)
-> Index Scan using tbl_archive_pkey on tbl_archive
(cost=0.00..549.55 rows=69 width=24) (actual time=39.761..259.068
rows=54 loops=1)
Filter: archive_complete
-> Hash (cost=1.08..1.08 rows=8 width=29) (actual
time=7.838..7.838 rows=8 loops=1)
-> Seq Scan on tbl_share (cost=0.00..1.08 rows=8
width=29) (actual time=7.772..7.790 rows=8 loops=1)
-> Hash (cost=5.13..5.13 rows=13 width=19) (actual
time=4.362..4.362 rows=8 loops=1)
-> Seq Scan on tbl_computer (cost=0.00..5.13 rows=13
width=19) (actual time=0.329..4.319 rows=8 loops=1)
-> Hash
(cost=1.22..1.22 rows=22 width=19) (actual time=6.215..6.215 rows=22
loops=1)
->
Seq Scan on tbl_filetype (cost=0.00..1.22 rows=22 width=19) (actual
time=6.096..6.141 rows=22 loops=1)
-> Hash (cost=9.81..9.81
rows=16 width=8) (actual time=70.811..70.811 rows=16 loops=1)
-> Hash Join
(cost=4.38..9.81 rows=16 width=8) (actual time=66.016..70.759 rows=16
loops=1)
Hash Cond:
(tbl_job_group.pk_job_group_id = tbl_job.fk_job_group_id)
-> Seq Scan on
tbl_job_group (cost=0.00..5.21 rows=16 width=8) (actual
time=5.006..9.680 rows=16 loops=1)
Filter:
(job_group_type <> 'R'::bpchar)
-> Hash
(cost=4.17..4.17 rows=17 width=16) (actual time=60.943..60.943
rows=17 loops=1)
-> Seq Scan
on tbl_job (cost=0.00..4.17 rows=17 width=16) (actual
time=1.056..60.867 rows=17 loops=1)
-> Subquery Scan *SELECT* 2 (cost=302.03..302.16
rows=5 width=121) (actual time=7450.513..7450.544 rows=4 loops=1)
-> HashAggregate (cost=302.03..302.11 rows=5
width=121) (actual time=7450.497..7450.516 rows=4 loops=1)
-> Nested Loop (cost=6.27..301.90 rows=6
width=121) (actual time=319.925..7449.175 rows=94 loops=1)
-> Nested Loop (cost=6.27..209.37
rows=1 width=121) (actual time=249.894..5430.726 rows=24 loops=1)
-> Nested Loop
(cost=6.27..204.59 rows=1 width=110) (actual time=236.381..5416.033
rows=24 loops=1)
-> Nested Loop
(cost=6.27..200.24 rows=1 width=118) (actual time=235.827..5414.547
rows=24 loops=1)
-> Nested Loop
(cost=6.27..196.18 rows=1 width=118) (actual time=204.173..5369.710
rows=24 loops=1)
-> Nested
Loop (cost=6.27..17.17 rows=1 width=118) (actual
time=128.586..1972.271 rows=24 loops=1)
Join
Filter: (tbl_share.pk_share_id = tbl_archive.fk_share_id)
->
Nested Loop (cost=6.27..15.99 rows=1 width=105) (actual
time=116.772..1958.954 rows=24 loops=1)
-
> Index Scan using tbl_structure_idx on tbl_structure
(cost=0.01..6.71 rows=1 width=89) (actual time=86.271..1375.178
rows=24 loops=1)
Index Cond: ((lower((structure_path_name)::text) >=
'awstats'::text) AND (lower((structure_path_name)::text) <
'awstatt'::text))
Filter: (lower((structure_path_name)::text) ~~ 'awstats%'::text)
-
> Bitmap Heap Scan on tbl_archive (cost=6.26..9.27 rows=1
width=24) (actual time=24.286..24.292 rows=1 loops=24)
Recheck Cond: (tbl_archive.pk_archive_id =
tbl_structure.fk_archive_id)
Filter: archive_complete
-> Bitmap Index Scan on tbl_archive_pkey (cost=0.00..6.26
rows=1 width=0) (actual time=3.642..3.642 rows=1 loops=24)
Index Cond: (tbl_archive.pk_archive_id =
tbl_structure.fk_archive_id)
->
Seq Scan on tbl_share (cost=0.00..1.08 rows=8 width=29) (actual
time=0.501..0.517 rows=8 loops=24)
-> Index
Scan using tbl_file_structure_idx1 on tbl_file_structure
(cost=0.00..178.96 rows=4 width=16) (actual time=131.843..141.509
rows=1 loops=24)
Index
Cond: (tbl_structure.pk_structure_id =
tbl_file_structure.fk_structure_id)
Filter: (fk_file_id IS NULL)
-> Index Scan
using tbl_job_pkey on tbl_job (cost=0.00..4.05 rows=1 width=16)
(actual time=1.844..1.848 rows=1 loops=24)
Index Cond:
(tbl_archive.fk_job_id = tbl_job.pk_job_id)
-> Index Scan using
tbl_job_group_pkey on tbl_job_group (cost=0.00..4.33 rows=1 width=8)
(actual time=0.044..0.049 rows=1 loops=24)
Index Cond:
(tbl_job.fk_job_group_id = tbl_job_group.pk_job_group_id)
Filter:
(job_group_type <> 'R'::bpchar)
-> Index Scan using
tbl_computer_pkey on tbl_computer (cost=0.00..4.77 rows=1 width=19)
(actual time=0.580..0.584 rows=1 loops=24)
Index Cond:
(tbl_computer.pk_computer_id = tbl_share.fk_computer_id)
-> Index Scan using tbl_acl_idx on
tbl_acl (cost=0.00..91.93 rows=48 width=8) (actual
time=84.054..84.078 rows=4 loops=24)
Index Cond:
(tbl_acl.fk_file_structure_id = tbl_file_structure.pk_file_structure_id)
Total runtime: 368592.811 ms