6 aug 2007 kl. 17:31 skrev Tom Lane:
> Henrik Zagerholm <henke@mac.se> writes:
>> WHERE file_indexed IS FALSE
>> AND file_copied
>> IS TRUE
>> AND file_size <
>> (1024)
>> AND LOWER
>> (file_suffix) IN(
>> SELECT LOWER
>> (filetype_suffix) FROM tbl_filetype_suffix WHERE
>> filetype_suffix_index IS TRUE
>> ) AND fk_archive_id
>> = 115 ORDER BY fk_tar_id
>
> Do you really need the lower() calls there? The planner is getting
> the
> wrong estimate for the selectivity of the IN-clause, which is likely
> because it has no statistics about lower(file_suffix) or
> lower(filetype_suffix).
>
> If you don't want to constrain the data to be already lower'd, then
> setting up functional indexes on the two lower() expressions should
> prompt ANALYZE to track stats for them.
>
OK, thanx for the tip. I actually think that all the suffixes are
lower case so the lower should go.
But would this really impact the sequential scan on tbl_file_structure?
->Seq Scan on tbl_file_structure (cost=0.00..167417.09 rows=7902309
width=16) (actual time=9.581..33702.852 rows=7801334 loops=1)
At what point does the planner choose seq scans? I've seen the
planner use seq scan even though only 1% of the joining tables rows
are selected.
If the filter gives me 70k rows from tbl_file and tbl_file_structure
has 8 million rows why does the planner choose seq scans?
Cheers,
Henrik
> regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate