On Cum, 2005-08-26 at 19:31 -0400, Tom Lane wrote:
> I think the problem is that the planner is underestimating the cost of
> evaluating this complicated filter condition:
>
> > -> Seq Scan on scf_irsaliye irs (cost=0.00..30.00 rows=1 width=20) (actual time=0.290..0.290
rows=0loops=10000)
> > Filter: (((karsifirma)::text = 'KENDI'::text) AND (((turu)::text = 'MAI'::text) OR
((turu)::text= 'KGI'::text) OR ((turu)::text = 'PS'::text) OR ((turu)::text = 'TS'::text) OR ((turu)::text =
'KC'::text)OR ((turu)::text = 'KCO'::text)) AND (((_key_sis_depo_dest)::text = '$$$$0000003l$1$$'::text) OR
((_key_sis_depo_dest)::text= '$$$$00000048$1$$'::text) OR ((_key_sis_depo_dest)::text = '$$$$0000004b$1$$'::text) OR
((_key_sis_depo_dest)::text= '$$$$0000004d$1$$'::text)) AND (tarih <= '2005-08-26'::date))
>
> While you could attack that by raising the cpu_operator_cost parameter,
> it would also be worth inquiring *why* the condition is so expensive to
> evaluate. I am suspicious that you are running the database in a locale
> in which strcoll() is really slow. Can you run it in C locale instead,
> or do you really need locale-aware behavior? Can you switch to a
> different database encoding? (A single-byte encoding such as Latin1
> might be faster than UTF8, for example.)
Yes, you are perfectly right. We are using UTF8 and tr_TR.UTF8 locale.
However, I tried the same tests with latin1 and C locale, it is surely
faster, but not dramatically. i.e.:
Before Vacuum After Vacuum
UTF8 and tr_TR.UTF8: ~8 s ~110 s
latin1 and C: ~7 s ~65 s
I also played with cpu_operator_cost parameter and it dramatically
reduced query times, but not to the level before vacuum:
Before Vacuum After Vacuum
UTF8 and tr_TR.UTF8: ~8 s ~11 s
latin1 and C: ~7 s ~9 s
These values are much better but I really wonder if I can reach the
performance levels before vacuum. I am also worried about the
side-effects that may be caused by the non-default cpu_operator_cost
parameter.
> Another possibility is to take a hard look at whether you can't simplify
> the filter condition, but that'd require more knowledge of your
> application than I have.
Yes that is another option, we are even considering schema changes to
use less character types, but these are really costly and error-prone
operations at the moment.
> Or you could just play with the order of the filter conditions ... for
> example, the date condition at the end is probably far cheaper to test
> than the text comparisons, so if that's fairly selective it'd be worth
> putting it first.
We are experimenting on this.
Thanks your help!
Best Regards,
Umit Oztosun