Bug in query planer ? - Mailing list pgsql-bugs
| From | Clifford Wolf |
|---|---|
| Subject | Bug in query planer ? |
| Date | |
| Msg-id | 200601311859.42586.clifford.wolf@linbit.com Whole thread Raw |
| Responses |
Re: Bug in query planer ?
|
| List | pgsql-bugs |
Hi,
philipp resiner wrote a mail about this problem yesterday. I've now traced =
it down to something that looks like a bug in the
query planer to me. Please have a look at this and let me know if this is a=
bug or I am compleatly wrong..
(this is done right after a complete ANALYZE over the database, so the stat=
istics are up-to-date)
sd-beta=3D> select n_tup_ins, n_tup_del from pg_stat_user_tables where rel=
name =3D 'contractelements';
n_tup_ins | n_tup_del
-----------+-----------
91821 | 0
(1 row)
sd-beta=3D> select n_distinct, most_common_vals, most_common_freqs from pg=
_stats where tablename =3D 'contractelements' and attname =3D 'isactiv';
n_distinct | most_common_vals | most_common_freqs
------------+------------------+----------------------
2 | {Y,N} | {0.966467,0.0335333}
(1 row)
sd-beta=3D> explain analyze select 1 from contractelements where isActiv =
=3D 'Y';
QUERY PLAN
--------------------------------------------------------------------------=
---------------------------------------------
Seq Scan on contractelements (cost=3D0.00..4963.76 rows=3D88742 width=3D=
0) (actual time=3D0.014..137.930 rows=3D88838 loops=3D1)
Filter: ((isactiv)::text =3D 'Y'::text)
Total runtime: 153.543 ms
(3 rows)
The query planner estimates that isActiv =3D 'Y' will match 88742 rows. Thi=
s is reasonable (91821 * 0.966467 =3D 88741.966407) and
correct. However, the following case causes some troubles:
sd-beta=3D> explain analyze select 1 from contractelements where upper(isA=
ctiv) =3D 'Y';
QUERY PLAN
--------------------------------------------------------------------------=
-------------------------------------------
Seq Scan on contractelements (cost=3D0.00..5193.32 rows=3D459 width=3D0)=
(actual time=3D0.030..198.493 rows=3D88838 loops=3D1)
Filter: (upper((isactiv)::text) =3D 'Y'::text)
Total runtime: 214.035 ms
(3 rows)
Here we match on upper(isActiv) =3D 'Y' (which is totally braindead, but th=
e query is auto-generated by a customer-supplied
application, so I can not change it). Shouldn't the query planner execute u=
pper(isActiv) for both values in pg_stats
and so come to the same conclusion as in the first case?
It doesn't. Led by this misapprehension the query planner generates pretty =
creative, but unfortunately very suboptimal
query plans.
A 'CREATE INDEX clifford_temp ON contractelements ( upper(isActiv) )' follo=
wed by an 'ANALYZE contractelements' solves the
problem in this particular case. But this is not a solution to the problem =
in general..
Shouldn't the query planner be able to do the right thing without the index=
? Where does the magic 'rows=3D459' come from?
yours,
- clifford
--=20
: Clifford Wolf =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0Tel +=
43-1-8178292-00 :
: LINBIT Information Technologies GmbH =A0 =A0 =A0 =A0 =A0Fax +43-1-8178292=
-82 :
: Sch=F6nbrunnerstr 244, 1120 Vienna, Austria =A0 =A0http://www.linbit.com :
pgsql-bugs by date: