Re: Optimizer choosing smaller index instead of right one - Mailing list pgsql-general
From | Ian Ribas |
---|---|
Subject | Re: Optimizer choosing smaller index instead of right one |
Date | |
Msg-id | 409287F8.5040209@digitro.com.br Whole thread Raw |
In response to | Re: Optimizer choosing smaller index instead of right one (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-general |
Effectively, I hadn't seen the cast being done on the field dt_finalizacao! The query actually runs inside a pgplsql procedure, so the parameters are unknown constants (variables), but the situation got much much better when I forced a cast on the constant part of the condition (as follows), so allowing the planner to correctly choose an index : explain analyze SELECT dt_finalizacao FROM cham_servico WHERE id_chave_grupo = '7458' AND cod_bxs = 1 AND dt_finalizacao > CAST(to_timestamp ('2004-04-20','YYYY-MM-DD') AS TIMESTAMP(0) WITHOUT TIME ZONE) AND dt_finalizacao < CAST(now() AS TIMESTAMP(0) WITHOUT TIME ZONE) ORDER BY dt_finalizacao ASC LIMIT 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------ Limit (cost=0.00..364.98 rows=1 width=8) (actual time=52.73..52.90 rows=1 loops=1) -> Index Scan using xie1cham_servico on cham_servico (cost=0.00..9756.63 rows=27 width=8) (actual time=52.73..52.89 rows=2 loops=1) Index Cond: ((dt_finalizacao > (to_timestamp('2004-04-20'::text, 'YYYY-MM-DD'::text))::timestamp(0) without time zone) AND (dt_finalizacao < (now())::timestamp(0) without time zone)) Filter: ((id_chave_grupo = '7458'::character varying) AND (cod_bxs = 1)) Total runtime: 53.11 msec (5 rows) Unfortunately, I really must use timestamp without time zone for some other unrelated reasons ... Thank you very much for your attention Best wishes, Ian. Tom Lane wrote: >Ian Ribas <ian.ribas@digitro.com.br> writes: > > >>The index I created to try to optimize it is "xie3cham_servico", and >>has all the three columns of the table that are used in the query. But >>the index being used is actually xie2cham_servico (as can be seen >>below) which has only two columns, one of them completely useless for >>the query! I'm probably missing some setting, but I couldn't figure >>out which. >> >> > >The optimizer's cost equations do say that a smaller index should be >preferred over a larger one if the number of rows obtainable from the >index is comparable (which I trust you will agree is reasonable). >I think the real issue here is a datatype conflict. Look closely >at the filter conditions: > > > >> -> Index Scan using xie2cham_servico on cham_servico (cost=0.00..15663.15 rows=26 width=8) (actual time=13604.37..13980.16rows=3173 loops=1) >> Index Cond: (id_chave_grupo = '7458'::character varying) >> Filter: ((cod_bxs = 1) AND ((dt_finalizacao)::timestamp with time zone > to_timestamp('2004-04-20'::text,'YYYY-MM-DD'::text)) AND ((dt_finalizacao)::timestamp with time zone < now())) >> >> > >The two comparisons on dt_finalizacao are unindexable because they >coerce it to a different datatype; the "<" and ">" are the operators on >timestamp with time zone and so they are not relevant to an index on >timestamp without time zone. > >Very likely your column really ought to be of type timestamp with time >zone. If you are convinced it should be timestamp without time zone >then you need to fix the values you are comparing to. > >I'm also a bit disturbed by the fact that the rows estimates are off by >more than two orders of magnitude (26 est vs 3173 actual). It would >perhaps help if you wrote the boundary date as a simple constant: > > ... AND dt_finalizacao > '2004-04-20' AND ... > >so that the planner had some chance of comparing the constant to its >statistics about the distribution of dt_finalizacao. to_timestamp is >not a constant-foldable function (because it depends on the current >TimeZone setting) and so the query as you've written it looks like > > ... AND dt_finalizacao > unpredictable_expression AND ... > >as far as the planner knows. (I'm not sure this will help though; >it may be that the big problem is that the three columns are not >independent, which is something the planner doesn't know because it has >no cross-column statistics.) > > regards, tom lane > > >
pgsql-general by date: