Re: [GENERAL] Problems with inequalities on numeric fields in 6.5 - Mailing list pgsql-general
From | Martin Weinberg |
---|---|
Subject | Re: [GENERAL] Problems with inequalities on numeric fields in 6.5 |
Date | |
Msg-id | 199907081544.LAA01597@osprey.phast.umass.edu Whole thread Raw |
In response to | Re: [GENERAL] Problems with inequalities on numeric fields in 6.5 (Bruce Momjian <maillist@candle.pha.pa.us>) |
Responses |
Re: [GENERAL] Problems with inequalities on numeric fields in 6.5
|
List | pgsql-general |
Bruce Momjian wrote on Wed, 07 Jul 1999 12:29:13 EDT >> Thanks, Bruce! >> >> Yes, I tried the latter query and it's the same: >> >> -------------------------------------------------- >> >> lmc=> explain select count(*) from lmctot where j_m>3.4::float4 and j_m<3.5: >:float4; >> NOTICE: QUERY PLAN: >> >> Aggregate (cost=62349.97 rows=788100 width=4) >> -> Index Scan using j on lmctot (cost=62349.97 rows=788100 width=4) >> >> EXPLAIN >> >> -------------------------------------------------- >> I've tried all permutations of the conversions in the ranges with >> similar results (and vacuum analyzed several times as well as >> dumped and reloaded and reloaded from scracth). We have >> a larger database with 20M rows which has a similar behavior. >> >> There are 7092894 rows in database "lmc". So: >> >> (3.5-3.4)/(99.999-2.731) = 7292.1 != 788k >> >> A clue? > >I have just fixed a problem with index size estimates. Try adding >#include <math.h> to the top of backend/optimizer/util/plancat.c. That >may fix the estimated number of tuples returned. However, it don't >think you are going to get better performance, since you are already >using the index in the above case. The only big win I can think of is >to use CLUSTER on that field. That should speed things up quite a bit. > Hi Bruce, Ok. Sorry about the delay. I added the math.h but that doesn't seem to change the query plan output. I then dropped all the indices, made a new one on three of the variables and clustered: create index m_col on lmctot using btree (j_m, h_m, k_m); cluster m_col on lmctot; vacuum analyze; where the j_m, h_m, k_m are three float4 fields. The cluster took about 18 hours on my 7.1 million records (this is a dual 450Mhz Xeon Linux box). Not sure why this was so slow. Anyway, this *hugely* improved queries of form: select count(*) from lmctot where j_m>3.4::float4 and j_m<3.5: although the explain query plan output is identical. However using h_m or k_m (not the first variable in the index) appears to be doing a sequential scan. Is that right? I then made indices on h_m and k_m, vacuum analyzed and tried again, but got identical performance. If this is the way it is, so be it, but I have the feeling that something is not working properly. Any ideas? Again, with _heaps_ of thanks, --Martin =========================================================================== Martin Weinberg Phone: (413) 545-3821 Dept. of Physics and Astronomy FAX: (413) 545-2117/0648 530 Graduate Research Tower University of Massachusetts Amherst, MA 01003-4525
pgsql-general by date: