Re: Normal case or bad query plan? - Mailing list pgsql-performance

From Tom Lane
Subject Re: Normal case or bad query plan?
Date
Msg-id 28411.1097559915@sss.pgh.pa.us
Whole thread Raw
In response to Re: Normal case or bad query plan?  (Gabriele Bartolini <angusgb@tin.it>)
List pgsql-performance
Gabriele Bartolini <angusgb@tin.it> writes:
> Seq Scan on ip2location  (cost=0.00..30490.65 rows=124781 width=8)
> (actual time=5338.120..40237.283 rows=1 loops=1)
> Filter: ((1040878301::bigint >= ip_address_from) AND
> (1040878301::bigint <= ip_address_to))
> Total runtime: 40237.424 ms
>>
>> I suspect that you haven't ANALYZEd this table in a long time, if ever.
>> You really need reasonably up-to-date ANALYZE stats if you want the
>> planner to do an adequate job of planning range queries.

> That's the thing ... I had just peformed a VACUUM ANALYSE   :-(

In that case I think Kris Jurka had it right: the problem is the planner
doesn't know enough about the relationship of the ip_address_from and
ip_address_to columns to realize that this is a very selective query.
But actually, even *had* it realized that, it would have had little
choice but to use a seqscan, because neither of the independent
conditions is really very useful as an index condition by itself.

Assuming that this problem is representative of your query load, you
really need to recast the data representation to make it more readily
searchable.  I think you might be able to get somewhere by combining
ip_address_from and ip_address_to into a single CIDR column and then
using the network-overlap operator to probe for matches to your query
address.  (This assumes that the from/to pairs are actually meant to
represent CIDR subnets; if not you need some other idea.)  Another
possibility is to convert to a geometric type and use an rtree index
with an "overlaps" operator.  I'm too tired to work out the details,
but try searching for "decorrelation" in the list archives to see some
related problems.

            regards, tom lane

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: why my query is not using index??
Next
From: Matt Clark
Date:
Subject: Re: IBM P-series machines