Re: POSTGRES DB 3 800 000 rows table, speed up? - Mailing list pgsql-general

From Tom Lane
Subject Re: POSTGRES DB 3 800 000 rows table, speed up?
Date
Msg-id 21118.1135796668@sss.pgh.pa.us
Whole thread Raw
In response to Re: POSTGRES DB 3 800 000 rows table, speed up?  ("Jim C. Nasby" <jnasby@pervasive.com>)
Responses Re: POSTGRES DB 3 800 000 rows table, speed up?
List pgsql-general
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> On Tue, Dec 27, 2005 at 11:25:37PM +0200, Eugene wrote:
>> I ask db like this  SELECT * FROM ipdb2 WHERE '3229285376' BETWEEN ipfrom
>> AND ipto;

> I'm pretty sure PostgreSQL won't be able to use any indexes for this
> (EXPLAIN ANALYZE would verify that). Instead, expand the between out:

> WHERE ipfrom >= '...' AND ipto <= '...'

That won't help (it is in fact exactly the same query, because BETWEEN
is just rewritten into that).  The real problem is that btree indexes
are ill-suited to this type of condition.  If the typical row has only
a small distance between ipfrom and ipto then the query is actually
pretty selective, but there is no way to capture that selectivity in
a btree search, because neither of the single-column comparisons are
selective at all.  The planner realizes this and doesn't bother with
the index, instead it just does a seqscan.

You could probably get somewhere by casting the problem as an rtree
or GIST overlap/containment query, but with the currently available
tools it would be a pretty unnatural-looking query ... probably
something like
    box(point(ipfrom,ipfrom),point(ipto,ipto)) ~
    box(point(3229285376,3229285376),point(3229285376,3229285376))
after creating an rtree or GIST index on
    box(point(ipfrom,ipfrom),point(ipto,ipto))
(haven't tried this but there is a solution lurking somewhere in this
general vicinity).

Is there a good reason why the data is stored this way, and not as
say a single "cidr" column containing subnet addresses?  Querying
    WHERE '192.122.252.0' << cidrcolumn
would be a much more transparent way of expressing your problem.
We don't currently have an easy indexing solution for that one either,
but we might in the future.

            regards, tom lane

pgsql-general by date:

Previous
From: hubert depesz lubaczewski
Date:
Subject: Re: POSTGRES DB 3 800 000 rows table, speed up?
Next
From: vishal saberwal
Date:
Subject: C Function Problem for bytea output