Re: POSTGRES DB 3 800 000 rows table, speed up? - Mailing list pgsql-general

From Tom Lane
Subject Re: POSTGRES DB 3 800 000 rows table, speed up?
Date
Msg-id 3473.1135826504@sss.pgh.pa.us
Whole thread Raw
In response to Re: POSTGRES DB 3 800 000 rows table, speed up?  (Klein Balázs <Balazs.Klein@axelero.hu>)
List pgsql-general
=?iso-8859-1?Q?Klein_Bal=E1zs?= <Balazs.Klein@axelero.hu> writes:
> Could you explain this a little bit more?
> What are the conditions of this situation that makes b-tree ineffective?

Well, what he's trying to do is (abstracting a little)

    WHERE low_bound_col <= probe_value AND probe_value <= high_bound_col

Given a btree index on low_bound_col, the best you could do with this is
scan all the index entries from the start of the index up to probe_value
... or about half the table, on average, which makes the index pretty
much useless.  On the assumption that low_bound_col and high_bound_col
are usually close together, all of the useful hits will occur near the
end of that scan, or the beginning if you scan backwards --- but there's
no way to know when it's OK to stop looking.

Making a double-column index on (low_bound_col, high_bound_col) does
not improve the situation much, because the additional condition
high_bound_col >= probe_value doesn't let you avoid scanning small
values of low_bound_col.  You might save some trips to the table proper
but you're still scanning half the index.

And of course indexing (high_bound_col, low_bound_col) isn't any better.

If you are willing to impose a hard-wired assumption about the possible
size of the low-bound-to-high-bound distance, you can extend the query
to something like

    WHERE low_bound_col <= probe_value AND probe_value <= high_bound_col
    AND low_bound_col >= (probe_value - max_distance)

which creates an efficiently indexable range limitation on
low_bound_col.  Of course this is a very sucky kluge.

You can do a lot better with index types that are designed for
two-dimensional data instead of one-dimensional data.  Btree is
a great data structure for one-dimensional searches, but that
doesn't make it the answer to everything.

            regards, tom lane

pgsql-general by date:

Previous
From: "Jonel Rienton"
Date:
Subject: Re: [Bulk] Re: Final stored procedure question, for now anyway
Next
From: "Qingqing Zhou"
Date:
Subject: Re: I want to know how to improve the security of postgresql