Home > mailing lists

Re: POSTGRES DB 3 800 000 rows table, speed up? - Mailing list pgsql-general

From	Tom Lane
Subject	Re: POSTGRES DB 3 800 000 rows table, speed up?
Date	December 29, 2005 02:21:44
Msg-id	3473.1135826504@sss.pgh.pa.us Whole thread Raw
In response to	Re: POSTGRES DB 3 800 000 rows table, speed up? (Klein Balázs <Balazs.Klein@axelero.hu>)
List	pgsql-general

Tree view

=?iso-8859-1?Q?Klein_Bal=E1zs?= <Balazs.Klein@axelero.hu> writes:
> Could you explain this a little bit more?
> What are the conditions of this situation that makes b-tree ineffective?

Well, what he's trying to do is (abstracting a little)

    WHERE low_bound_col <= probe_value AND probe_value <= high_bound_col

Given a btree index on low_bound_col, the best you could do with this is
scan all the index entries from the start of the index up to probe_value
... or about half the table, on average, which makes the index pretty
much useless.  On the assumption that low_bound_col and high_bound_col
are usually close together, all of the useful hits will occur near the
end of that scan, or the beginning if you scan backwards --- but there's
no way to know when it's OK to stop looking.

Making a double-column index on (low_bound_col, high_bound_col) does
not improve the situation much, because the additional condition
high_bound_col >= probe_value doesn't let you avoid scanning small
values of low_bound_col.  You might save some trips to the table proper
but you're still scanning half the index.

And of course indexing (high_bound_col, low_bound_col) isn't any better.

If you are willing to impose a hard-wired assumption about the possible
size of the low-bound-to-high-bound distance, you can extend the query
to something like

    WHERE low_bound_col <= probe_value AND probe_value <= high_bound_col
    AND low_bound_col >= (probe_value - max_distance)

which creates an efficiently indexable range limitation on
low_bound_col.  Of course this is a very sucky kluge.

You can do a lot better with index types that are designed for
two-dimensional data instead of one-dimensional data.  Btree is
a great data structure for one-dimensional searches, but that
doesn't make it the answer to everything.

            regards, tom lane

pgsql-general by date:

From: "Jonel Rienton"
Date: 29 December 2005, 01:44:22
Subject: Re: [Bulk] Re: Final stored procedure question, for now anyway

From: "Qingqing Zhou"
Date: 29 December 2005, 02:54:37
Subject: Re: I want to know how to improve the security of postgresql

Re: POSTGRES DB 3 800 000 rows table, speed up? - Mailing list pgsql-general

Previous

Next