Thread: B-tree performance improvements in 8.x

B-tree performance improvements in 8.x

From

jao@geophile.com

Date:

07 February 2006, 18:39:10

I'm currently running postgresql 7.4.8. Our two biggest tables have
indexes on (x, y), where x, y are integers, and there are often
many y values per x value. The ratio can be anywhere from 1:1 to 200000:1.

In the 8.0 release notes, (section E.10.4.1), I noticed this
statement:

    Improve B-tree index performance for duplicate keys (Dmitry Tkach, Tom)

    This improves the way indexes are scanned when many duplicate
    values exist in the index.

Can someone describe these improvements in more detail? How were such
scans done in 7.x, and what exactly is different in 8.x?

Jack Orenstein

Re: B-tree performance improvements in 8.x

From

Tom Lane

Date:

07 February 2006, 18:54:30

jao@geophile.com writes:
> In the 8.0 release notes, (section E.10.4.1), I noticed this
> statement:

>     Improve B-tree index performance for duplicate keys (Dmitry Tkach, Tom)

>     This improves the way indexes are scanned when many duplicate
>     values exist in the index.

> Can someone describe these improvements in more detail? How were such
> scans done in 7.x, and what exactly is different in 8.x?

IIRC, this change had to do with the initial-positioning rules for btree
index descent.  Before 8.0, the tree descent code had only one behavior:
"find first index entry >= target value X".  If the query was "WHERE
indexcol > X" then the descent would go to the first entry equal to X
(if any) and then, if it was equal, step forward to the first entry
greater than X.  This could be quite slow if there were many entries
equal to X.  Conversely, for a backward scan, the initial positioning
was OK for query "indexcol < X": go to first entry >= X, step back one;
but not so hot for "indexcol <= X": go to first entry >= X, step forward
over all entries = X, step back one.  In 8.0, the descent code can do
either "first entry >= X" or "first entry > X", and the positioning
rules never need to step more than one entry to locate the desired
starting position (details left as exercise for the reader).  The impact
of having N equal entries is now only O(log N) rather than O(N), since
tree descent is O(log N) in either case.

            regards, tom lane

Re: B-tree performance improvements in 8.x

From

jao@geophile.com

Date:

07 February 2006, 19:54:30

Quoting Tom Lane <tgl@sss.pgh.pa.us>:

> jao@geophile.com writes:
>> In the 8.0 release notes, (section E.10.4.1), I noticed this
>> statement:
>
>>     Improve B-tree index performance for duplicate keys (Dmitry Tkach, Tom)
>
>>     This improves the way indexes are scanned when many duplicate
>>     values exist in the index.
>
>> Can someone describe these improvements in more detail? How were such
>> scans done in 7.x, and what exactly is different in 8.x?
>
> IIRC, this change had to do with the initial-positioning rules for btree
> index descent.  Before 8.0, the tree descent code had only one behavior:
> "find first index entry >= target value X".  If the query was "WHERE
> indexcol > X" then the descent would go to the first entry equal to X
> (if any) and then, if it was equal, step forward to the first entry
> greater than X.  This could be quite slow if there were many entries
> equal to X.
> ...
> In 8.0, the descent code can do
> either "first entry >= X" or "first entry > X", and the positioning
> rules never need to step more than one entry to locate the desired
> starting position (details left as exercise for the reader).

What about a compound index (x, y), in which there are many y values
for a given x? My query is "... WHERE x = ? and y = ?".

If the b-tree treats the compound index key is treated as a single
value, then it seems like the pre-8.0 logic you describe would not
apply, and I would expect O(log n) access to the record of interest.

But if the pre-8.0 b-tree starts at the first value >= x and does a
linear scan for the y value, I'd have a problem.

Can you comment on how 7.x and 8.x handle my index and query?

Thanks in advance.

Jack Orenstein

Re: B-tree performance improvements in 8.x

From

Tom Lane

Date:

07 February 2006, 21:13:32

jao@geophile.com writes:
> Quoting Tom Lane <tgl@sss.pgh.pa.us>:
>> In 8.0, the descent code can do
>> either "first entry >= X" or "first entry > X", and the positioning
>> rules never need to step more than one entry to locate the desired
>> starting position (details left as exercise for the reader).

> What about a compound index (x, y), in which there are many y values
> for a given x? My query is "... WHERE x = ? and y = ?".

Doesn't really matter whether the key is simple or compound --- all
that matters is whether you have multiple entries that are "equal" to
the boundary value.

            regards, tom lane

Re: B-tree performance improvements in 8.x

From

Dick Kniep

Date:

07 February 2006, 21:47:33

Hi list,

Does this also affect if you have many NULL values in the key? So testing Not
is NULL would also be affected?

Cheers,
Dick Kniep

On Tuesday 07 February 2006 23:13, Tom Lane wrote:
> jao@geophile.com writes:
> > Quoting Tom Lane <tgl@sss.pgh.pa.us>:
> >> In 8.0, the descent code can do
> >> either "first entry >= X" or "first entry > X", and the positioning
> >> rules never need to step more than one entry to locate the desired
> >> starting position (details left as exercise for the reader).
> >
> > What about a compound index (x, y), in which there are many y values
> > for a given x? My query is "... WHERE x = ? and y = ?".
>
> Doesn't really matter whether the key is simple or compound --- all
> that matters is whether you have multiple entries that are "equal" to
> the boundary value.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend

Re: B-tree performance improvements in 8.x

From

Tom Lane

Date:

08 February 2006, 04:18:18

Dick Kniep <dick@kniep.nl> writes:
> Does this also affect if you have many NULL values in the key? So testing Not
> is NULL would also be affected?

IS NOT NULL isn't an indexable operation, so your question doesn't really
apply :-(

            regards, tom lane

Re: B-tree performance improvements in 8.x

From

Dick Kniep

Date:

08 February 2006, 16:02:14

On Wednesday 08 February 2006 06:18, Tom Lane wrote:
> Dick Kniep <dick@kniep.nl> writes:
> > Does this also affect if you have many NULL values in the key? So testing
> > Not is NULL would also be affected?
>
> IS NOT NULL isn't an indexable operation, so your question doesn't really
> apply :-(

Does this mean that if you have a table that has many rows, and 95% of the
rows contain a NULL value for a field, that indexing will be useless, because
it will always do a tablescan?

Hope not!

Dick

Re: B-tree performance improvements in 8.x

From

Greg Stark

Date:

08 February 2006, 17:45:56

Dick Kniep <dick@kniep.nl> writes:

> Does this mean that if you have a table that has many rows, and 95% of the
> rows contain a NULL value for a field, that indexing will be useless, because
> it will always do a tablescan?

Any time you have 95% of the rows of the table with the same value you're
probably going to have to think harder than usual about indexing strategies.
You might consider constructing your index with "WHERE a IS NOT NULL" since
otherwise 95% of your index is going to be full of records that aren't
selective enough to be useful anyways.

If you access your table with "WHERE a = ?" then it will undoubtedly use the
index effectively. You're accessing less than 5% (presumably much less), so
the index is useful.

--
greg

Re: B-tree performance improvements in 8.x

From

Martijn van Oosterhout

Date:

08 February 2006, 19:26:24

On Wed, Feb 08, 2006 at 06:02:08PM +0100, Dick Kniep wrote:
> On Wednesday 08 February 2006 06:18, Tom Lane wrote:
> > Dick Kniep <dick@kniep.nl> writes:
> > > Does this also affect if you have many NULL values in the key? So testing
> > > Not is NULL would also be affected?
> >
> > IS NOT NULL isn't an indexable operation, so your question doesn't really
> > apply :-(
>
> Does this mean that if you have a table that has many rows, and 95% of the
> rows contain a NULL value for a field, that indexing will be useless, because
> it will always do a tablescan?

Well, if 95% of a table is NULL then an index scan is useless anyway.

To the more general question, IS NULL is not an indexable operator. The
btree code works on binary operators and IS NULL isn't one. I submitted
a patch a while ago to make it indexable by creating a special scankey
type for them but it didn't get much discussion[1]. No-one has come up
with any other approach yet AFAIK.

I certainly hope indexing NULLs will get in eventually one way or the
other. It's kind of odd to have to create two indexes on the same
column (one partial) just to speed up IS NULL queries.

[1] http://archives.postgresql.org/pgsql-patches/2005-09/msg00093.php

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

signature.asc