Thread: multi-column index
Hello. I have a problem concerning multi-column indexes. I have a table containing some 250k rows:

         Table "public.descriptionprodftdiclnk"
       Column    |  Type   | Modifiers
    -------------+---------+-----------
     idword      | integer | not null
     idqualifier | integer | not null
    Indexes:
        "descriptionprodftdiclnk_pkey" primary key, btree (idword, idqualifier)
        "ix_descriptionprodftdiclnk_idqualif" btree (idqualifier)

When I EXPLAIN ANALYZE a simple query on the idword column, the planner chooses a sequential scan:

    explain analyze select * from descriptionprodftdiclnk where idword=44;
                                                              QUERY PLAN
    -------------------------------------------------------------------------------------------------------------------------------
     Seq Scan on descriptionprodftdiclnk  (cost=0.00..4788.14 rows=44388 width=8) (actual time=87.582..168.041 rows=43792 loops=1)
       Filter: (idword = 44)
     Total runtime: 195.339 ms
    (3 rows)

I don't understand why the query planner does not use the multi-column index that was created for the primary key. According to the PostgreSQL online documentation it should.
By setting the "enable_seqscan" parameter to off, I can force the planner to use the index:

    explain analyze select * from descriptionprodftdiclnk where idword=44;
                                                                             QUERY PLAN
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Index Scan using descriptionprodftdiclnk_pkey on descriptionprodftdiclnk  (cost=0.00..36720.39 rows=44388 width=8) (actual time=0.205..73.489 rows=43792 loops=1)
       Index Cond: (idword = 44)
     Total runtime: 100.564 ms
    (3 rows)

On the other hand, after defining a new index on the idword column (with "enable_seqscan" back on), the query uses that index:

    create index ix_tempIndex on descriptionprodftdiclnk(idword);
    CREATE INDEX
    explain analyze select * from descriptionprodftdiclnk where idword=44;
                                                                   QUERY PLAN
    -------------------------------------------------------------------------------------------------------------------------------------------
     Index Scan using ix_tempindex on descriptionprodftdiclnk  (cost=0.00..916.24 rows=44388 width=8) (actual time=0.021..79.879 rows=43792 loops=1)
       Index Cond: (idword = 44)
     Total runtime: 107.081 ms
    (3 rows)

Could someone provide an explanation for the planner's behaviour?

Thanks for your help,
Daniel
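The per-column statistics behind these estimates can be inspected directly. A minimal sketch, assuming the table above and that it has been analyzed recently:

    -- What the planner thinks it knows about each column.
    SELECT attname, null_frac, n_distinct, correlation
    FROM pg_stats
    WHERE schemaname = 'public'
      AND tablename  = 'descriptionprodftdiclnk';

    -- Refresh the statistics if they might be stale.
    ANALYZE descriptionprodftdiclnk;

A correlation near +1 or -1 means the column's values closely track the physical row order, which makes index scans cheap; a value near 0 makes the planner price an index scan as mostly random I/O.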
Daniel,

> Table "public.descriptionprodftdiclnk"

What is this, German? ;-)

> explain analyze select * from descriptionprodftdiclnk where idword=44;
>  Seq Scan on descriptionprodftdiclnk  (cost=0.00..4788.14 rows=44388 width=8)
>  (actual time=87.582..168.041 rows=43792 loops=1)
>    Filter: (idword = 44)
>  Total runtime: 195.339 ms

> explain analyze select * from descriptionprodftdiclnk where idword=44;
>  Index Scan using descriptionprodftdiclnk_pkey on descriptionprodftdiclnk
>  (cost=0.00..36720.39 rows=44388 width=8) (actual time=0.205..73.489 rows=43792 loops=1)
>    Index Cond: (idword = 44)
>  Total runtime: 100.564 ms

> create index ix_tempIndex on descriptionprodftdiclnk(idword);
> CREATE INDEX
> explain analyze select * from descriptionprodftdiclnk where idword=44;
>  Index Scan using ix_tempindex on descriptionprodftdiclnk
>  (cost=0.00..916.24 rows=44388 width=8) (actual time=0.021..79.879 rows=43792 loops=1)
>    Index Cond: (idword = 44)
>  Total runtime: 107.081 ms
>
> Could someone provide an explanation for the planner's behaviour?

Pretty simple, really. Look at the cost calculations for the index scan on the multi-column index. PostgreSQL believes that:

    the cost of a seq scan is            4788.14
    the cost of a 2-column index scan is 36720.39
    the cost of a 1-column index scan is 916.24

Assuming that you ran each of these queries multiple times to eliminate caching as a factor, the issue is that the cost calculations are wrong.
We give you a number of GUC variables to change that:

    effective_cache_size
    random_page_cost
    cpu_tuple_cost
    etc.

See the RUNTIME-CONFIGURATION docs for more details.

--
Josh Berkus
Aglio Database Solutions
San Francisco
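Those settings can be tried per-session before touching postgresql.conf; a sketch (the values shown are illustrative, not recommendations):

    -- Experiment with planner cost parameters for this session only.
    SET effective_cache_size = 100000;  -- measured in 8kB pages here; size to your RAM
    SET random_page_cost = 2.0;         -- default is 4.0; lower if data is mostly cached

    EXPLAIN ANALYZE
    SELECT * FROM descriptionprodftdiclnk WHERE idword = 44;

    RESET ALL;  -- drop back to the server defaults

If a lower random_page_cost makes the planner pick the index scan that is in fact faster on your hardware, that is a sign the default cost model is mispriced for your setup.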
Whoa Josh! I don't believe you're going to reduce the cost by a factor of ten through a bit of tweaking - not without lowering the sequential scan cost as well. The only thing I can think of is that perhaps his primary index drastically needs repacking. Otherwise, isn't there a real anomaly here? Halving the key width might account for some of it, but it's still miles out of court.

Actually, I'm surprised the planner came up with such a low cost for the single-column index, unless ... perhaps correlation statistics aren't used when determining costs for multi-column indexes?

Josh Berkus wrote:
> Pretty simple, really. Look at the cost calculations for the index scan for
> the multi-column index. PostgreSQL believes that:
> The cost of a seq scan is 4788.14
> The cost of a 2-column index scan is 36720.39
> The cost of a 1-column index scan is 916.24
>
> Assuming that you ran each of these queries multiple times to eliminate
> caching as a factor, the issue is that the cost calculations are wrong. We
> give you a number of GUC variables to change that:
> effective_cache_size
> random_page_cost
> cpu_tuple_cost
> etc.
>
> See the RUNTIME-CONFIGURATION docs for more details.
David Brown <time@bigpond.net.au> writes:
> Actually, I'm surprised the planner came up with such a low cost for the
> single column index, unless ... perhaps correlation statistics aren't
> used when determining costs for multi-column indexes?

The correlation calculation for multi-column indexes is pretty whacked out pre-8.0. I don't think it's that great in 8.0 either --- we really need to make ANALYZE calculate the correlation explicitly for each index, probably, rather than trying to use per-column correlations.

			regards, tom lane
On Wed, 16 Mar 2005 22:19:13 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> calculate the correlation explicitly for each index

Maybe it's time to revisit an old proposal that failed to catch anybody's attention during the 7.4 beta period:
http://archives.postgresql.org/pgsql-hackers/2003-08/msg00937.php

I'm not sure I'd store index correlation in a separate table today. You've invented something better for functional index statistics, AFAIR.

Servus
 Manfred
> May be it's time to revisit an old proposal that has failed to catch
> anybody's attention during the 7.4 beta period:
> http://archives.postgresql.org/pgsql-hackers/2003-08/msg00937.php
>
> I'm not sure I'd store index correlation in a separate table today.
> You've invented something better for functional index statistics, AFAIR.

Make it deal with cross-table fk correlations as well :)

Chris
On Thu, 17 Mar 2005 16:55:15 +0800, Christopher Kings-Lynne <chriskl@familyhealth.com.au> wrote:
> Make it deal with cross-table fk correlations as well :)

That's a different story. I guess it boils down to cross-column statistics for a single table. Part of this is the correlation between values in two or more columns, which is not the same as the correlation between column (or index tuple) values and tuple positions.

And yes, I did notice the smiley ;-)

Servus
 Manfred
Manfred Koizar <mkoi-pg@aon.at> writes:
> On Wed, 16 Mar 2005 22:19:13 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> calculate the correlation explicitly for each index

> May be it's time to revisit an old proposal that has failed to catch
> anybody's attention during the 7.4 beta period:
> http://archives.postgresql.org/pgsql-hackers/2003-08/msg00937.php

> I'm not sure I'd store index correlation in a separate table today.
> You've invented something better for functional index statistics, AFAIR.

Well, the original motivation for calculating correlations on columns was that historically, you didn't need to re-ANALYZE after creating an index: the stats on the base table were already in place. So the idea was to have the correlations already available whether or not the index existed. This works fine for plain indexes on single columns ;-). We didn't realize (or at least I didn't) how poorly the per-column stats apply to multi-column indexes.

I am coming around to the view that we really do need to calculate index-specific correlation numbers, and that probably does need a special table ... or maybe better, add a column to pg_index. The column in pg_statistic is useless and should be removed, because there isn't any need for per-column correlation.

Now, as to the actual mechanics of getting the numbers: the above link seems to imply reading the whole index in index order. Which is a hugely expensive proposition for a big index, especially one that's grown rather than been built recently --- the physical and logical orderings of the index will be different. (Hm, maybe we need a stat about the extent of disorder within the index itself?) We need a way to get the number from a small sample of pages.

The idea I was toying with was to recalculate the index keys for the sample rows that ANALYZE already acquires, and then compare/sort those. This is moderately expensive CPU-wise though, and it's also not clear what "compare/sort" means for non-btree indexes.
If we could get a correlation estimate by probing only a small fraction of the index pages, that would work, but in a disordered index I'm not sure how you figure out what you're looking at.

			regards, tom lane
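For experimentation, the whole-table version of this calculation can be approximated in plain SQL on a recent server (window functions, so PostgreSQL 8.4 or later; it reads the entire table, so it is a sanity check, not something ANALYZE could afford to do):

    -- How well does (idword, idqualifier) key order track physical heap order?
    SELECT corr(phys_pos, key_rank) AS index_correlation
    FROM (
        SELECT row_number() OVER (ORDER BY ctid)                AS phys_pos,
               rank()       OVER (ORDER BY idword, idqualifier) AS key_rank
        FROM descriptionprodftdiclnk
    ) AS ordered;

Values near +1 or -1 mean the index order and heap order agree, so an index scan in key order reads the heap nearly sequentially; values near 0 mean it hops all over the heap.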
On Thu, 17 Mar 2005 23:48:30 -0800, Ron Mayer <rm_pg@cheapcomplexdevices.com> wrote:
> Would this also help estimates in the case where values in a table
> are tightly clustered, though not in strictly ascending or descending
> order?

No, I was just expanding the existing notion of correlation from single columns to index tuples.

> For example, address data has many fields that are related
> to each other (postal codes, cities, states/provinces).

This looks like a case for cross-column statistics, though you might not have meant it as such.

I guess what you're talking about can also be described with a single column. In a list like

    3 3 ... 3 1 1 ... 1 7 7 ... 7 4 4 ... 4 ...

equal items are "clustered" together but the values are not "correlated" to their positions. This would require a whole new column characteristic: something like the probability that we find the same value in adjacent heap tuples, or the number of different values we can expect on one heap page. The latter might even be easy to compute during ANALYSE.

Servus
 Manfred
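That "number of different values per heap page" statistic can be probed by hand; a sketch using the ctid system column (the text/point cast is a convenience hack for extracting the page number, and the query scans the whole table):

    -- Rough "clusteredness" measure: distinct idword values per heap page.
    SELECT avg(n) AS avg_distinct_per_page
    FROM (
        SELECT (ctid::text::point)[0]::int AS page,
               count(DISTINCT idword)      AS n
        FROM descriptionprodftdiclnk
        GROUP BY 1
    ) AS per_page;

A result close to 1 means each page holds mostly one value, so fetching all rows for a given idword touches few pages even when the ordinary correlation statistic is near zero.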
On Thu, 17 Mar 2005 13:15:32 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I am coming around to the view that we really do need to calculate
> index-specific correlation numbers,

Correlation is a first step. We might also want distribution information like number of distinct index tuples and histograms.

> Now, as to the actual mechanics of getting the numbers: the above link
> seems to imply reading the whole index in index order.

That turned out to be surprisingly easy (no need to look at data values, no operator lookup, etc.) to implement as a proof of concept. As it's good enough for my use cases I never bothered to change it.

> Which is a hugely expensive proposition for a big index,

Just a thought: Could the gathering of the sample be integrated into the bulk delete phase of VACUUM? (I know, ANALYSE is not always performed as an option to VACUUM, and VACUUM might not even have to delete any index tuples.)

> We need a way to get the number from a small sample of pages.

I had better (or at least different) ideas at that time, like walking down the tree, but somehow lost impetus :-(

> The idea I was toying with was to recalculate the index keys for the
> sample rows that ANALYZE already acquires, and then compare/sort
> those.

This seems to be the approach that perfectly fits into what we have now.

> This is moderately expensive CPU-wise though, and it's also not
> clear what "compare/sort" means for non-btree indexes.

Nothing. We'd need some notion of "clusteredness" instead of correlation. C.f. my answer to Ron in this thread.

BTW, the more I think about it, the more I come to the conclusion that when the planner starts to account for "clusteredness", random page cost has to be raised.

Servus
 Manfred