Thread: Problem with indexes

Problem with indexes

From
Guillaume Lémery
Date:
Hi !

I have a big problem using Indexes which slows my database.

I have a table with lot of data (> 500 000):

CREATE TABLE accord_editeur
(
id_regie int8 not null,
num_campagne int8 not null,
num_publicite int8 not null,
num_editeur int8 not null,
num_site int8 not null,
num_emplacement int8 not null,
num_periode int8,
par_id_technologie int8 not null,
affichage_possible int2 default 0,
ponderation_calculee int8,
date_pilotage timestamp NULL,
id_ct1 int8,
PRIMARY KEY(id_regie,num_campagne,num_publicite
,num_editeur,num_site,num_emplacement)
);

And an Index :

CREATE INDEX ae_tracking_idx ON accord_editeur(id_regie, num_editeur,
num_site, num_emplacement);

If I do an EXPLAIN on this  :

SELECT ae.id_regie,
ae.num_campagne,
ae.num_publicite,
ae.ponderation_calculee,
ae.num_periode
FROM accord_editeur ae
WHERE ae.id_regie = 1
AND ae.num_editeur = 1494
AND ae.num_site = 1
AND ae.num_emplacement = 1
AND ae.affichage_possible = 1

I get :

Seq Scan on accord_editeur ae  (cost=0.00..19349.71 rows=1 width=40)


but I expected an Index Scan, what's the matter with this Index ??

Thanx

Guillaume Lémery


Re: Problem with indexes

From
pov@club-internet.fr (Yann Coupin)
Date:
hi,

glemery@comclick.com (Guillaume L�mery) wrote in <3A65C7E4.3020202
@comclick.com>:
[...]
>And an Index :
>
>CREATE INDEX ae_tracking_idx ON accord_editeur(id_regie, num_editeur,
>num_site, num_emplacement);

In this Index creation you specified that *one* index will index *four*
rows, and to do that, postgres can only use btree index. This type of index
is usefull when you search a range of value with those operators : < > <= or
>=

>If I do an EXPLAIN on this  :
>
>SELECT ae.id_regie,
>ae.num_campagne,
>ae.num_publicite,
>ae.ponderation_calculee,
>ae.num_periode
>FROM accord_editeur ae
>WHERE ae.id_regie = 1
>AND ae.num_editeur = 1494
>AND ae.num_site = 1
>AND ae.num_emplacement = 1
>AND ae.affichage_possible = 1

But that's not at all what you do, you search values that match exactly to
one value, and to achieve this result you have to create four individuals
index of type 'hash' using this list of commands :

CREATE INDEX ae_ir_idx ON accord_editeur USING hash (id_regie);
CREATE INDEX ae_ned_idx ON accord_editeur USING hash (num_editeur);
CREATE INDEX ae_ns_idx ON accord_editeur USING hash (num_site);
CREATE INDEX ae_nem_idx ON accord_editeur USING hash (num_emplacement);

But even in this condition index aren't always the most efficient way to
achieve the result especially if you have many rows with the same values. In
this case it's more efficient to do a seq scan.

Yann

Re: Problem with indexes

From
Tom Lane
Date:
Guillaume =?ISO-8859-1?Q?L=E9mery?= <glemery@comclick.com> writes:
> CREATE TABLE accord_editeur
> (
> id_regie int8 not null,
> num_campagne int8 not null,
> num_publicite int8 not null,
> num_editeur int8 not null,
> num_site int8 not null,
> num_emplacement int8 not null,
> num_periode int8,
> ...

> CREATE INDEX ae_tracking_idx ON accord_editeur(id_regie, num_editeur,
> num_site, num_emplacement);

> If I do an EXPLAIN on this  :

> SELECT ae.id_regie,
> ae.num_campagne,
> ae.num_publicite,
> ae.ponderation_calculee,
> ae.num_periode
> FROM accord_editeur ae
> WHERE ae.id_regie = 1
> AND ae.num_editeur = 1494
> AND ae.num_site = 1
> AND ae.num_emplacement = 1
> AND ae.affichage_possible = 1

> I get :

> Seq Scan on accord_editeur ae  (cost=0.00..19349.71 rows=1 width=40)

The problem is that the system is not very smart about converting
cross-datatype comparisons into indexscans, and what you have written
is comparisons between int8 fields and int4 constants.  If you write
the query as

    WHERE ae.id_regie = 1::int8
    AND ae.num_editeur = 1494::int8
    AND ae.num_site = 1::int8
    AND ae.num_emplacement = 1::int8
    ...

then you will get an indexscan.  My advice, however, would be to think
carefully about whether you really *need* int8 fields, or could save
space and notational hassle by using int4 fields instead.  (You should
also think twice about whether a four-component index really makes
sense, but that's a different discussion.)

At some point we will figure out how to get the system to assign types
to constants more intelligently --- this same issue causes problems for
people who write "numericfield = 12.34", for example, since 12.34 is
taken as a float8 constant by default.  It's not easy to do that without
doing severe damage to the notion of datatype extensibility, however :-(

            regards, tom lane