Thread: Query optimisation

Query optimisation

From
Michel Vrand
Date:
Hi!
I first apologize for my poor english.

We are working with linux-apache-postgresql-php3.
One among our queries is structured as follow :

$result = pg_exec ($conn, "SELECT produits.id_produit, produits.id_proprio,
articles.categ, groupes.nom, produits.pvttc_prod, articles.titre,
articles.type_1, articles.obs_art
FROM produits, articles, groupes
WHERE $conditions  $relations;");

$conditions may be

1/ ...AND groupes.nom = '$something' AND....
or
2/ ...AND groupes.nom ~* '$something' AND....

In the first case, the query is reasonnably fast (0.30 s for 4 items on
15000)
In the second case, the query becomes very slow (more than 31 s for the same
result)

to give example, in the first case $something = "Beatles"                  in the second case $something = "beatl"

How to optimise speed ? I tried to type EXPLAIN but I do not understand the
result :

For 1/
Nested loop (cost=0.00 size=1 width=86)
-> Merge Join (cost=0.00 size=1 width=72)   -> Seq Scan (cost=0.00 size=0 width=0)       -> Sort (cost=0.00 size=0
width=0)          -> Seq Scan on produits (cost=0.00 size=0 width=16)   -> Seq Scan (cost=0.00 size=0 width=0)       ->
Sort(cost=0.00 size=0 width=0)           -> Index Scan using type_1.idx on articles (cost=0.00 size=0
 
width=56)
-> Index Scan using nom_groupe.idx on groupes (cost=0.00 size=0 width=16)

For 2/
Nested loop (cost=0.00 size=1 width=86)
-> Merge Join (cost=0.00 size=1 width=72)   -> Seq Scan (cost=0.00 size=0 width=0)       -> Sort (cost=0.00 size=0
width=0)          -> Seq Scan on produits (cost=0.00 size=0 width=16)   -> Seq Scan (cost=0.00 size=0 width=0)       ->
Sort(cost=0.00 size=0 width=0)           -> Index Scan using type_1.idx on articles (cost=0.00 size=0
 
width=56)
-> Seq Scan on groupes (cost=0.00 size=0 width=16)

The only difference seems to be the use (or not) of index in the last line.
Do you think it causes such a decrease of speed ? How to work around ?

Thanks in advance




Re: Query optimisation

From
"Richard Huxton"
Date:
From: "Michel Vrand" <mvrand@pacwan.fr>


> $conditions may be
>
> 1/ ...AND groupes.nom = '$something' AND....
> or
> 2/ ...AND groupes.nom ~* '$something' AND....
>
> In the first case, the query is reasonnably fast (0.30 s for 4 items on
> 15000)
> In the second case, the query becomes very slow (more than 31 s for the
same
> result)
>
> to give example, in the first case $something = "Beatles"
>                    in the second case $something = "beatl"
>
> How to optimise speed ? I tried to type EXPLAIN but I do not understand
the
> result :
>
You are right - the index *does* make that much difference. The problem is
that ~* matches *anywhere* in the string so it can't use the index - it has
to read each entry.

If you want to match the start of the entry, I know of one way to make the
index work. Use

groupes.nom>='beatl' AND groupes.nom<='beatlz'

And that should work - although 'z' is a poor character to use - pick the
highest valid character in your character set. You might also want to look
in the archives for the thread on pgsql-general with a subject of 'Simple
Question: Case sensitivity'

- Richard Huxton