Re: 500x speed-down: Wrong query plan? - Mailing list pgsql-performance

From Alessandro Baretta
Subject Re: 500x speed-down: Wrong query plan?
Date
Msg-id 43C28DEA.60801@barettadeit.com
Whole thread Raw
In response to Re: 500x speed-down: Wrong query plan?  (Matteo Beccati <php@beccati.com>)
Responses Re: 500x speed-down: Wrong query plan?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Matteo Beccati wrote:
> Hi Alessandro,
>
>>  Nested Loop  (cost=0.00..1017.15 rows=1 width=1146) (actual
>> time=258.648..258.648 rows=0 loops=1)
>>    ->  Seq Scan on ubicazione  (cost=0.00..1011.45 rows=1 width=536)
>> (actual time=0.065..51.617 rows=12036 loops=1)
>>          Filter: ((id_ente = 'dmd'::text) AND (allarme IS NULL) AND
>> (manutenzione IS NULL))
>
>
> The problem seems here. The planner expects one matching row (and that's
> why it chooses a nested loop), but 12036 rows are matching this condition.
>
> Are you sure that you recentrly ANALYZED the table "ubicazione"? If so,
> try to increase statistics for the id_ente column.

No, this is not the problem. I increased the amount of statistics with ALTER
TABLE ... SET STATISTICS 1000, which is as much as I can have. The problem is
that the planner simply ignores the right query plan, which is orders of
magnitude less costly. Keep in mind that the XDBS--the CASE tool I use--makes
heavy use of indexes, and generates all relevant indexes in relation to the join
paths which are implicit in the ER model "relations". In this case, both
ubicazione and articolo have indexes on the join fields:

Indexes:
"articolo_pkey" primary key, btree (id_ente, id_produttore, id_articolo)
"ubicazione_fkey_articolo" btree (id_ente, id_produttore, id_articolo)

Notice that only the "articolo_pkey" is a unique index, while
"ubicazione_fkey_articolo" allows duplicates. This second index is not used by
the planner.

Both tables also have a "bookkeeping" index on xdbs_modified. I am selecting
"recently inserted or updated" tuples, which are usually a very small fraction
of the table--if there are any. The index on xdbs_modified is B-tree allowing a
very quick index scan to find the few tuples having xdbs_modified > '[some
recent timestamp]'. Hence, the optimal plan for both queries is to perform an
index scan using the <table_name>_modified_index on the table upon which I
specify the xdbs_modified > '...' condition, and the join-fields index on the
other table.

Alex

--
*********************************************************************
http://www.barettadeit.com/
Baretta DE&IT
A division of Baretta SRL

tel. +39 02 370 111 55
fax. +39 02 370 111 54

Our technology:

The Application System/Xcaml (AS/Xcaml)
<http://www.asxcaml.org/>

The FreerP Project
<http://www.freerp.org/>

pgsql-performance by date:

Previous
From: Szűcs Gábor
Date:
Subject: Re: Avoiding cartesian product
Next
From: peter royal
Date:
Subject: Re: help tuning queries on large database