Re: How index are running and how to optimise ? - Mailing list pgsql-general

From scott.marlowe
Subject Re: How index are running and how to optimise ?
Date
Msg-id Pine.LNX.4.33.0403031148480.8436-100000@css120.ihs.com
Whole thread Raw
In response to How index are running and how to optimise ?  (Hervé Piedvache <herve@elma.fr>)
List pgsql-general
On Wed, 3 Mar 2004, [iso-8859-15] Hervé Piedvache wrote:

> Hi,
>
> I have may be a stupid question, but I'm a little surprised with some explains
> I have, using date fields ...
>
> I would like to understand exactly when index are used ...
> I'm using PostgresQL 7.4.1
>
> I have a table with 351 000 records.
> I have about 300 to 600 new records by day
> I have an index like this :
> ix_contracts_start_stop_date btree (start_date, stop_date)
>
> I want to simply do something like this :
>
> select o.id_contract
>    from contracts o
>  where o.start_date <= '2001-10-31'
>     and (o.stop_date > '2001-11-06' or stop_date is null);
>
> OK I get an explain like this :
>                                                   QUERY PLAN
> --------------------------------------------------------------------------------------------------------------
>  Seq Scan on contracts o  (cost=0.00..12021.80 rows=160823 width=4)
>    Filter: ((start_date <= '2001-10-31'::date) AND ((stop_date >
> '2001-11-06'::date) OR (stop_date IS NULL)))

Notice the planner is expecting to get back 160823 rows here.  How many
does it actually return?

> I understand that the OR could make the no use of the stop_date index ..., but
> why I'm not using the index for the start_date part ?
>
> Index are used only if I use an egality like this :
>
> select o.id_contract
>    from contracts o
>  where o.start_date = '2001-10-31'
>     and o.stop_date = '2001-11-06';

No, you don't have to do that.  You should be able to use a range and get
an index scan IF said index scan will be faster (in the query planner's
estimate.)

explain select * from test where dt>'2004-01-01 00:00:00' and
dt<'2004-01-02 00:00:00';
                                                               QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using test_dt on test  (cost=0.00..118628.84 rows=29793
width=51)
   Index Cond: ((dt > '2004-01-01 00:00:00'::timestamp without time zone)
AND (dt < '2004-01-02 00:00:00'::timestamp without time zone))
(2 rows)

Notice the use of an index there.

>                                           QUERY PLAN
> ------------------------------------------------------------------------------------------------
>  Index Scan using ix_contracts_start_stop_date on contracts o
> (cost=0.00..6.00 rows=1 width=4)
>    Index Cond: ((start_date = '2001-10-31'::date) AND (stop_date =
> '2001-11-06'::date))

Here the planner expects ONE row.  Of course it's using an index.

> Could you please explain me why index are not used with <, > and how I can
> optimise my request ... I have no idea but I'm using this request to do
> insert in another table and this segmentation take 13 hours for making the
> insert ! :o((

It may well be the inserts that are slow and not the selects.  how long
does the select, by itself, take to run?


pgsql-general by date:

Previous
From: Jan Wieck
Date:
Subject: Slony-I makes progress
Next
From: "scott.marlowe"
Date:
Subject: Re: Moving from MySQL to PGSQL....some questions (multilevel