Thread: How index are running and how to optimise ?
Hi, I have may be a stupid question, but I'm a little surprised with some explains I have, using date fields ... I would like to understand exactly when index are used ... I'm using PostgresQL 7.4.1 I have a table with 351 000 records. I have about 300 to 600 new records by day I have an index like this : ix_contracts_start_stop_date btree (start_date, stop_date) I want to simply do something like this : select o.id_contract from contracts o where o.start_date <= '2001-10-31' and (o.stop_date > '2001-11-06' or stop_date is null); OK I get an explain like this : QUERY PLAN -------------------------------------------------------------------------------------------------------------- Seq Scan on contracts o (cost=0.00..12021.80 rows=160823 width=4) Filter: ((start_date <= '2001-10-31'::date) AND ((stop_date > '2001-11-06'::date) OR (stop_date IS NULL))) I understand that the OR could make the no use of the stop_date index ..., but why I'm not using the index for the start_date part ? Index are used only if I use an egality like this : select o.id_contract from contracts o where o.start_date = '2001-10-31' and o.stop_date = '2001-11-06'; QUERY PLAN ------------------------------------------------------------------------------------------------ Index Scan using ix_contracts_start_stop_date on contracts o (cost=0.00..6.00 rows=1 width=4) Index Cond: ((start_date = '2001-10-31'::date) AND (stop_date = '2001-11-06'::date)) Could you please explain me why index are not used with <, > and how I can optimise my request ... I have no idea but I'm using this request to do insert in another table and this segmentation take 13 hours for making the insert ! :o(( Thanks for help, -- Hervé Piedvache Elma Ingénierie Informatique 6 rue du Faubourg Saint-Honoré F-75008 - Paris - France Pho. 33-144949901 Fax. 33-144949902
Hervé Piedvache wrote: >Hi, > >I have may be a stupid question, but I'm a little surprised with some explains >I have, using date fields ... > >I would like to understand exactly when index are used ... >I'm using PostgresQL 7.4.1 > >I have a table with 351 000 records. >I have about 300 to 600 new records by day >I have an index like this : >ix_contracts_start_stop_date btree (start_date, stop_date) > >I want to simply do something like this : > >select o.id_contract > from contracts o > where o.start_date <= '2001-10-31' > and (o.stop_date > '2001-11-06' or stop_date is null); > >OK I get an explain like this : > QUERY PLAN >-------------------------------------------------------------------------------------------------------------- > Seq Scan on contracts o (cost=0.00..12021.80 rows=160823 width=4) > Filter: ((start_date <= '2001-10-31'::date) AND ((stop_date > >'2001-11-06'::date) OR (stop_date IS NULL))) > >I understand that the OR could make the no use of the stop_date index ..., but >why I'm not using the index for the start_date part ? > >Index are used only if I use an egality like this : > >select o.id_contract > from contracts o > where o.start_date = '2001-10-31' > and o.stop_date = '2001-11-06'; > > QUERY PLAN >------------------------------------------------------------------------------------------------ > Index Scan using ix_contracts_start_stop_date on contracts o >(cost=0.00..6.00 rows=1 width=4) > Index Cond: ((start_date = '2001-10-31'::date) AND (stop_date = >'2001-11-06'::date)) > >Could you please explain me why index are not used with <, > and how I can >optimise my request ... I have no idea but I'm using this request to do >insert in another table and this segmentation take 13 hours for making the >insert ! :o(( > >Thanks for help, > > Have you ANALYZEd recently? If not you need to do that regularly. Try VACUUM ANALYZE contracts; to vacuum that specific table. Could you also try select o.id_contract from contracts o where o.start_date NOT BETWEEN '2001-10-31' AND '2001-11-06' OR o.stop_date IS NULL; Also could you paste the results of EXPLAIN ANALYZE instead of EXPLAIN. Cheers Nick
On Wed, 3 Mar 2004, [iso-8859-15] Hervé Piedvache wrote: > Hi, > > I have may be a stupid question, but I'm a little surprised with some explains > I have, using date fields ... > > I would like to understand exactly when index are used ... > I'm using PostgresQL 7.4.1 > > I have a table with 351 000 records. > I have about 300 to 600 new records by day > I have an index like this : > ix_contracts_start_stop_date btree (start_date, stop_date) > > I want to simply do something like this : > > select o.id_contract > from contracts o > where o.start_date <= '2001-10-31' > and (o.stop_date > '2001-11-06' or stop_date is null); > > OK I get an explain like this : > QUERY PLAN > -------------------------------------------------------------------------------------------------------------- > Seq Scan on contracts o (cost=0.00..12021.80 rows=160823 width=4) > Filter: ((start_date <= '2001-10-31'::date) AND ((stop_date > > '2001-11-06'::date) OR (stop_date IS NULL))) Notice the planner is expecting to get back 160823 rows here. How many does it actually return? > I understand that the OR could make the no use of the stop_date index ..., but > why I'm not using the index for the start_date part ? > > Index are used only if I use an egality like this : > > select o.id_contract > from contracts o > where o.start_date = '2001-10-31' > and o.stop_date = '2001-11-06'; No, you don't have to do that. You should be able to use a range and get an index scan IF said index scan will be faster (in the query planner's estimate.) explain select * from test where dt>'2004-01-01 00:00:00' and dt<'2004-01-02 00:00:00'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- Index Scan using test_dt on test (cost=0.00..118628.84 rows=29793 width=51) Index Cond: ((dt > '2004-01-01 00:00:00'::timestamp without time zone) AND (dt < '2004-01-02 00:00:00'::timestamp without time zone)) (2 rows) Notice the use of an index there. > QUERY PLAN > ------------------------------------------------------------------------------------------------ > Index Scan using ix_contracts_start_stop_date on contracts o > (cost=0.00..6.00 rows=1 width=4) > Index Cond: ((start_date = '2001-10-31'::date) AND (stop_date = > '2001-11-06'::date)) Here the planner expects ONE row. Of course it's using an index. > Could you please explain me why index are not used with <, > and how I can > optimise my request ... I have no idea but I'm using this request to do > insert in another table and this segmentation take 13 hours for making the > insert ! :o(( It may well be the inserts that are slow and not the selects. how long does the select, by itself, take to run?