Home > mailing lists

Re: Seq scan vs. Index scan with different query - Mailing list pgsql-performance

From	Andrew McMillan
Subject	Re: Seq scan vs. Index scan with different query
Date	July 5, 2004 08:44:25
Msg-id	1089027854.6664.197.camel@lamb.mcmillan.net.nz Whole thread Raw
In response to	Seq scan vs. Index scan with different query conditions (eleven@ludojad.itpp.pl)
Responses	Re: Seq scan vs. Index scan with different query conditions
List	pgsql-performance

Tree view

On Mon, 2004-07-05 at 12:15 +0200, eleven@ludojad.itpp.pl wrote:
> Hello,
>
> Can anybody suggest any hint on this:
>
> temp=> EXPLAIN SELECT DISTINCT "number" FROM "tablex" WHERE "Date" BETWEEN '2004-06-28'::date AND '2004-07-04'::date
AND"Time" BETWEEN '00:00:00'::time AND '18:01:00'::time; 
>
> Unique  (cost=305669.92..306119.43 rows=89 width=8)
>     ->  Sort  (cost=305669.92..305894.67 rows=89903 width=8)
>         Sort Key: "number"
>             ->  Index Scan using "DateTimeIndex" on "tablex"  (cost=0.00..298272.66 rows=89903 width=8)
>                 Index Cond: (("Date" >= '2004-06-28'::date) AND ("Date" <= '2004-07-04'::date) AND ("Time" >=
'00:00:00'::timewithout time zone) AND ("Time" <= '18:01:00'::time without time zone)) 
>
>
> temp=> EXPLAIN SELECT DISTINCT "number" FROM "tablex" WHERE "Date" BETWEEN '2004-06-28'::date AND '2004-07-04'::date
AND"Time" BETWEEN '00:00:00'::time AND '19:01:00'::time; 
>
> Unique  (cost=315252.77..315742.27 rows=97 width=8)
>     ->  Sort  (cost=315252.77..315497.52 rows=97900 width=8)
>         Sort Key: "number"
>             ->  Seq Scan on "tablex"  (cost=0.00..307137.34 rows=97900 width=8)
>             Filter: (("Date" >= '2004-06-28'::date) AND ("Date" <= '2004-07-04'::date) AND ("Time" >=
'00:00:00'::timewithout time zone) AND ("Time" <= '19:01:00'::time without time zone)) 
>
> Basically, the difference is in upper "Time" value (as you can see, it's
> 18:01:00 in the first query and 19:01:00 in the other one).
> The question is - why does it use index in first case and
> it tries to do full sequential scan when the upper "Time" value
> is different?
>
> DateTimeIndex was created on both columns (Date/Time):
> CREATE INDEX "DateTimeIndex" ON "tablex" USING btree ("Date", "Time");

PostgreSQL is always going to switch at some point, where the number of
rows that have to be read from the table exceed some percentage of the
total rows in the table.

We can possibly be more helpful if you send EXPLAIN ANALYZE, rather than
just EXPLAIN.

A few things to be careful of:

- Is this supposed to be a slice of midnight to 6pm, for each day
between 28 June and 4 July?  If you want a continuous period from
Midnight 28 June -> 6pm 4 July you're better to have a single timestamp
field.

- It is unlikely that the , "Time" on your index is adding much to your
selectivity, and it may be that you would be better off without it.

- the DISTINCT can screw up your results, and it usually means that the
SQL is not really the best it could be.  A _real_ need for DISTINCT is
quite rare in my experience, and from what I have seen it adds overhead
and tends to encourage bad query plans when used unnecessarily.

Hope this is some help.

Regards,
                    Andrew McMillan

-------------------------------------------------------------------------
Andrew @ Catalyst .Net .NZ  Ltd,  PO Box 11-053, Manners St,  Wellington
WEB: http://catalyst.net.nz/            PHYS: Level 2, 150-154 Willis St
DDI: +64(4)803-2201      MOB: +64(272)DEBIAN      OFFICE: +64(4)499-2267
      Make things as simple as possible, but no simpler -- Einstein
-------------------------------------------------------------------------

Attachment

signature.asc

pgsql-performance by date:

From: Richard Huxton
Date: 05 July 2004, 08:41:25
Subject: Re: Seq scan vs. Index scan with different query conditions

From: eleven@ludojad.itpp.pl
Date: 05 July 2004, 10:42:22
Subject: Re: Seq scan vs. Index scan with different query conditions

Re: Seq scan vs. Index scan with different query - Mailing list pgsql-performance

Attachment

Previous

Next