Re: Seq scan vs. Index scan with different query - Mailing list pgsql-performance
From | Andrew McMillan |
---|---|
Subject | Re: Seq scan vs. Index scan with different query |
Date | |
Msg-id | 1089055325.6664.220.camel@lamb.mcmillan.net.nz Whole thread Raw |
In response to | Re: Seq scan vs. Index scan with different query conditions (eleven@ludojad.itpp.pl) |
List | pgsql-performance |
On Mon, 2004-07-05 at 15:46 +0200, eleven@ludojad.itpp.pl wrote: > On Mon, Jul 05, 2004 at 11:44:13PM +1200, Andrew McMillan wrote: > > > > DateTimeIndex was created on both columns (Date/Time): > > > CREATE INDEX "DateTimeIndex" ON "tablex" USING btree ("Date", "Time"); > > PostgreSQL is always going to switch at some point, where the number of > > rows that have to be read from the table exceed some percentage of the > > total rows in the table. > > We can possibly be more helpful if you send EXPLAIN ANALYZE, rather than > > just EXPLAIN. > > Unfortunately that seq scan vs. index scan > heuristic was wrong - full scan kills the machine > in no time due to large amount of INSERTs happening > in the background (I/O bottleneck). In that case you could perhaps consider tweaking various parameters in your postgresql.conf - with an ideal setup the switch should happen when the costs are roughly equal. Have you gone through the information here: http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html Also, if table rows are regularly DELETEd or UPDATEd then you will need to ensure it is regularly vacuumed. Does a "VACUUM VERBOSE tablex" show a large number of dead tuples? Are you running pg_autovacuum? Do you get similar results immediately after a "VACUUM FULL ANALYZE tablex"? Possibly there is an uneven distribution of rows in the table. You could consider increasing the statistics target: ALTER TABLE tablex ALTER COLUMN "Date" SET STATISTICS; ANALYZE tablex; > > - Is this supposed to be a slice of midnight to 6pm, for each day > > between 28 June and 4 July? If you want a continuous period from > > Midnight 28 June -> 6pm 4 July you're better to have a single timestamp > > field. > > - It is unlikely that the , "Time" on your index is adding much to your > > selectivity, and it may be that you would be better off without it. > > Yes, we've figured out that index on Date + Time is rather useless. > Thanks for the tip, we've created index upon Date column instead and > it should be enough. It may be that you are better with a single timestamp column with an index on it in any case, if you want the data sorted in timestamp order. Then you can ORDER BY <timestamp> as well, which will encourage the index use further (although this advantage tends to get lost with the DISTINCT). You can still access the time part for a separate comparison just with a cast. > > - the DISTINCT can screw up your results, and it usually means that the > > SQL is not really the best it could be. A _real_ need for DISTINCT is > > quite rare in my experience, and from what I have seen it adds overhead > > and tends to encourage bad query plans when used unnecessarily. > > What do you mean? The reason for which there's DISTINCT in that query is > because I want to know how many unique rows is in the table. > Do you suggest selecting all rows and doing "DISTINCT"/counting > on the application level? That's fine, I've just seen it used far too many times as a substitute for having an extra join, or an application that should only be inserting unique rows in the first place. Things like that. It's just one of those things that always sets off alarm bells when I'm reviewing someone else's work, and on most of these occasions it has not been justified when reexamined. Cheers, Andrew. ------------------------------------------------------------------------- Andrew @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St DDI: +64(4)803-2201 MOB: +64(272)DEBIAN OFFICE: +64(4)499-2267 It is truth which you cannot contradict; you can without any difficulty contradict Socrates. - Plato -------------------------------------------------------------------------
Attachment
pgsql-performance by date: