Re: About b-tree usage - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | Re: About b-tree usage |
Date | |
Msg-id | 1110360219.4089.483.camel@jeff Whole thread Raw |
In response to | Re: About b-tree usage (Ioannis Theoharis <theohari@ics.forth.gr>) |
List | pgsql-hackers |
On Tue, 2005-03-08 at 15:30 +0200, Ioannis Theoharis wrote: > > let me, i have turned enable_seqscan to off, in order to discourage > optimizer to choose seq_scan whenever an idex_scan can be used. > > But in this case, why optimizer don't chooses seq_scan (discourage is > different than prevent) ? > As Michael Paesold pointed out, enable_seqscan=off effectively means that it will never choose a seqscan when there exists an alternative. > At many cases i need only a small fragment of raws to be retrieved. But > this extreme case is a real-scenario (not the most frequent but real). > Returning all the rows in a large table is a real scenario? I would expect you to at least use a cursor. Keep in mind that libpq has to store all those rows in local client memory, so a cursor is a good idea if that is the case. And regardless, if you are returning all the rows in a table, the absolute fastest way possible is a seq scan. An index is much slower because it does way more work, and the disk accesses are random rather than sequential. > I try to find a way to achieve good performence even for the extreme > case. Is there any way? > The extreme case you provided is a SELECT without a WHERE. We already know that PostgreSQL is executing that as efficiently as possible when enable_seqscan=on. Why not send me a simple script that generates some data similar to what you want to access, and then the query you'd like to perform on that data. We might find that PostgreSQL is already executing the query as efficiently as possible; in fact I suspect that's the case (although perhaps some tuning would help). In short, we shouldn't try to use indexes for all situations; they are inherently worse for some situations. That's why PostgreSQL has both seqscans and indexes, and an intelligent planner. > ps. In bibliografy, there is a different alternative for indices. except > th simple approach of <attr_val, rid> is the alternative <attr_val, set > of rids>. The second means the attaches to each discrete attr_val the set > o rid's of all raws with same attr_val. Is this alternative taken into > account in postgres? > I don't entirely understand what you're asking. It seems like you're talking about a relatively minor implementation issue, and if it does make a difference, it would probably not be very much. Perhaps rephrase your question? > > On Mon, 7 Mar 2005, Jeff Davis wrote: > > > > > In that case, sequential scan is faster, but perhaps the planner doesn't > > know that ahead of time. Try turning on more statistics if you haven't > > already, and then run ANALYZE again. If the planner sees a range, > > perhaps it assumes that it is a highly selective range, when in fact, it > > consists of all of the tuples. Also, make sure enable_seqscan is true > > (in case you turned it off for testing or something and forgot). > > > > A seqscan is usually faster when a large proportion of the tuples are > > returned because: > > (1) It uses sequential I/O; whereas an index might access tuples in a > > random order. > > (2) It doesn't have to read the index's disk pages at all. > > > > I suspect you don't need to return all the tuples in the table. If you > > include the details of a real scenario perhaps the people on the list > > could be more helpful. > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly
pgsql-hackers by date: