Re: Optimizing further - Mailing list pgsql-general

From:           Curt Sampson
Subject:        Re: Optimizing further
Msg-id:         Pine.NEB.4.43.0206141451280.664-100000@angelic.cynic.net
In response to: Optimizing further (Ken Williams <ken@mathforum.org>)
List:           pgsql-general
On Fri, 14 Jun 2002, Ken Williams wrote:

> select price, date FROM foo
> WHERE date < '2000-07-01 00:29:00+10' and code='UCL'
> ORDER by date DESC, stock DESC
> LIMIT 100;
>
> I've got over 7 million rows in the table, and a btree
> index(date, code). According to "explain", the query will be
> executed as follows:
>
> Limit (cost=0.00..393.45 rows=100 width=32)
>   -> Index Scan Backward using foo_date_code on trades
>      (cost=0.00..12309.13 rows=3128 width=32)
>
> When I run the query, it takes a really long time (10 seconds)
> the first time I execute it, and then returns instantly the
> second time.

My first suspicion is that, as someone else explained, the data is being
dragged (very slowly) off the disk by the first query and then read from
the cache by the second. So it's disk I/O that's your problem. You can do
a "set show_query_stats = true", run the query, and check the log; that
will tell you how many I/O operations you're doing.

So let's see how we might process this query. (Experts should correct me
if I go wrong here.) First we find the last occurrence of the given date
in the index, doing perhaps 2-4 reads (depending on the depth of the
btree). Then we scan backwards from that point, looking at the code value
in the index and collecting the entries that have a code of UCL. This
should be pretty quick: even though we're probably not using OS
read-ahead, the blocks are going to be darn close together, and the index
entries are all clustered together in a few blocks. The planner estimates
3128 rows read, so let's say that's 300 or so blocks. (It's probably
less.)

Once we've got a hundred of these babies, we have to go and fetch their
tuples, which could be up to 100 reads if they're scattered randomly
about the table. Then we sort the whole lot, and we're done.

So one would think you'd see only about 400 blocks read for the query,
300 of which are very close together. On modern disks this should take
only a couple of seconds.
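To make the stats suggestion above concrete, here's a minimal psql session sketch. This assumes a 7.x-era server where the show_query_stats setting exists; the setting name and the exact log format vary between PostgreSQL versions, so treat the comments as illustrative, not authoritative:

```
-- Write per-query execution statistics (including block I/O counts)
-- to the server log; nothing extra is returned to the client.
SET show_query_stats = true;

-- Re-run the slow query, then inspect the server log for the
-- block-read vs. buffer-hit counts it produces.
SELECT price, date FROM foo
 WHERE date < '2000-07-01 00:29:00+10' AND code = 'UCL'
 ORDER BY date DESC, stock DESC
 LIMIT 100;
```

Comparing the logged block-read count against the ~400 blocks estimated above should show quickly whether the executor is reading far more than it ought to.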
You don't happen to have this on a really old, slow disk, do you? I
suspect not. Maybe it's reading a lot more than 400 blocks? Perhaps the
optimizer doesn't realize it can get the value of code from the index,
and is instead reading the tuple for every item in the index it scans.
If that's the case, maybe reversing the two columns in your index
(foo_code_date instead of foo_date_code) would help?

cjs
--
Curt Sampson <cjs@cynic.net>  +81 90 7737 2974  http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC
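P.S. The reversed index suggested above would be created along these lines (the index name is just the reversal of the one from your mail; building it on 7 million rows will take a while):

```
-- With code as the leading column, all the 'UCL' entries are
-- contiguous in the index and already ordered by date within that
-- group, so the backward scan touches only the entries it needs.
CREATE INDEX foo_code_date ON foo (code, date);
```

Then run EXPLAIN on the query again to check that the planner actually picks the new index.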