Thread: annoying query/planner choice
I have a situation that is giving me small fits, and would like to see if anyone can shed any light on it.

I have a modest table (~1.4 million rows, and growing) that has a variety of queries run against it. One is very straightforward: pull a set of distinct rows out based on two columns, with a simple where clause on one of the indexed columns. For illustration here I've removed the distinct and order-by clauses, as they are not the culprits.

Before I go on: v7.4.1, currently on a test box, dual P3, 1G RAM, 10K SCSI, Slackware 9 or so. The table has been vacuumed and analyzed. Even offered pizza and beer. The production box will be a dual Xeon with 2G RAM and RAID 5.

When the query is run with a where clause that returns a small number of rows, the query uses the index and is quite speedy:

rav=# explain analyze select casno, parameter from hai.results where site_id = 9982;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using hai_res_siteid_ndx on results  (cost=0.00..7720.87 rows=2394 width=30) (actual time=12.118..12.933 rows=50 loops=1)
   Index Cond: (site_id = 9982)
 Total runtime: 13.145 ms

When a query is run that returns a much larger set, the index is not used, I assume because the planner thinks that a sequential scan would work just as well with a large result set:

rav=# explain analyze select casno, parameter from hai.results where site_id = 18;
                                                       QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on results  (cost=0.00..73396.39 rows=211205 width=30) (actual time=619.020..15012.807 rows=186564 loops=1)
   Filter: (site_id = 18)
 Total runtime: 15279.789 ms
(3 rows)

Unfortunately, it's way off:

rav=# set enable_seqscan=off;
SET
rav=# explain analyze select casno, parameter from hai.results where site_id = 18;
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using hai_res_siteid_ndx on results  (cost=0.00..678587.01 rows=211205 width=30) (actual time=9.575..3569.387 rows=186564 loops=1)
   Index Cond: (site_id = 18)
 Total runtime: 3872.292 ms
(3 rows)

I would like, of course, for it to use the index, given that it takes 20-25% of the time. Fiddling with CPU_TUPLE_COST doesn't do anything until I exceed 0.5, which strikes me as a bit high (though please correct me if I am assuming too much...). RANDOM_PAGE_COST seems to have no effect. I suppose I could cluster it, but it is constantly being added to, and the clustering would have to be re-done on a daily basis (if not more).

Any suggestions?

--------------------
Andrew Rawnsley
President
The Ravensfield Digital Resource Group, Ltd.
(740) 587-0114
www.ravensfield.com
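[Editor's note: the cost settings mentioned above can be experimented with per session before touching postgresql.conf. A minimal sketch; the values are illustrative guesses, not recommendations:]

```sql
-- Session-local experiments with the planner cost constants discussed
-- above (7.4 defaults: random_page_cost = 4, cpu_tuple_cost = 0.01).
SET random_page_cost = 2.0;   -- lower values make index scans look cheaper
SET cpu_tuple_cost = 0.05;    -- higher values make seq scans look costlier

EXPLAIN ANALYZE SELECT casno, parameter FROM hai.results WHERE site_id = 18;

RESET random_page_cost;       -- undo the experiment when done
RESET cpu_tuple_cost;
```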
On Sun, 11 Jan 2004, Andrew Rawnsley wrote:

> 20-25% of the time. Fiddling with CPU_TUPLE_COST doesn't do anything
> until I exceed 0.5, which strikes me as a bit high (though please
> correct me if I am assuming too much...). RANDOM_PAGE_COST seems to have
> no effect.

What about the effective cache size, is that set properly?

--
/Dennis Björklund
Low (1000). I'll fiddle with that. I just noticed that the machine only has 512MB of RAM in it, and not 1GB. I must have raided it for some other machine...

On Jan 11, 2004, at 10:50 PM, Dennis Bjorklund wrote:

> On Sun, 11 Jan 2004, Andrew Rawnsley wrote:
>
>> 20-25% of the time. Fiddling with CPU_TUPLE_COST doesn't do anything
>> until I exceed 0.5, which strikes me as a bit high (though please
>> correct me if I am assuming too much...). RANDOM_PAGE_COST seems to
>> have no effect.
>
> What about the effective cache size, is that set properly?
>
> --
> /Dennis Björklund

--------------------
Andrew Rawnsley
President
The Ravensfield Digital Resource Group, Ltd.
(740) 587-0114
www.ravensfield.com
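[Editor's note: in 7.4, effective_cache_size is expressed in 8 kB disk pages, so the default of 1000 only tells the planner about ~8 MB of cache. A back-of-the-envelope sketch; the 300 MB figure is an assumption about how much of the 512 MB box ends up as OS file cache:]

```sql
-- Rough sizing sketch, assuming ~300 MB of the machine's 512 MB ends up
-- as kernel file cache:  300 MB / 8 kB per page = ~38400 pages.
SET effective_cache_size = 38400;  -- or set it in postgresql.conf
```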
Centuries ago, Nostradamus foresaw when ronz@ravensfield.com (Andrew Rawnsley) would write:

> I would like, of course, for it to use the index, given that it
> takes 20-25% of the time. Fiddling with CPU_TUPLE_COST doesn't do
> anything until I exceed 0.5, which strikes me as a bit high (though
> please correct me if I am assuming too much...). RANDOM_PAGE_COST
> seems to have no effect. I suppose I could cluster it, but it is
> constantly being added to, and would have to be re-done on a daily
> basis (if not more).
>
> Any suggestions?

The apparent problem is a bad query plan, and turning to clustering to "fix" it seems a disturbing answer.

A problem I saw last week with some query plans pointed to the issue that the statistics were inadequate. We had some queries where indexing on "customer" is extremely worthwhile in nearly all cases, but it often wasn't happening. The problem was that the 10 "bins" in the default stats table would collect stats about a few _highly_ active customers, and pretty much ignore the less active ones. Because the "bins" were highly dominated by the few common values, stats for the others were missing and pretty useless.

I upped the size of the histogram from 10 to 100, and that allowed stats to be kept for less active customers, GREATLY improving the quality of the queries.

The point that falls out is that if you have a column which has a bunch of discrete values (rather more than 10) that aren't near-unique (e.g. on a table with a million transactions, you have only a few hundred customers), that's a good candidate for upping column stats.

Thus, you might try:

  ALTER TABLE MY_TABLE ALTER COLUMN SOME_COLUMN SET STATISTICS 50;
  ANALYZE MY_TABLE;

--
let name="cbbrowne" and tld="ntlug.org" in name ^ "@" ^ tld;;
http://www.ntlug.org/~cbbrowne/postgresql.html
"There's no longer a boycott of Apple. But MacOS is still a proprietary
OS." -- RMS - June 13, 1998
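[Editor's note: applied to the table from this thread, the suggestion above would look like the following; the target of 100 mirrors the histogram size that worked in the customer example, and may need tuning:]

```sql
-- Keep more histogram entries for site_id so less-common sites get
-- usable row-count estimates, then re-gather the statistics:
ALTER TABLE hai.results ALTER COLUMN site_id SET STATISTICS 100;
ANALYZE hai.results;
```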
Andrew Rawnsley <ronz@ravensfield.com> writes:
> I have a situation that is giving me small fits, and would like to see
> if anyone can shed any light on it.

In general, pulling 10% of a table *should* be faster as a seqscan than an indexscan, except under the most extreme assumptions about clustering (is the table clustered on site_id, by any chance?). What I suspect is that the table is a bit larger than your available RAM, so that a seqscan ends up flushing all of the kernel's cache and forcing a lot of I/O, whereas an indexscan avoids the cache flush by not touching (quite) all of the table. The trouble with this is that the index only looks that good under test conditions, i.e., when you repeat it just after an identical query that pulled all of the needed pages into RAM. Under realistic load conditions where different site_ids are being hit, the indexscan is not going to be as good as you think, because it will incur substantial I/O.

You should try setting up a realistic test load hitting different random site_ids, and see whether it's really a win to force seqscan off for this query or not.

			regards, tom lane
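[Editor's note: one quick way to build such a test load is to generate a batch of queries against random site_ids and replay it with psql. A minimal sketch; the 1..20000 id range and the database name `rav` are assumptions based on the examples earlier in the thread:]

```shell
#!/bin/bash
# Generate 100 queries against random site_ids, so repeated runs don't
# simply re-hit pages cached by the previous identical query.
for i in $(seq 1 100); do
  echo "SELECT casno, parameter FROM hai.results WHERE site_id = $((RANDOM % 20000 + 1));"
done > bench.sql

# Replay against the test box, timing each statement:
#   psql -d rav -c '\timing' -f bench.sql
```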
Probably my best solution is to find a better way to produce the information, or cache it on the application side, as it doesn't actually change that much across client sessions. Clustering occurred to me; it would have to be done on a frequent basis, as the contents of the table change constantly. What I am getting out of it with this operation doesn't change much, so caching in a separate table, in the application layer, or both would probably shortcut the whole problem.

Always amazing what occurs to you when you sleep on it... if only I could take a good nap in the middle of the afternoon I would have no problems at all.

On Jan 12, 2004, at 12:40 AM, Tom Lane wrote:

> Andrew Rawnsley <ronz@ravensfield.com> writes:
>> I have a situation that is giving me small fits, and would like to see
>> if anyone can shed any light on it.
>
> In general, pulling 10% of a table *should* be faster as a seqscan than
> an indexscan, except under the most extreme assumptions about clustering
> (is the table clustered on site_id, by any chance?). What I suspect is
> that the table is a bit larger than your available RAM, so that a
> seqscan ends up flushing all of the kernel's cache and forcing a lot of
> I/O, whereas an indexscan avoids the cache flush by not touching (quite)
> all of the table. The trouble with this is that the index only looks
> that good under test conditions, ie, when you repeat it just after an
> identical query that pulled all of the needed pages into RAM. Under
> realistic load conditions where different site_ids are being hit, the
> indexscan is not going to be as good as you think, because it will incur
> substantial I/O.
>
> You should try setting up a realistic test load hitting different random
> site_ids, and see whether it's really a win to force seqscan off for
> this query or not.
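[Editor's note: a sketch of the separate-table caching idea described above; the summary table and index names are hypothetical, and the refresh could be driven from cron or from the application layer:]

```sql
-- Hypothetical summary table holding the distinct (site_id, casno,
-- parameter) rows, so the expensive scan runs once per refresh instead
-- of once per client session:
CREATE TABLE hai.results_summary AS
  SELECT DISTINCT site_id, casno, parameter FROM hai.results;
CREATE INDEX res_summary_siteid_ndx ON hai.results_summary (site_id);

-- Periodic refresh (e.g. daily, or as often as staleness allows):
TRUNCATE TABLE hai.results_summary;
INSERT INTO hai.results_summary
  SELECT DISTINCT site_id, casno, parameter FROM hai.results;
```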
>
> 			regards, tom lane

--------------------
Andrew Rawnsley
President
The Ravensfield Digital Resource Group, Ltd.
(740) 587-0114
www.ravensfield.com