Thread: Extra cost of "lossy mode" Bitmap Scan plan
Hi.

I found that the current planner doesn't take "lossy mode" on Bitmap Scan into account. I think the costs of the following two plans are equal now:

  (a) Having enough work_mem => normal Bitmap Scan.
  (b) Having less work_mem than the estimated rows => Bitmap Scan in "lossy mode", slower than (a).

So sometimes the planner makes a wrong choice. For example, on some queries the planner doesn't choose a faster Index Scan plan but a much slower Bitmap Scan (which is actually lossy).

My understanding is that we can know whether the plan is lossy or not like the following:

    int tbm_maxentries = work_mem * 1024L;

    if (estimatedNumRows < tbm_maxentries)
    {
        /* not lossy */
    }
    else
    {
        /* lossy: we may add some extra cost to the total cost */
    }

Any ideas how to do this?

Best regards,

-----
MINOWA Taro (Higepon)
Cybozu Labs, Inc.

http://www.monaos.org/
http://code.google.com/p/mosh-scheme/
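A compilable sketch of the check proposed above. All names, the cost structure, and the ~40-bytes-per-entry figure are illustrative assumptions for this thread, not PostgreSQL's actual tidbitmap or costsize code:

```c
#include <assert.h>

/* Hypothetical container for a path's running cost total. */
typedef struct
{
    double total_cost;
} CostSketch;

/*
 * Rough upper bound on exact-TID bitmap entries that fit in work_mem
 * (given in kB, as the GUC is).  The 40-byte per-entry size is a
 * made-up round figure, not the real hashtable entry size.
 */
static long
tbm_max_entries(int work_mem_kb)
{
    return (long) work_mem_kb * 1024L / 40;
}

/*
 * Add a hypothetical surcharge when the bitmap is expected to go lossy:
 * every row on a lossy page must be rechecked against the quals, so
 * charge a per-row recheck cost.  Otherwise leave the cost untouched.
 */
static void
add_lossy_penalty(CostSketch *cost, double estimated_rows,
                  int work_mem_kb, double per_row_recheck_cost)
{
    if (estimated_rows >= (double) tbm_max_entries(work_mem_kb))
        cost->total_cost += estimated_rows * per_row_recheck_cost;  /* lossy */
    /* else: bitmap stays exact, no extra cost */
}
```

With work_mem = 64 (kB), this sketch predicts roughly 1638 exact entries; an estimate of 200,000 rows would therefore be charged the lossy surcharge, while a small estimate would not.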
higepon <higepon@gmail.com> wrote:
> I found the current planner doesn't care about "lossy mode" on Bitmap Scan.

Good point. I saw this bad behavior on the DBT-3 (TPC-H) benchmark before. A lossless bitmap scan was faster than a Seq Scan, but a lossy bitmap scan was slower than a Seq Scan:

EXPLAIN ANALYZE SELECT * FROM test WHERE v < 0.2;

-- default
Bitmap Heap Scan on test  (cost=3948.42..11005.77 rows=210588 width=8) (actual time=47.550..202.925 rows=200142)

-- SET work_mem = 64  (NOTICE: the cost is the same as above!)
Bitmap Heap Scan on test  (cost=3948.42..11005.77 rows=210588 width=8) (actual time=52.057..358.145 rows=200142)

-- SET enable_bitmapscan = off
Seq Scan on test  (cost=0.00..16924.70 rows=210588 width=8) (actual time=0.182..280.450 rows=200142)

> My understanding is that we can know whether the plan is lossy or not
> like following.

Sure, we need it! I would also like some way to determine whether a bitmap scan was lossy or not, and how much work_mem is required to do a lossless bitmap scan. For example, a new GUC variable trace_bitmapscan that prints information about bitmap scans, like trace_sort does for sorting.

Regards,

---
ITAGAKI Takahiro
NTT Open Source Software Center
Hi.

> Good point. I saw the bad behavior on DBT-3 (TPC-H) benchmark before.
> Loss-less bitmap scan was faster than seq Scan,
> but lossy bitmap scan was slower than seq Scan:

Thank you.

> Sure, we need it! Also, I hope some methods to determine whether the
> bitmap scan was lossy or not, and how amount of work_mem is required to do
> loss-less bitmap scan. For example, a new GUC variable trace_bitmapscan to
> print the information of bitmap scan, like trace_sort for sorting.

I think it is not hard to implement these methods and the trace function. The problem that remains to be solved is: "How much extra cost should we add for lossy mode?". Any ideas?

Best regards,

-----
MINOWA Taro (Higepon)
Cybozu Labs, Inc.

http://www.monaos.org/
http://code.google.com/p/mosh-scheme/
On Tue, Apr 28, 2009 at 7:45 AM, higepon <higepon@gmail.com> wrote:
> "How much extra cost should we add for lossy mode?".

There is something odd in this concern. Normally people aren't raising and lowering their work_mem, so the comparison would be between a plan where the planner expects to see n records and a plan where the planner expects to see n+1 records, where n would fit and n+1 wouldn't.

It seems like an awfully narrow corner case where n records would be faster as a bitmap index scan but n+1 records would be faster as an index scan because the bitmap becomes lossy. The whole point of bitmap scans is that they're faster than index scans for large scans.

If the logic you're suggesting kicked in at all, it would be for a narrow range of scan sizes, so the net effect would be to use an index scan for small scans, then switch to a bitmap scan, then switch back to an index scan when the bitmap scan becomes lossy, then switch back to a lossy bitmap scan for large scans. I'm thinking that even if it's slightly faster when the planner has perfect inputs, the downsides of switching back and forth might not be worth it, especially when you consider that the planner is often going on approximate estimates and is probably not switching in precisely the right spot.

--
greg
Hi.

> There is something odd in this concern. Normally people aren't raising
> and lowering their work_mem so the comparison would be between a plan
> where the planner expects to see n records and a plan where the
> planner expects to see n+1 records where n would fit and n+1 wouldn't.

I see.

> It seems like an awfully narrow corner case where n records would be
> faster as a bitmap index scan but n+1 records would be faster as an
> index scan because the bitmap becomes lossy. The whole point of bitmap
> scans is that they're faster for large scans than index scans.

Hmm. Is this really a narrow corner case? At least ITAGAKI-san and I have both run into it. I think the default work_mem value (1MB) may cause these issues.

Do you think there's no need for the planner to know whether the plan is lossy or not? If the planner knew the cost of lossy mode, it might choose better plans than it does now.

A second option is to give some hints to people who are doing performance tuning:

  (a) Write a trace log entry when a bitmap scan falls into "lossy" mode.
  (b) Show "lossy" or not in EXPLAIN results.

Best regards,

-----
MINOWA Taro (Higepon)
Cybozu Labs, Inc.

http://www.monaos.org/
http://code.google.com/p/mosh-scheme/

On Tue, Apr 28, 2009 at 4:51 PM, Greg Stark <stark@enterprisedb.com> wrote:
> On Tue, Apr 28, 2009 at 7:45 AM, higepon <higepon@gmail.com> wrote:
>> "How much extra cost should we add for lossy mode?".
>
> There is something odd in this concern. Normally people aren't raising
> and lowering their work_mem so the comparison would be between a plan
> where the planner expects to see n records and a plan where the
> planner expects to see n+1 records where n would fit and n+1 wouldn't.
>
> It seems like an awfully narrow corner case where n records would be
> faster as a bitmap index scan but n+1 records would be faster as an
> index scan because the bitmap becomes lossy. The whole point of bitmap
> scans is that they're faster for large scans than index scans.
>
> If the logic you're suggesting would kick in at all it would be for a
> narrow range of scan sizes, so the net effect would be to use an index
> scan for small scans, then switch to a bitmap scan, then switch back
> to an index scan when the bitmap scan becomes lossy, then switch back
> to a lossy bitmap scan for large scans. I'm thinking that even if it's
> slightly faster when the planner has perfect inputs the downsides of
> switching back and forth might not be worth it. Especially when you
> consider than the planner is often going on approximate estimates and
> it is probably not switching in precisely the right spot.
>
> --
> greg
On Tue, Apr 28, 2009 at 3:51 AM, Greg Stark <stark@enterprisedb.com> wrote:
> On Tue, Apr 28, 2009 at 7:45 AM, higepon <higepon@gmail.com> wrote:
>> "How much extra cost should we add for lossy mode?".
>
> There is something odd in this concern. Normally people aren't raising
> and lowering their work_mem so the comparison would be between a plan
> where the planner expects to see n records and a plan where the
> planner expects to see n+1 records where n would fit and n+1 wouldn't.
>
> It seems like an awfully narrow corner case where n records would be
> faster as a bitmap index scan but n+1 records would be faster as an
> index scan because the bitmap becomes lossy. The whole point of bitmap
> scans is that they're faster for large scans than index scans.
>
> If the logic you're suggesting would kick in at all it would be for a
> narrow range of scan sizes, so the net effect would be to use an index
> scan for small scans, then switch to a bitmap scan, then switch back
> to an index scan when the bitmap scan becomes lossy, then switch back
> to a lossy bitmap scan for large scans. I'm thinking that even if it's
> slightly faster when the planner has perfect inputs the downsides of
> switching back and forth might not be worth it. Especially when you
> consider than the planner is often going on approximate estimates and
> it is probably not switching in precisely the right spot.

You may be right, but on the other hand, I'm not sure there's any sense in NOT trying to model the impact of the additional heap fetches. Has anyone done a detailed analysis of the actual performance characteristics of bitmap scans versus index scans, as opposed to what the optimizer thinks? We've had long discussions on this topic in relation to both the posix_fadvise patch and the gin fast update patch, but I haven't seen a lot of hard numbers.

...Robert
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Apr 28, 2009 at 3:51 AM, Greg Stark <stark@enterprisedb.com> wrote:
>> If the logic you're suggesting would kick in at all it would be for a
>> narrow range of scan sizes,

> You may be right, but on the other hand, I'm not sure there's any
> sense in NOT trying to model the impact of the additional heap
> fetches.

I think it's probably useless. In the first place, at reasonable values of work_mem the effect is going to be negligible (in the sense that a plain indexscan would never win). In the second place, there isn't any way to guess the extent of lossiness at plan time --- it depends on how much the target rows are "clumped" on particular pages. The planner hasn't got any stats that would let it guess that, and even if we tried to collect such stats they'd probably be too unstable to be useful.

There are boatloads of effects that the planner doesn't model. This one seems very far down the list of what we should worry about.

			regards, tom lane
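The "clumping" point above can be illustrated with a toy calculation. An exact tid bitmap needs roughly one hashtable entry per distinct heap page touched, so the same row-count estimate can imply wildly different memory needs depending on how clustered the matches are. The function name and the uniform-clump model are assumptions for illustration, not backend code:

```c
#include <assert.h>

/*
 * Toy model: rows that share a heap page share one per-page bitmap
 * entry, so the number of entries an exact bitmap needs is driven by
 * how many distinct pages the matching rows touch, not by how many
 * rows match.  rows_per_page is the (unknown-to-the-planner) degree
 * of clumping, assumed uniform here for simplicity.
 */
static long
exact_bitmap_entries_needed(long matching_rows, long rows_per_page)
{
    /* ceiling division: partially filled last clump still costs an entry */
    return (matching_rows + rows_per_page - 1) / rows_per_page;
}
```

For 200,000 matching rows, perfect clustering at 100 rows per page needs only 2,000 entries, while fully scattered rows need 200,000 entries, a 100x difference in memory for the same row estimate, which is why plan-time stats can't predict lossiness here.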
On Tue, Apr 28, 2009 at 3:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> You may be right, but on the other hand, I'm not sure there's any
> sense in NOT trying to model the impact of the additional heap
> fetches.

Yeah, the flip side of the argument is that we generally try to do the best job we can modeling costs and let the arithmetic work out however it does, because you never know what kind of wacky situations will arise when planning queries, and the better the estimates, the better your chance of coming up with a good plan.

For example, the planner may have other join orders which allow it to avoid accessing those records entirely. So the comparison with a nested loop might not be the only comparison that matters. It might be a case of whether to run a bitmap scan against this table or some scan against another table to drive the join.

I have been running benchmarks comparing bitmap heap scans against index scans, amongst other comparisons. I haven't done CVS head yet, but on an older version, with effective_io_concurrency set to 0, scanning 1000 random tuples throughout a 130G table (one searched tuple per page) on a machine with 64G of RAM, after repeated executions index scans settle down to about 245s vs 205s for bitmap scans (for 100 iterations). So bitmap scans are about 16% faster for this use case.

Incidentally, with effective_io_concurrency set to 30 on this 30-drive raid, the bitmap scans go down to 17s :)

--
greg
Hi. I apologize for this late reply.

Tom Lane wrote:
> I think it's probably useless. In the first place, at reasonable values
> of work_mem the effect is going to be negligible (in the sense that a
> plain indexscan would never win). In the second place, there isn't any
> way to guess the extent of lossiness at plan time --- it depends on how
> much the target rows are "clumped" on particular pages. The planner
> hasn't got any stats that would let it guess that, and even if we tried
> to collect such stats they'd probably be too unstable to be useful.

Thank you. I understand your two points.

But in this context, I can't understand why a Bitmap Scan is faster than an Index Scan. If there are many WHERE conditions, like the following, I can understand why:

    select * from emp where emp_no > 10000 and sex = 1 and salary > 5000;

Bitmap Scan merges the result bitmaps and reads the tuples after the merge. But I don't understand this case:

    select * from emp where emp_no > 10000;

Is a Bitmap Scan faster than an Index Scan in this case? If so, why? My understanding is that both Index Scan and Bitmap Scan scan the index on emp_no, so they are almost the same cost. In addition, Bitmap Scan creates a bitmap, so shouldn't Bitmap Scan be a little bit slower?

Best regards,

-----
MINOWA Taro (Higepon)
Cybozu Labs, Inc.

http://www.monaos.org/
http://code.google.com/p/mosh-scheme/

On Tue, Apr 28, 2009 at 11:36 PM, Greg Stark <stark@enterprisedb.com> wrote:
> On Tue, Apr 28, 2009 at 3:02 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> You may be right, but on the other hand, I'm not sure there's any
>> sense in NOT trying to model the impact of the additional heap
>> fetches.
>
> Yeah, the flip side of the argument is that we generally try to do the
> best job we can modeling costs and let the arithmetic work out how it
> however it does because you never know what kind of wacky situations
> will arise planning queries and and the better the estimates the
> better your chance of coming up with a good plan.
>
> For example the planner may have other join orders which allow it to
> avoid accessing those records entirely. So the comparison with a
> nested loop might not be the only comparison that matters. It might be
> a case of whether to run a bitmap scan against this table or some scan
> against another table to drive the join.
>
> I have been running benchmarks comparing bitmap heap scans against
> index scans amongst other comparisons. I haven't done CVS head yet but
> on an older version I'm seeing with effective_io_concurrency set to 0
> scanning 1000 random tuples throughout a 130G table (one searched
> tuple per page) on a machine with 64G of ram after repeated executions
> index scans settle down to about 245s vs 205s for bitmap scans (for
> 100 iterations). So they're about 16% faster for this use case.
>
> Incidentally with effective_io_concurrency set to 30 on this 30-drive
> raid the bitmap scans go down to 17s :)
>
> --
> greg
higepon <higepon@gmail.com> writes:
> But I don't understand this case.
> select * from emp where emp_no > 10000;
> Is Bitmap Scan is faster than Index Scan in this case ?

Yes, very probably, if a lot of tuples are being retrieved. A bitmap scan will fetch the tuples from the heap in a more or less optimal fashion --- for instance, each page is read only once. Index scan will result in a random sequence of accesses to the heap. (Of course, it might have some order if the index is well correlated with the heap order, but most of the time that's not true.)

			regards, tom lane
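The difference Tom describes can be sketched with a toy model of heap page fetches. An index scan visits pages in index order, going back to a page each time the scan returns to it, while a bitmap scan collects all matching pages first and then reads each distinct page once. Plain int arrays stand in for TIDs here; this is an illustration (and it ignores buffer caching), not executor code:

```c
#include <assert.h>

#define MAX_PAGE 1000   /* toy limit on heap page numbers */

/*
 * Index scan: tuples arrive in index order, so the heap page changes
 * back and forth; count a fetch each time the current page differs
 * from the previous one (a fresh read absent caching).
 */
static int
index_scan_fetches(const int *pages, int n)
{
    int fetches = 0;
    for (int i = 0; i < n; i++)
        if (i == 0 || pages[i] != pages[i - 1])
            fetches++;
    return fetches;
}

/*
 * Bitmap scan: all matching pages are gathered into the bitmap first,
 * so each distinct page is fetched exactly once, in page order.
 */
static int
bitmap_scan_fetches(const int *pages, int n)
{
    char seen[MAX_PAGE] = {0};  /* which pages the bitmap marked */
    int fetches = 0;
    for (int i = 0; i < n; i++)
        if (!seen[pages[i]])
        {
            seen[pages[i]] = 1;
            fetches++;
        }
    return fetches;
}
```

For a tuple sequence hitting pages 5, 1, 5, 2, 5, 3 (a poorly correlated index), the index scan model pays 6 fetches but the bitmap scan pays only 4; with a well-correlated index (e.g. pages 1, 1, 2, 2) the two converge, matching Tom's parenthetical caveat.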
Hi.

> fashion --- for instance, each page is read only once. Index scan will
> result in a random sequence of accesses to the heap. (Of course, it
> might have some order if the index is well correlated with the heap
> order, but most of the time that's not true.)

Thank you. I finally understand the difference.

Cheers.