Thread: Problem with Bitmap Heap Scan
Simple select give wrong result when it uses the Bitmap Heap Scan path. <br /><br /><br /> postgres=# CREATE OR REPLACE FUNCTIONmy_exec_im_test_func(i integer) RETURNS integer AS $$ <br /> BEGIN <br /> RETURN i + 1;<br /> END;<br />$$ LANGUAGE plpgsql;<br />CREATE FUNCTION<br /><br />postgres=# setenable_seqscan=off;<br />SET<br />postgres=# set enable_indexscan=off;<br />SET<br />postgres=# select proname from pg_procwhere proname like 'my_pro1';<br /> proname <br />----------------------<br /> my_exec_im_test_proc<br/>(1 row)<br /><br />postgres=# explain select proname from pg_proc where proname like 'my_pro1';<br/> QUERY PLAN <br /> <br />--------------------------------------------------------------------------------<br />-------------<br /> Bitmap Heap Scanon pg_proc (cost=4.26..8.27 rows=1 width=64)<br /> Recheck Cond: (proname ~~ 'my_pro1'::text)<br /> -> BitmapIndex Scan on pg_proc_proname_args_nsp_index (cost=0.00..4.26 row<br />s=1 width=0)<br /> Index Cond: ((proname>= 'my'::name) AND (proname < 'mz'::name))<br />(4 rows)<br /><br /><br /><br clear="all" /><br />-- <br />RushabhLathia<br /><br /><a href="http://www.EnterpriseDB.com">www.EnterpriseDB.com</a>
<br />Analysis:<br />========<br /><br />While debugging bitmap heap scan (BitmapHeapNext function) found that first we performthe underlying index scan and then iterate over bitmap. Now while iterating, we execute ExecQual only if tbmres->recheckis true. And for the query tbmres->recheck is false.<br /><br />But from the query it seems that weshould execute ExecQual as we having "bitmpaqual" on the BitmapHeap node (Not quite sure). And when I manually did recheck= ture, query working fine and as expected.<br /><br /><br />Regards.<br />Rushabh<br /><a href="http://www.EnterpriseDB.com">www.EnterpriseDB.com</a><br/><br /><div class="gmail_quote">On Wed, Nov 19, 2008 at 4:26PM, Rushabh Lathia <span dir="ltr"><<a href="mailto:rushabh.lathia@gmail.com">rushabh.lathia@gmail.com</a>></span>wrote:<br /><blockquote class="gmail_quote"style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Simpleselect give wrong result when it uses the Bitmap Heap Scan path. <br /><br /><br /> postgres=# CREATE OR REPLACEFUNCTION my_exec_im_test_func(i integer) RETURNS integer AS $$ <br /> BEGIN <br /> RETURN i + 1;<br /> END;<br />$$ LANGUAGE plpgsql;<br />CREATE FUNCTION<br /><br />postgres=# setenable_seqscan=off;<br />SET<br />postgres=# set enable_indexscan=off;<br />SET<br />postgres=# select proname from pg_procwhere proname like 'my_pro1';<br /> proname <br />----------------------<br /> my_exec_im_test_proc<br/>(1 row)<br /><br />postgres=# explain select proname from pg_proc where proname like 'my_pro1';<br/> QUERY PLAN <br /> <br />--------------------------------------------------------------------------------<br />-------------<br /> Bitmap Heap Scanon pg_proc (cost=4.26..8.27 rows=1 width=64)<br /> Recheck Cond: (proname ~~ 'my_pro1'::text)<br /> -> BitmapIndex Scan on pg_proc_proname_args_nsp_index (cost=0.00..4.26 row<br />s=1 width=0)<br /> Index Cond: ((proname>= 'my'::name) AND (proname < 'mz'::name))<br />(4 rows)<br /><font color="#888888"><br /><br /><br clear="all"/><br />-- <br />Rushabh Lathia<br /><br /><a href="http://www.EnterpriseDB.com" target="_blank">www.EnterpriseDB.com</a></font></blockquote></div><br /><br clear="all" /><br />-- <br />Rushabh Lathia<br/>
"Rushabh Lathia" <rushabh.lathia@gmail.com> writes: > Simple select give wrong result when it uses the Bitmap Heap Scan path. It's generally appropriate to mention which PG version you're working with when you report a bug. > postgres=# explain select proname from pg_proc where proname like 'my_pro1'; > QUERY > PLAN > -------------------------------------------------------------------------------- > ------------- > Bitmap Heap Scan on pg_proc (cost=4.26..8.27 rows=1 width=64) > Recheck Cond: (proname ~~ 'my_pro1'::text) > -> Bitmap Index Scan on pg_proc_proname_args_nsp_index (cost=0.00..4.26 > row > s=1 width=0) > Index Cond: ((proname >= 'my'::name) AND (proname < 'mz'::name)) > (4 rows) Hmm, the ~~ condition should get treated as a "filter" not a "recheck". I suppose I broke this somewhere ... regards, tom lane
Tom Lane wrote: <blockquote cite="mid:17828.1227120386@sss.pgh.pa.us" type="cite"><pre wrap="">"Rushabh Lathia" <a class="moz-txt-link-rfc2396E"href="mailto:rushabh.lathia@gmail.com"><rushabh.lathia@gmail.com></a> writes: </pre><blockquotetype="cite"><pre wrap="">Simple select give wrong result when it uses the Bitmap Heap Scan path. </pre></blockquote><prewrap=""> It's generally appropriate to mention which PG version you're working with when you report a bug.<tt></tt> </pre></blockquote><tt>I was on 8.4 CVS Head, Next time will sure take care of this.</tt><br /><blockquote cite="mid:17828.1227120386@sss.pgh.pa.us" type="cite"><pre wrap=""> </pre><blockquote type="cite"><prewrap="">postgres=# explain select proname from pg_proc where proname like 'my_pro1'; QUERY PLAN </pre></blockquote><pre wrap=""> </pre><blockquote type="cite"><pre wrap="">-------------------------------------------------------------------------------- -------------Bitmap Heap Scan on pg_proc (cost=4.26..8.27 rows=1 width=64) Recheck Cond: (proname ~~ 'my_pro1'::text) -> Bitmap Index Scan on pg_proc_proname_args_nsp_index (cost=0.00..4.26 row s=1 width=0) Index Cond: ((proname >= 'my'::name) AND (proname < 'mz'::name)) (4 rows) </pre></blockquote><pre wrap=""> Hmm, the ~~ condition should get treated as a "filter" not a "recheck". I suppose I broke this somewhere ... </pre></blockquote><tt>Oh ok. </tt><br /><blockquote cite="mid:17828.1227120386@sss.pgh.pa.us"type="cite"><pre wrap=""> regards, tom lane </pre></blockquote><br />
Tom Lane wrote: > "Rushabh Lathia" <rushabh.lathia@gmail.com> writes: >> Simple select give wrong result when it uses the Bitmap Heap Scan path. > > It's generally appropriate to mention which PG version you're working > with when you report a bug. > >> postgres=# explain select proname from pg_proc where proname like 'my_pro1'; >> QUERY >> PLAN > >> -------------------------------------------------------------------------------- >> ------------- >> Bitmap Heap Scan on pg_proc (cost=4.26..8.27 rows=1 width=64) >> Recheck Cond: (proname ~~ 'my_pro1'::text) >> -> Bitmap Index Scan on pg_proc_proname_args_nsp_index (cost=0.00..4.26 >> row >> s=1 width=0) >> Index Cond: ((proname >= 'my'::name) AND (proname < 'mz'::name)) >> (4 rows) > > Hmm, the ~~ condition should get treated as a "filter" not a "recheck". > I suppose I broke this somewhere ... I started to look at this last night. The culprit seems to be this patch: > Author: Tom Lane <tgl@sss.pgh.pa.us> > Date: Sun Apr 13 19:18:14 2008 +0000 > > Phase 2 of project to make index operator lossiness be determined at runtime > instead of plan time. Extend the amgettuple API so that the index AM returns > a boolean indicating whether the indexquals need to be rechecked, and make > that rechecking happen in nodeIndexscan.c (currently the only place where > it's expected to be needed; other callers of index_getnext are just erroring > out for now). For the moment, GIN and GIST have stub logic that just always > sets the recheck flag to TRUE --- I'm hoping to get Teodor to handle pushing > that control down to the opclass consistent() functions. The planner no > longer pays any attention to amopreqcheck, and that catalog column will go > away in due course. and the changes around create_bitmap_scan_plan in particular. create_bitmap_subplan puts the original ~~ qual into the recheck condition, even though the indexqual is only ((proname >= 'my'::name) AND (proname < 'mz'::name)). So, the condition put into the recheck condition is stronger than the checked by the index. create_bitmap_scan_plan puts all index clauses that are not in the the Recheck condition into the Filter. If the condition in the recheck condition is stronger than the condition normally checked by the index, that's wrong. Attached is a patch that changes create_bitmap_subplan so that the condition put into Recheck condition is never stronger than the condition automatically handled by the index. Does that look right to you? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com *** src/backend/optimizer/plan/createplan.c --- src/backend/optimizer/plan/createplan.c *************** *** 1197,1202 **** create_bitmap_subplan(PlannerInfo *root, Path *bitmapqual, --- 1197,1203 ---- IndexPath *ipath = (IndexPath *) bitmapqual; IndexScan *iscan; ListCell *l; + List *scan_clauses; /* Use the regular indexscan plan build machinery... */ iscan = create_indexscan_plan(root, ipath, NIL, NIL); *************** *** 1210,1216 **** create_bitmap_subplan(PlannerInfo *root, Path *bitmapqual, plan->plan_rows = clamp_row_est(ipath->indexselectivity * ipath->path.parent->tuples); plan->plan_width = 0; /* meaningless */ ! *qual = get_actual_clauses(ipath->indexclauses); foreach(l, ipath->indexinfo->indpred) { Expr *pred = (Expr *) lfirst(l); --- 1211,1233 ---- plan->plan_rows = clamp_row_est(ipath->indexselectivity * ipath->path.parent->tuples); plan->plan_width = 0; /* meaningless */ ! ! /* ! * Put those indexquals that are automatically handled by the index to ! * the Recheck condition. Don't include clauses that are derived from, ! * but not directly included in the original scan quals. The original ! * clause they're derived from need to be checked anyway in the Filter, ! * even for non-lossy bitmaps. ! */ ! scan_clauses = get_actual_clauses(ipath->indexclauses); ! *qual = NIL; ! foreach(l, iscan->indexqualorig) ! { ! Expr *q = (Expr *) lfirst(l); ! if (list_member(scan_clauses, q)) ! *qual = lappend(*qual, q); ! } ! foreach(l, ipath->indexinfo->indpred) { Expr *pred = (Expr *) lfirst(l);
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > Tom Lane wrote: >> Hmm, the ~~ condition should get treated as a "filter" not a "recheck". >> I suppose I broke this somewhere ... > I started to look at this last night. The culprit seems to be this patch: Yeah, it appears that I oversimplified matters in that patch. I think that I mistakenly decided that create_bitmap_subplan() only needed one output parameter because there was no longer a need to distinguish between lossy and nonlossy operators. What I forgot was that the "nonlossy" output was also coming from the "indexquals" rather than the "indexclauses", and that's different precisely in the case where we've got a special index operator such as LIKE. I think what probably has to happen is revert most of that simplification and have create_bitmap_subplan return suitable representations of both the indexquals and indexclauses. > Attached is a patch that changes create_bitmap_subplan so that the > condition put into Recheck condition is never stronger than the > condition automatically handled by the index. Does that look right to you? I think this is still too simplistic, but will look closer. One point is that it's not accounting for the bitmap AND/OR structure that might be above the individual indexscans. The original coding avoided that problem by making all the decisions at the top level, and I'm inclined to stick with that approach. regards, tom lane
I wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: >> Attached is a patch that changes create_bitmap_subplan so that the >> condition put into Recheck condition is never stronger than the >> condition automatically handled by the index. Does that look right to you? > I think this is still too simplistic, but will look closer. One point > is that it's not accounting for the bitmap AND/OR structure that might > be above the individual indexscans. The original coding avoided that > problem by making all the decisions at the top level, and I'm inclined > to stick with that approach. I've fixed this by reverting create_bitmap_subplan to its previous behavior with two output lists. The scheme you suggested is a bit logically cleaner, but aside from the issue of AND/OR conditions it has one unpleasant feature: the RECHECK condition would get copies of derived clauses. For instance "col LIKE 'foo%'" would end up with a plan like Recheck: col >= 'foo' AND col < 'fop'Filter: col ~~ 'foo%' Index Cond: col >= 'foo' AND col < 'fop' The tests in create_bitmap_scan_plan aren't smart enough to recognize that the recheck conditions are redundant given the filter condition. The former and now-restored behavior avoids this problem, though it has the assumption that every indexclause condition came from scan_clauses (or has been put into bitmapqualorig in the join case), else it might fail to enforce special operators. That's certainly true at the moment though it seems a bit ugly to assume it here. It might be a good idea to try to refactor the representation of special/derived quals to make this stuff a bit more straightforward. I don't care to tackle that now though. regards, tom lane