Re: Bitmap scans vs. the statistics views - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Bitmap scans vs. the statistics views |
Date | |
Msg-id | 12175.1114198203@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Bitmap scans vs. the statistics views (Josh Berkus <josh@agliodbs.com>) |
Responses |
Re: Bitmap scans vs. the statistics views
Re: Bitmap scans vs. the statistics views |
List | pgsql-hackers |
Josh Berkus <josh@agliodbs.com> writes: > Well, technically a bitmapscan is a different operation. So it should > probably have its own columns. Unless you're talking about an overhaul of > the stats views more drastic than that? If so, what? That was basically what I was asking: do we expand all the stats support to handle this as a separate path, and if so what does it look like exactly? I'm particularly unclear what to do at the level of the functions described in http://www.postgresql.org/docs/8.0/static/monitoring-stats.html#MONITORING-STATS-FUNCS-TABLE I've never fully understood the distinction the stats make between "tuples fetched" and "tuples returned", and it's even less obvious how to apply it when the index and heap operations are decoupled. In particular the function design assumes that heap tuple fetches can be tied to particular indexes, which is now a broken assumption. You might be amused by this test case I just finished debugging: regression=# explain analyze select * from tenk1, int4_tbl where unique1 in (40,50,f1) and unique2 >= 3999 and unique2 <9999; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------Nested Loop (cost=9.07..66.56 rows=4 width=248) (actual time=16.266..77.452 rows=6 loops=1) -> Seq Scan on int4_tbl (cost=0.00..1.05rows=5 width=4) (actual time=0.055..0.105 rows=5 loops=1) -> Bitmap Heap Scan on tenk1 (cost=9.07..13.08rows=1 width=244) (actual time=15.387..15.396 rows=1 loops=5) Recheck Cond: (((tenk1.unique1 = 40)OR (tenk1.unique1 = 50) OR (tenk1.unique1 = "outer".f1)) AND (tenk1.unique2 >= 3999) AND (tenk1.unique2 < 9999)) -> BitmapAnd (cost=9.07..9.07 rows=50 width=0) (actual time=15.353..15.353 rows=0 loops=5) -> BitmapOr (cost=6.52..6.52 rows=50 width=0) (actual time=0.152..0.152 rows=0 loops=5) -> Bitmap IndexScan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.059..0.059 rows=1 loops=5) Index Cond: (unique1 = 40) -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0)(actual time=0.032..0.032 rows=1 loops=5) Index Cond: (unique1 = 50) -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.029..0.029 rows=0 loops=5) Index Cond: (tenk1.unique1 = "outer".f1) -> Bitmap Index Scan on tenk1_unique2 (cost=0.00..2.30rows=50 width=0) (actual time=15.148..15.148 rows=6000 loops=5) Index Cond: ((unique2>= 3999) AND (unique2 < 9999))Total runtime: 78.369 ms (15 rows) What exactly do we want to count here? The 6000 TIDs pulled from tenk1_unique2 don't translate into much of anything at the heap access stage. (Shortly I'm going to add some logic to not bother using very nonselective index conditions in BitAnd, but it's not there right now.) > I'm not clear on why bitmapscan doesn't bump tuples_returned. Can you > explain? Well, there was no such bump in the bits of code I cribbed to make nodeBitmapHeapScan and friends ;-). It is easy enough to add, once we have a clear idea of what we want to count, but I don't feel that I have that idea yet. regards, tom lane
pgsql-hackers by date: