Thread: Bitmap scans vs. the statistics views
I'm currently finding that the stats regression test fails with bitmapped index scans enabled: *** 62,68 **** WHERE st.relname='tenk2' AND cl.relname='tenk2'; ?column? | ?column? | ?column? | ?column? ----------+----------+----------+---------- ! t | t | t | t (1 row) SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages, --- 62,68 ---- WHERE st.relname='tenk2' AND cl.relname='tenk2'; ?column? | ?column? | ?column? | ?column? ----------+----------+----------+---------- ! t | t | f | f (1 row) SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages, The reason for this appears to be that the standard stats views disregard tuples_fetched for tables, but tuples_fetched is the only counter that's getting bumped in a bitmap scan. I could probably add some code to bump tuples_returned as well, but I wonder whether something more drastic isn't needed. The stats views seem to be designed around the assumption that there are seqscans and indexscans and nothing else. Do we need to expand the number of functions and rows to allow for a third basic scan type, or do we want to fuzz things up? Comments anyone? regards, tom lane
I would like to know if bitmap scans are being used on a table. I think it's worth adding to the stats views, though I'm not sure on the best way. For example, this means that a query on one table can now scan multiple indexes, but it doesn't seem right to lump that in with 'traditional' index scans. An extreme view would be to add new bitmap scan columns to pg_stat(io)_(tables|indexes), but that might be overkill. -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
Tom, > The reason for this appears to be that the standard stats views > disregard tuples_fetched for tables, but tuples_fetched is the only > counter that's getting bumped in a bitmap scan. > > I could probably add some code to bump tuples_returned as well, > but I wonder whether something more drastic isn't needed. The > stats views seem to be designed around the assumption that there > are seqscans and indexscans and nothing else. Do we need to > expand the number of functions and rows to allow for a third basic > scan type, or do we want to fuzz things up? > > Comments anyone? Well, technically a bitmapscan is a different operation. So it should probably have its own columns. Unless you're talking about an overhaul of the stats views more drastic than that? If so, what? I'm not clear on why bitmapscan doesn't bump tuples_returned. Can you explain? -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus <josh@agliodbs.com> writes: > Well, technically a bitmapscan is a different operation. So it should > probably have its own columns. Unless you're talking about an overhaul of > the stats views more drastic than that? If so, what? That was basically what I was asking: do we expand all the stats support to handle this as a separate path, and if so what does it look like exactly? I'm particularly unclear what to do at the level of the functions described in http://www.postgresql.org/docs/8.0/static/monitoring-stats.html#MONITORING-STATS-FUNCS-TABLE I've never fully understood the distinction the stats make between "tuples fetched" and "tuples returned", and it's even less obvious how to apply it when the index and heap operations are decoupled. In particular the function design assumes that heap tuple fetches can be tied to particular indexes, which is now a broken assumption. You might be amused by this test case I just finished debugging: regression=# explain analyze select * from tenk1, int4_tbl where unique1 in (40,50,f1) and unique2 >= 3999 and unique2 <9999; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------Nested Loop (cost=9.07..66.56 rows=4 width=248) (actual time=16.266..77.452 rows=6 loops=1) -> Seq Scan on int4_tbl (cost=0.00..1.05rows=5 width=4) (actual time=0.055..0.105 rows=5 loops=1) -> Bitmap Heap Scan on tenk1 (cost=9.07..13.08rows=1 width=244) (actual time=15.387..15.396 rows=1 loops=5) Recheck Cond: (((tenk1.unique1 = 40)OR (tenk1.unique1 = 50) OR (tenk1.unique1 = "outer".f1)) AND (tenk1.unique2 >= 3999) AND (tenk1.unique2 < 9999)) -> BitmapAnd (cost=9.07..9.07 rows=50 width=0) (actual time=15.353..15.353 rows=0 loops=5) -> BitmapOr (cost=6.52..6.52 rows=50 width=0) (actual time=0.152..0.152 rows=0 loops=5) -> Bitmap IndexScan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.059..0.059 rows=1 loops=5) Index Cond: (unique1 = 40) -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0)(actual time=0.032..0.032 rows=1 loops=5) Index Cond: (unique1 = 50) -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.029..0.029 rows=0 loops=5) Index Cond: (tenk1.unique1 = "outer".f1) -> Bitmap Index Scan on tenk1_unique2 (cost=0.00..2.30rows=50 width=0) (actual time=15.148..15.148 rows=6000 loops=5) Index Cond: ((unique2>= 3999) AND (unique2 < 9999))Total runtime: 78.369 ms (15 rows) What exactly do we want to count here? The 6000 TIDs pulled from tenk1_unique2 don't translate into much of anything at the heap access stage. (Shortly I'm going to add some logic to not bother using very nonselective index conditions in BitAnd, but it's not there right now.) > I'm not clear on why bitmapscan doesn't bump tuples_returned. Can you > explain? Well, there was no such bump in the bits of code I cribbed to make nodeBitmapHeapScan and friends ;-). It is easy enough to add, once we have a clear idea of what we want to count, but I don't feel that I have that idea yet. regards, tom lane
On 4/22/2005 3:30 PM, Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: >> Well, technically a bitmapscan is a different operation. So it should >> probably have its own columns. Unless you're talking about an overhaul of >> the stats views more drastic than that? If so, what? > > That was basically what I was asking: do we expand all the stats support > to handle this as a separate path, and if so what does it look like > exactly? I'm particularly unclear what to do at the level of the > functions described in > http://www.postgresql.org/docs/8.0/static/monitoring-stats.html#MONITORING-STATS-FUNCS-TABLE > > I've never fully understood the distinction the stats make between > "tuples fetched" and "tuples returned", and it's even less obvious how > to apply it when the index and heap operations are decoupled. In > particular the function design assumes that heap tuple fetches can be > tied to particular indexes, which is now a broken assumption. You > might be amused by this test case I just finished debugging: tuples fetched is the number of raw, possibly dead tuples fetched from the heap. Tuples returned is the number of alive tuples ... IIRC. Jan > > regression=# explain analyze select * from tenk1, int4_tbl where unique1 in (40,50,f1) and unique2 >= 3999 and unique2< 9999; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------------------------- > Nested Loop (cost=9.07..66.56 rows=4 width=248) (actual time=16.266..77.452 rows=6 loops=1) > -> Seq Scan on int4_tbl (cost=0.00..1.05 rows=5 width=4) (actual time=0.055..0.105 rows=5 loops=1) > -> Bitmap Heap Scan on tenk1 (cost=9.07..13.08 rows=1 width=244) (actual time=15.387..15.396 rows=1 loops=5) > Recheck Cond: (((tenk1.unique1 = 40) OR (tenk1.unique1 = 50) OR (tenk1.unique1 = "outer".f1)) AND (tenk1.unique2>= 3999) AND (tenk1.unique2 < 9999)) > -> BitmapAnd (cost=9.07..9.07 rows=50 width=0) (actual time=15.353..15.353 rows=0 loops=5) > -> BitmapOr (cost=6.52..6.52 rows=50 width=0) (actual time=0.152..0.152 rows=0 loops=5) > -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.059..0.059rows=1 loops=5) > Index Cond: (unique1 = 40) > -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.032..0.032rows=1 loops=5) > Index Cond: (unique1 = 50) > -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..2.17 rows=50 width=0) (actual time=0.029..0.029rows=0 loops=5) > Index Cond: (tenk1.unique1 = "outer".f1) > -> Bitmap Index Scan on tenk1_unique2 (cost=0.00..2.30 rows=50 width=0) (actual time=15.148..15.148 rows=6000loops=5) > Index Cond: ((unique2 >= 3999) AND (unique2 < 9999)) > Total runtime: 78.369 ms > (15 rows) > > What exactly do we want to count here? The 6000 TIDs pulled from > tenk1_unique2 don't translate into much of anything at the heap > access stage. (Shortly I'm going to add some logic to not bother > using very nonselective index conditions in BitAnd, but it's not there > right now.) > >> I'm not clear on why bitmapscan doesn't bump tuples_returned. Can you >> explain? > > Well, there was no such bump in the bits of code I cribbed to make > nodeBitmapHeapScan and friends ;-). It is easy enough to add, once > we have a clear idea of what we want to count, but I don't feel that > I have that idea yet. > > regards, tom lane -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Tom, > I've never fully understood the distinction the stats make between > "tuples fetched" and "tuples returned", and it's even less obvious how > to apply it when the index and heap operations are decoupled. Well, it's mainly a counter to measure how many dead rows are in your active data set. It probably belongs elsewhere, such as pg_stats_all_tables with the overall fetch counts. > In > particular the function design assumes that heap tuple fetches can be > tied to particular indexes, which is now a broken assumption. You > might be amused by this test case I just finished debugging: Well, I'd be in favor of moving the "useful" version of tuples_returned to pg_stat_*_tables as an overall count. However, we cant drop the column from pg_stat_indexes without breaking apps; at best, we'd have to warn people of the accuracy issues and give it a few versions before we dropped it. -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Jan Wieck <JanWieck@Yahoo.com> writes: > tuples fetched is the number of raw, possibly dead tuples fetched from > the heap. Tuples returned is the number of alive tuples ... IIRC. No, count_heap_fetch only counts tuples that have already passed the snapshot test. It could be that the places where the counts are actually bumped don't line up with your original vision for the stats design. For a regular index scan, it seems to make sense to count (a) number of TIDs returned by the index AM, and (b) number of tuples returned by the IndexScan node. There are several intermediate steps* does the tuple pass the snapshot test* does the tuple pass any indexqualrechecks (for lossy indexes)* does the tuple pass any additional non-index restriction conditions that are beingenforced at the scan level but I'm not sure that counting all the intermediate steps is interesting. The places where the counts are actually taken don't quite line up with this ... there are also some other macros like count_heap_scan and count_index_scan that seem a bit randomly placed. The description in table 23-2 makes it sound like those should count once per scan, but they are bumped inside the per-tuple loops ... regards, tom lane
Josh Berkus <josh@agliodbs.com> writes: >> I've never fully understood the distinction the stats make between >> "tuples fetched" and "tuples returned", and it's even less obvious how >> to apply it when the index and heap operations are decoupled. > Well, it's mainly a counter to measure how many dead rows are in your active > data set. You may think that's what it is, but given where the counts are actually placed in the code, it does no such thing. AFAICS there is no count of dead tuples at all in the seqscan case, and in the indexscan case the only counter that advances before the snapshot test is pgstat_count_index_scan, which isn't counting tuples so much as amgetnext calls, and is documented in a way that suggests it does something completely different :-( regards, tom lane
On 4/22/2005 3:53 PM, Tom Lane wrote: > Jan Wieck <JanWieck@Yahoo.com> writes: >> tuples fetched is the number of raw, possibly dead tuples fetched from >> the heap. Tuples returned is the number of alive tuples ... IIRC. > > No, count_heap_fetch only counts tuples that have already passed the > snapshot test. It could be that the places where the counts are > actually bumped don't line up with your original vision for the > stats design. > > For a regular index scan, it seems to make sense to count (a) number of > TIDs returned by the index AM, and (b) number of tuples returned by the > IndexScan node. There are several intermediate steps > * does the tuple pass the snapshot test > * does the tuple pass any indexqual rechecks (for lossy indexes) > * does the tuple pass any additional non-index restriction > conditions that are being enforced at the scan level Now that you say it ... yes. The whole stats stuff was intended originally to find "DB tuning hints". A large number of tuples returned by index scan and filtered out by additional non-index restrictions indicate that there might be another multicolumn index missing. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
Tom, Hmmm ... we need to flag *something* in pg_stat_*_indexes, whether it is a new column or the tuplefetch column. People use that view to find indexes they can drop. -- --Josh Josh Berkus Aglio Database Solutions San Francisco
On R, 2005-04-22 at 13:46 -0700, Josh Berkus wrote: > Tom, > > Hmmm ... we need to flag *something* in pg_stat_*_indexes, whether it is a new > column or the tuplefetch column. People use that view to find indexes they > can drop. I think that "idx_scan" and "idx_tup_read" can have the same meaning as for other index types, i.e. of number of scans and number of index tuples read (and copied into bitmap). and as idx_tup_fetch seems to be == idx_tup_read, it can be the same for bitmap indexes as well :) but if the view will be changed, I'd like to see column idx_pages_read added there. -- Hannu Krosing <hannu@tm.ee>
Quite some time ago I complained about the fact that bitmap index scans weren't being counted sanely by the statistics mechanism: http://archives.postgresql.org/pgsql-hackers/2005-04/msg00675.php That discussion trailed off without deciding how to fix it, but we really can't let this go without fixing it in 8.1. I studied the code some more and realized that we had been operating under some fundamental misconceptions. The distinction made in the existing stats code between "tuples fetched" and "tuples returned" has nothing whatever to do with live vs. dead tuples --- all these counts are made only after determining that a tuple is visible. The way it really works in 8.0 is: table "tuples_returned": tuples returned by heap_getnext, ie, live tuples found by seqscanstable "tuples_fetched": tuplesreturned by heap_fetch under conditions other than being invoked by an indexscan (this covers various randomcases like ANALYZE and TID scans)index "tuples_fetched": tuples returned by heap_fetch when invoked by an indexscanon this indexindex "tuples_returned": actually, exactly the same as tuples_fetched. This possibly explains why the original design of the pg_stat_all_tables view exposed only two of the seemingly four interesting counts. I have just committed changes that redefine the counts like this: table "tuples_returned": same as before, ie, live tuples found by seqscanstable "tuples_fetched": tuples returned by heap_fetchwhen invoked by a bitmap scan (the random other cases no longer get counted at all)index "tuples_fetched":same as before, ie, live tuples fetched by simple indexscans using this indexindex "tuples_returned":number of index entries returned from the index AM, counting both simple and bitmap scans. The pg_stat_all_tables view is modified to add the table's tuples_fetched count to the sum of the per-index tuples_fetched counts, so that idx_tup_fetch counts both simple and bitmap index scans. It's possible to break these out by looking at the low-level statistics functions, however. With the new definitions you can get some weak information about the numbers of dead tuples fetched by indexscans, which was not possible at all before. (It's weak because it's not easy to distinguish differences due to dead tuples from differences due to bitmap scanning.) In the earlier discussion, Josh commented that getting stats about dead tuples probably belongs somewhere else anyway, and I'm inclined to agree with that; so I don't feel too bad about not having provided more complete information. regards, tom lane