Thread: index & Bitmap Heap Scan
Hello,
I have a table (stats.tickets) with 2288965 rows (51 columns) and indexes like:
ind_ti_stats_numero btree (tday, tmonth, tyear, r_cat, r_numero)
ind_ti_stats_service btree (tday, tmonth, tyear, r_cat, r_service)
ind_ti_stats_tmp_service btree (r_service, tyear, tmonth)
ind_ti_stats_tmp_service2 btree (r_service, tyear, tmonth, r_cat)
Now if i do :
1°)# explain analyze SELECT tday AS n, '' AS class, a.r_cat AS cat, COUNT(*) AS cnx, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE 1 END) AS p, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 1 ELSE 0 END) AS np, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE r_duree END) AS tps, ROUND(AVG(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE t_duree1 END),0) AS tmc FROM stats.tickets AS a WHERE a.r_numero='99084040' AND tyear = 2007 AND tmonth = 8 GROUP BY tyear, tmonth, tday, a.r_cat;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=45412.96..45412.99 rows=1 width=34) (actual time=649.944..650.178 rows=50 loops=1)
-> Index Scan using ind_ti_stats_numero on tickets a (cost=0.00..45385.46 rows=1222 width=34) (actual time=15.697..642.570 rows=1043 loops=1)
Index Cond: ((tmonth = 8) AND (tyear = 2007) AND ((r_numero)::text = '99084040'::text))
Total runtime: 650.342 ms
(4 lignes)
Temps : 652,234 ms
2°)
# explain analyze SELECT tday AS n, '' AS class, a.r_cat AS cat, COUNT(*) AS cnx, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE 1 END) AS p, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 1 ELSE 0 END) AS np, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE r_duree END) AS tps, ROUND(AVG(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE t_duree1 END),0) AS tmc FROM stats.tickets AS a WHERE a.r_service=95 AND tyear = 2007 AND tmonth = 8 GROUP BY tyear, tmonth, tday, a.r_cat;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=193969.97..193970.88 rows=26 width=34) (actual time=20834.559..20834.694 rows=27 loops=1)
-> Bitmap Heap Scan on tickets a (cost=3714.84..186913.32 rows=313629 width=34) (actual time=889.880..19028.315 rows=321395 loops=1)
Recheck Cond: ((r_service = 95) AND (tyear = 2007) AND (tmonth = 8))
-> Bitmap Index Scan on ind_ti_stats_tmp_service (cost=0.00..3714.84 rows=313629 width=0) (actual time=836.181..836.181 rows=321395 loops=1)
Index Cond: ((r_service = 95) AND (tyear = 2007) AND (tmonth = 8))
Total runtime: 20835.191 ms
(6 lignes)
Temps : 20838,798 ms
\d stats.tickets
[...]
r_numero | character varying(17) | not null
r_service | integer | not null default 0
[...]
stats.tickets has 173351 relpages , 2.30996e+06 reltuples.
Why in the first case, pgsql uses the "better" index and if i search r_service instead of r_numero pgsql does a "Bitmap Heap scan" first ?
There ara too much rows in this table ?
PS: sorry for my english, i'm french.
--
Paul.
I have a table (stats.tickets) with 2288965 rows (51 columns) and indexes like:
ind_ti_stats_numero btree (tday, tmonth, tyear, r_cat, r_numero)
ind_ti_stats_service btree (tday, tmonth, tyear, r_cat, r_service)
ind_ti_stats_tmp_service btree (r_service, tyear, tmonth)
ind_ti_stats_tmp_service2 btree (r_service, tyear, tmonth, r_cat)
Now if i do :
1°)# explain analyze SELECT tday AS n, '' AS class, a.r_cat AS cat, COUNT(*) AS cnx, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE 1 END) AS p, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 1 ELSE 0 END) AS np, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE r_duree END) AS tps, ROUND(AVG(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE t_duree1 END),0) AS tmc FROM stats.tickets AS a WHERE a.r_numero='99084040' AND tyear = 2007 AND tmonth = 8 GROUP BY tyear, tmonth, tday, a.r_cat;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=45412.96..45412.99 rows=1 width=34) (actual time=649.944..650.178 rows=50 loops=1)
-> Index Scan using ind_ti_stats_numero on tickets a (cost=0.00..45385.46 rows=1222 width=34) (actual time=15.697..642.570 rows=1043 loops=1)
Index Cond: ((tmonth = 8) AND (tyear = 2007) AND ((r_numero)::text = '99084040'::text))
Total runtime: 650.342 ms
(4 lignes)
Temps : 652,234 ms
2°)
# explain analyze SELECT tday AS n, '' AS class, a.r_cat AS cat, COUNT(*) AS cnx, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE 1 END) AS p, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 1 ELSE 0 END) AS np, SUM(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE r_duree END) AS tps, ROUND(AVG(CASE WHEN t_duree1 < r_palier_min_con THEN 0 ELSE t_duree1 END),0) AS tmc FROM stats.tickets AS a WHERE a.r_service=95 AND tyear = 2007 AND tmonth = 8 GROUP BY tyear, tmonth, tday, a.r_cat;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=193969.97..193970.88 rows=26 width=34) (actual time=20834.559..20834.694 rows=27 loops=1)
-> Bitmap Heap Scan on tickets a (cost=3714.84..186913.32 rows=313629 width=34) (actual time=889.880..19028.315 rows=321395 loops=1)
Recheck Cond: ((r_service = 95) AND (tyear = 2007) AND (tmonth = 8))
-> Bitmap Index Scan on ind_ti_stats_tmp_service (cost=0.00..3714.84 rows=313629 width=0) (actual time=836.181..836.181 rows=321395 loops=1)
Index Cond: ((r_service = 95) AND (tyear = 2007) AND (tmonth = 8))
Total runtime: 20835.191 ms
(6 lignes)
Temps : 20838,798 ms
\d stats.tickets
[...]
r_numero | character varying(17) | not null
r_service | integer | not null default 0
[...]
stats.tickets has 173351 relpages , 2.30996e+06 reltuples.
Why in the first case, pgsql uses the "better" index and if i search r_service instead of r_numero pgsql does a "Bitmap Heap scan" first ?
There ara too much rows in this table ?
PS: sorry for my english, i'm french.
--
Paul.
Paul <paul@wayr.org> writes: > Why in the first case, pgsql uses the "better" index and if i search > r_service instead of r_numero pgsql does a "Bitmap Heap scan" first ? Given the difference in the number of rows to be fetched, both plan choices look pretty reasonable to me. If you want to experiment, you can try forcing the other choice in each case (use enable_indexscan and enable_bitmapscan) and see how fast it is, but I suspect the planner got it right. Beware of cache effects when trying two plans in quick succession --- the second one might go faster just because all the data is already swapped in. regards, tom lane
Thank you for your answer. Now i ve to find how to reduce the size of the table. Paul. Le mardi 28 août 2007 à 12:55 -0400, Tom Lane a écrit : > Paul <paul@wayr.org> writes: > > Why in the first case, pgsql uses the "better" index and if i search > > r_service instead of r_numero pgsql does a "Bitmap Heap scan" first ? > > Given the difference in the number of rows to be fetched, both plan > choices look pretty reasonable to me. If you want to experiment, > you can try forcing the other choice in each case (use enable_indexscan > and enable_bitmapscan) and see how fast it is, but I suspect the planner > got it right. > > Beware of cache effects when trying two plans in quick succession --- > the second one might go faster just because all the data is already > swapped in. > > regards, tom lane >