Thread: [GENERAL] explain analyze showed improved results without changes, why?
Howdy. I was noticing a significant problem with a query on one of my tables. I tried recreating the problem and wasn't able to do so on a different install, and so a few days later I ran the same query on the problem table. Lo' and behold, there wasn't a problem anymore. I'm at a loss to why. The indices and two explains follow below. Thanks in advance for the help.
Cheers,
Chris
Indexes:
"blocks_pkey" PRIMARY KEY, btree (cloudidx, blkid)
"blocks_blkid_idx" btree (blkid)
"blocks_cloudidx_idx" btree (cloudidx)
"blocks_off_sz_idx" btree (off, sz)
mdb=> explain analyze SELECT * FROM blocks
WHERE cloudidx=98038 AND off+sz >= 0 AND state='seeded'
ORDER BY off LIMIT 1 FOR UPDATE;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..587.83 rows=1 width=100) (actual time=4814.579..4814.579 rows=1 loops=1)
-> LockRows (cost=0.43..1358633.99 rows=2313 width=100) (actual time=4814.577..4814.577 rows=1 loops=1)
-> Index Scan using blocks_off_sz_idx on blocks (cost=0.43..1358610.86 rows=2313 width=100) (actual time=4813.498..4814.384 rows=2 loops=1)
Filter: ((cloudidx = 98038) AND (state = 'seeded'::block_state) AND ((off + sz) >= 0))
Rows Removed by Filter: 6935023
Total runtime: 4814.619 ms
(6 rows)
mdb=> select count(*) from blocks;
count
----------
11052135
(1 row)
mdb=> select count (distinct (cloudidx)) from blocks;
count
-------
4549
(1 row)
And here's the second. Notice that even though there are more rows, it was much faster and the "rows removed by filter" were significantly reduced by several orders of magnitude.
mdb=> explain analyze SELECT * FROM blocks WHERE cloudidx=98038 AND off+sz >= 0 AND state='seeded' ORDER BY off LIMIT 1 FOR UPDATE; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.43..584.57 rows=1 width=100) (actual time=0.071..0.071 rows=1 loops=1) -> LockRows (cost=0.43..1390825.21 rows=2381 width=100) (actual time=0.070..0.070 rows=1 loops=1) -> Index Scan using blocks_off_sz_idx on blocks (cost=0.43..1390801.40 rows=2381 width=100) (actual time=0.055..0.055 rows=1 loops=1) Filter: ((cloudidx = 98038) AND (state = 'seeded'::block_state) AND ((off + sz) >= 0)) Rows Removed by Filter: 26 Total runtime: 0.114 ms (6 rows) mdb=> select count(*) from blocks; count ---------- 11328801 (1 row) mdb=> select count (distinct (cloudidx)) from blocks; count ------- 4613 (1 row)
Re: [GENERAL] explain analyze showed improved results withoutchanges, why?
From
"Peter J. Holzer"
Date:
On 2016-12-23 10:35:26 -0600, Chris Richards wrote: > Howdy. I was noticing a significant problem with a query on one of my tables. I > tried recreating the problem and wasn't able to do so on a different install, > and so a few days later I ran the same query on the problem table. Lo' and > behold, there wasn't a problem anymore. I'm at a loss to why. [...] > "blocks_off_sz_idx" btree (off, sz) > > mdb=> explain analyze SELECT * FROM blocks > WHERE cloudidx=98038 AND off+sz >= 0 AND state='seeded' > ORDER BY off LIMIT 1 FOR UPDATE; > QUERY > PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.43..587.83 rows=1 width=100) (actual time=4814.579..4814.579 rows=1 loops=1) > -> LockRows (cost=0.43..1358633.99 rows=2313 width=100) (actual time= 4814.577..4814.577 rows=1 loops=1) > -> Index Scan using blocks_off_sz_idx on blocks (cost= 0.43..1358610.86 rows=2313 width=100) (actual time=4813.498..4814.384rows=2 loops=1) > Filter: ((cloudidx = 98038) AND (state = 'seeded'::block_state) AND ((off + sz) >= 0)) > Rows Removed by Filter: 6935023 > Total runtime: 4814.619 ms > (6 rows) This scans the table in ascending (off, sz) order until it finds one row matching the filter. Apparently at the time of the query there were 6935023 rows in the table before the matching row. [...] > And here's the second. Notice that even though there are more rows, it was much > faster and the "rows removed by filter" were significantly reduced by several > orders of magnitude. > > > mdb=> explain analyze SELECT * FROM blocks > WHERE cloudidx=98038 AND off+sz >= 0 AND state='seeded' > ORDER BY off LIMIT 1 FOR UPDATE; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.43..584.57 rows=1 width=100) (actual time=0.071..0.071 rows=1 loops=1) > -> LockRows (cost=0.43..1390825.21 rows=2381 width=100) (actual time=0.070..0.070 rows=1 loops=1) > -> Index Scan using blocks_off_sz_idx on blocks (cost=0.43..1390801.40 rows=2381 width=100) (actual time=0.055..0.055rows=1 loops=1) > Filter: ((cloudidx = 98038) AND (state = 'seeded'::block_state) AND ((off + sz) >= 0)) > Rows Removed by Filter: 26 > Total runtime: 0.114 ms > (6 rows) The plan here is exactly the same, but only 26 rows are discarded. My guess is that between those two queries a row was inserted with a really low (off, sz) value which matches the query. So now the query can return after checking only a handful of rows. LIMIT, EXISTS, etc. are awful when you want predictable performance. You may be lucky and the rows you are looking for are just at the start or you may be unlucky and you have to scan through the whole table to find them. The optimizer (usually) doesn't have enough information and assumes they are spread randomly through the table. hp -- _ | Peter J. Holzer | A coding theorist is someone who doesn't |_|_) | | think Alice is crazy. | | | hjp@hjp.at | -- John Gordon __/ | http://www.hjp.at/ | http://downlode.org/Etext/alicebob.html