Re: [GENERAL] explain analyze showed improved results withoutchanges, why? - Mailing list pgsql-general

From Peter J. Holzer
Subject Re: [GENERAL] explain analyze showed improved results withoutchanges, why?
Date
Msg-id 20161225091547.GA18260@hjp.at
Whole thread Raw
In response to [GENERAL] explain analyze showed improved results without changes, why?  (Chris Richards <chris@infinite.io>)
List pgsql-general
On 2016-12-23 10:35:26 -0600, Chris Richards wrote:
> Howdy. I was noticing a significant problem with a query on one of my tables. I
> tried recreating the problem and wasn't able to do so on a different install,
> and so a few days later I ran the same query on the problem table. Lo' and
> behold, there wasn't a problem anymore. I'm at a loss to why.
[...]
>     "blocks_off_sz_idx" btree (off, sz)
>
> mdb=> explain analyze SELECT * FROM blocks
>  WHERE cloudidx=98038 AND off+sz >= 0 AND state='seeded'
>  ORDER BY off LIMIT 1 FOR UPDATE;
>                                                                       QUERY
> PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.43..587.83 rows=1 width=100) (actual time=4814.579..4814.579 rows=1 loops=1)
>    ->  LockRows  (cost=0.43..1358633.99 rows=2313 width=100) (actual time= 4814.577..4814.577 rows=1 loops=1)
>          ->  Index Scan using blocks_off_sz_idx on blocks  (cost= 0.43..1358610.86 rows=2313 width=100) (actual
time=4813.498..4814.384rows=2 loops=1) 
>                Filter: ((cloudidx = 98038) AND (state = 'seeded'::block_state) AND ((off + sz) >= 0))
>                Rows Removed by Filter: 6935023
>  Total runtime: 4814.619 ms
> (6 rows)

This scans the table in ascending (off, sz) order until it finds one row
matching the filter. Apparently at the time of the query there were
6935023 rows in the table before the matching row.

[...]
> And here's the second. Notice that even though there are more rows, it was much
> faster and the "rows removed by filter" were significantly reduced by several
> orders of magnitude.
>
>
> mdb=> explain analyze SELECT * FROM blocks
>  WHERE cloudidx=98038 AND off+sz >= 0 AND state='seeded'
>  ORDER BY off LIMIT 1 FOR UPDATE;
>                                                                    QUERY PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.43..584.57 rows=1 width=100) (actual time=0.071..0.071 rows=1 loops=1)
>    ->  LockRows  (cost=0.43..1390825.21 rows=2381 width=100) (actual time=0.070..0.070 rows=1 loops=1)
>          ->  Index Scan using blocks_off_sz_idx on blocks  (cost=0.43..1390801.40 rows=2381 width=100) (actual
time=0.055..0.055rows=1 loops=1) 
>                Filter: ((cloudidx = 98038) AND (state = 'seeded'::block_state) AND ((off + sz) >= 0))
>                Rows Removed by Filter: 26
>  Total runtime: 0.114 ms
> (6 rows)

The plan here is exactly the same, but only 26 rows are discarded. My
guess is that between those two queries a row was inserted with a really
low (off, sz) value which matches the query. So now the query can return
after checking only a handful of rows.

LIMIT, EXISTS, etc. are awful when you want predictable performance. You
may be lucky and the rows you are looking for are just at the start or
you may be unlucky and you have to scan through the whole table to find
them. The optimizer (usually) doesn't have enough information and
assumes they are spread randomly through the table.

        hp

--
   _  | Peter J. Holzer    | A coding theorist is someone who doesn't
|_|_) |                    | think Alice is crazy.
| |   | hjp@hjp.at         | -- John Gordon
__/   | http://www.hjp.at/ |    http://downlode.org/Etext/alicebob.html

Attachment

pgsql-general by date:

Previous
From: Stephen Frost
Date:
Subject: Re: [GENERAL] Is there a reason the "-v/verbose" switch is notprinted with pg_dumpall --help
Next
From: Günce Kaya
Date:
Subject: Re: [GENERAL] Er Data Modeller for PostgreSQL