Thread: Slow performance after restoring a dump

Slow performance after restoring a dump

From: David Osborne
Hi, I was wondering if someone could help us work out why this query is so slow.

We've just dumped a database (PostgreSQL 9.1) and restored it to a new instance (AWS RDS, PostgreSQL 9.6) via pg_dump, restoring with psql.

We immediately see that the following fairly straightforward query is now extremely slow, with a huge number of shared buffer hits.
On the new instance it takes 25 seconds; on the original system, 0.05 seconds.

The database schemas are the same (with the same columns indexed), so I'm guessing a configuration difference must be sending the planner down a different route.
I have run ANALYZE on the whole database since restoring.

Also, if I disable index scans on the new instance, the query takes 0.047 seconds.

Can someone point us in the right direction on what's going on here?


Query:

explain (buffers,analyse)  select
            trans_date
            from stock_trans s
            join account_trans a using(account_trans_id)
            where product_id=2466420
            and credit_stock_account_id=3
            order by trans_date desc
            limit 1;


Bad Performance on New Instance:



                                                                                    QUERY PLAN          
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.87..9086.72 rows=1 width=4) (actual time=25829.287..25829.287 rows=0 loops=1)
   Buffers: shared hit=43944066
   ->  Nested Loop  (cost=0.87..6196547.28 rows=682 width=4) (actual time=25829.286..25829.286 rows=0 loops=1)
         Buffers: shared hit=43944066
         ->  Index Scan Backward using account_trans_date_idx on account_trans a  (cost=0.43..392996.60 rows=11455133 width=8) (actual time=0.007..3401.027 rows=11455133 loops=1)
               Buffers: shared hit=251082
         ->  Index Scan using stock_trans_account_trans_idx on stock_trans s  (cost=0.43..0.50 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=11455133)
               Index Cond: (account_trans_id = a.account_trans_id)
               Filter: ((product_id = 2466420) AND (credit_stock_account_id = 3))
               Rows Removed by Filter: 1
               Buffers: shared hit=43692984
 Planning time: 0.271 ms
 Execution time: 25829.316 ms
(13 rows)


With index scans disabled:

=> set enable_indexscan=off;


                                                                      QUERY PLAN                        
------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=11049.80..11049.81 rows=1 width=4) (actual time=0.018..0.018 rows=0 loops=1)
   Buffers: shared hit=3
   ->  Sort  (cost=11049.80..11051.51 rows=682 width=4) (actual time=0.017..0.017 rows=0 loops=1)
         Sort Key: a.trans_date DESC
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=3
         ->  Nested Loop  (cost=35.99..11046.39 rows=682 width=4) (actual time=0.011..0.011 rows=0 loops=1)
               Buffers: shared hit=3
               ->  Bitmap Heap Scan on stock_trans s  (cost=31.59..5301.09 rows=682 width=4) (actual time=0.011..0.011 rows=0 loops=1)
                     Recheck Cond: (product_id = 2466420)
                     Filter: (credit_stock_account_id = 3)
                     Buffers: shared hit=3
                     ->  Bitmap Index Scan on stock_trans_product_idx  (cost=0.00..31.42 rows=1465 width=0) (actual time=0.009..0.009 rows=0 loops=1)
                           Index Cond: (product_id = 2466420)
                           Buffers: shared hit=3
               ->  Bitmap Heap Scan on account_trans a  (cost=4.40..8.41 rows=1 width=8) (never executed)
                     Recheck Cond: (account_trans_id = s.account_trans_id)
                     ->  Bitmap Index Scan on account_trans_pkey  (cost=0.00..4.40 rows=1 width=0) (never executed)
                           Index Cond: (account_trans_id = s.account_trans_id)
 Planning time: 0.272 ms
 Execution time: 0.047 ms
(21 rows)


EXPLAIN for the same query on the original database:


                                                                          QUERY PLAN                                                  
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=9811.51..9811.52 rows=1 width=4) (actual time=0.020..0.020 rows=0 loops=1)
   Buffers: shared hit=3
   ->  Sort  (cost=9811.51..9813.23 rows=685 width=4) (actual time=0.019..0.019 rows=0 loops=1)
         Sort Key: a.trans_date
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=3
         ->  Nested Loop  (cost=0.00..9808.09 rows=685 width=4) (actual time=0.014..0.014 rows=0 loops=1)
               Buffers: shared hit=3
               ->  Index Scan using stock_trans_product_idx on stock_trans s  (cost=0.00..3300.01 rows=685 width=4) (actual time=0.014..0.014 rows=0 loops=1)
                     Index Cond: (product_id = 2466420)
                     Filter: (credit_stock_account_id = 3)
                     Buffers: shared hit=3
               ->  Index Scan using account_trans_pkey on account_trans a  (cost=0.00..9.49 rows=1 width=8) (never executed)
                     Index Cond: (account_trans_id = s.account_trans_id)
 Total runtime: 0.050 ms
(15 rows)


Regards,
--
David 

Re: Slow performance after restoring a dump

From: Tom Lane
Date: 19 March 2018, 15:35
David Osborne <david@qcode.co.uk> writes:
> Hi, I was wondering if someone could help us work out why this query is so
> slow.
> We've just dumped a database (Postgresql 9.1) and restored it to a new
> instance (AWS RDS 9.6) (via pg_dump, restored to psql)

The first question people will ask is did you re-ANALYZE the new
database?  pg_dump doesn't take care of that for you, and auto-analyze
might not think it needs to process the smaller tables.
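
If not, a plain database-wide ANALYZE from psql is the usual first step
after a restore; the per-table form works too if only a few tables
matter here, e.g. (untested):

    -- collect planner statistics for every table in the current database
    analyze;

    -- or just the tables involved in this query
    analyze account_trans;
    analyze stock_trans;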

            regards, tom lane


Re: Slow performance after restoring a dump

From: David Osborne
Hi, yes I've run "analyse" against the newly restored database. Should that be enough?


Re: Slow performance after restoring a dump

From: Tom Lane
Date: 19 March 2018, 16:22
David Osborne <david@qcode.co.uk> writes:
> Hi, yes I've run "analyse" against the newly restored database. Should that
> be enough?

My apologies, you did say that further down in the original message.
It looks like the core of the problem is the poor rowcount estimation
here:

                     ->  Bitmap Index Scan on stock_trans_product_idx (cost=0.00..31.42 rows=1465 width=0) (actual time=0.009..0.009 rows=0 loops=1)
                           Index Cond: (product_id = 2466420)
                           Buffers: shared hit=3

You might be able to improve that by raising the statistics target
for stock_trans.product_id.  I'm not sure why you weren't getting
bitten by the same issue in 9.1; but the cost estimates aren't
that far apart for the two plans, so maybe you were just lucky ...
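
Something along these lines, for instance (untested; 1000 is just a
first guess at a target, and the new value only kicks in at the next
ANALYZE of the table):

    alter table stock_trans alter column product_id set statistics 1000;
    analyze stock_trans;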

            regards, tom lane


Re: Slow performance after restoring a dump

From: David Osborne
That did the trick, thanks! Yes, perhaps a minor planner difference just tipped us over the edge previously.

=> alter table stock_trans alter column product_id set STATISTICS 1000;
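
(The raised target only shows up in the statistics after the table is
re-analysed, e.g.:)

=> analyse stock_trans;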

                                                                          QUERY PLAN                    
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3461.10..3461.10 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)
   Buffers: shared hit=3
   ->  Sort  (cost=3461.10..3461.75 rows=260 width=4) (actual time=0.013..0.013 rows=0 loops=1)
         Sort Key: a.trans_date DESC
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=3
         ->  Nested Loop  (cost=0.87..3459.80 rows=260 width=4) (actual time=0.008..0.008 rows=0 loops=1)
               Buffers: shared hit=3
               ->  Index Scan using stock_trans_product_idx on stock_trans s  (cost=0.43..1263.55 rows=260 width=4) (actual time=0.007..0.007 rows=0 loops=1)
                     Index Cond: (product_id = 2466420)
                     Filter: (credit_stock_account_id = 3)
                     Buffers: shared hit=3
               ->  Index Scan using account_trans_pkey on account_trans a  (cost=0.43..8.44 rows=1 width=8) (never executed)
                     Index Cond: (account_trans_id = s.account_trans_id)
 Planning time: 0.255 ms
 Execution time: 0.039 ms
(16 rows)







--
David Osborne
Qcode Software Limited
T: +44 (0)1463 896484