Re: Specific query taking time to process - Mailing list pgsql-performance

From Fahiz Mohamed
Subject Re: Specific query taking time to process
Date
Msg-id 7cb455c5-dd6a-4f33-bdbe-92033f7986e2@Spark
In response to Re: Specific query taking time to process  (Michael Lewis <mlewis@entrata.com>)
Responses Re: Specific query taking time to process  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
There is a slight difference in the two instances' data. Instance 1 contains the latest data, and instance 2 contains data that is 3 weeks older than instance 1.

Following are the row counts for each table in both instances:

Instance 1
alf_node : 96493129 rows
alf_node_properties : 455599288 rows
alf_node_aspects : 150153006 rows

Instance 2
alf_node : 90831396 rows
alf_node_properties : 440792247 rows
alf_node_aspects : 146648241 rows

I am not sure that data difference alone can make such a drastic difference in performance. Please correct me if I am wrong.

I checked "pg_stat_user_tables" and autovacuum never run against those tables.

I executed VACUUM manually, and I noticed the below in the output:

"INFO: vacuuming "public.alf_node_aspects"
INFO: "alf_node_aspects": found 0 removable, 150264654 nonremovable row versions in 812242 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU 13.53s/33.35u sec elapsed 77.88 sec.


INFO: analyzing "public.alf_node_aspects"
INFO: "alf_node_aspects": scanned 30000 of 812242 pages, containing 5550000 live rows and 0 dead rows; 30000 rows in sample, 150264770 estimated total rows"

Did the vacuum work correctly?
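(For anyone reproducing this, those INFO lines are what a verbose vacuum with analyze produces, e.g.:)

-- manual vacuum + analyze with verbose output (PostgreSQL 9.6 syntax)
VACUUM (VERBOSE, ANALYZE) alf_node_aspects;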

I did "explain( analyze, buffers )" below are the results:

Instance 1: 

 Nested Loop Semi Join  (cost=441133.07..1352496.76 rows=1 width=8) (actual time=9054.324..10029.748 rows=22 loops=1)
   Buffers: shared hit=1617812
   ->  Merge Semi Join  (cost=441132.50..1352359.19 rows=41 width=16) (actual time=9054.296..10029.547 rows=22 loops=1)
         Merge Cond: (node.id = prop.node_id)
         Buffers: shared hit=1617701
         ->  Index Only Scan using idx_alf_node_tqn on alf_node node  (cost=441041.93..1340831.48 rows=4593371 width=8) (actual time=4.418..7594.637 rows=40998148 loops=1)
               Index Cond: ((type_qname_id = 240) AND (store_id = 6))
               Filter: (NOT (hashed SubPlan 1))
               Rows Removed by Filter: 130
               Heap Fetches: 0
               Buffers: shared hit=1617681
               SubPlan 1
                 ->  Bitmap Heap Scan on alf_node_aspects aspect_1  (cost=3418.63..440585.22 rows=182459 width=8) (actual time=0.583..2.992 rows=4697 loops=1)
                       Recheck Cond: (qname_id = 251)
                       Heap Blocks: exact=1774
                       Buffers: shared hit=1791
                       ->  Bitmap Index Scan on fk_alf_nasp_qn  (cost=0.00..3373.01 rows=182459 width=0) (actual time=0.384..0.384 rows=4697 loops=1)
                             Index Cond: (qname_id = 251)
                             Buffers: shared hit=17
         ->  Index Only Scan using idx_alf_nprop_s on alf_node_properties prop  (cost=0.70..41.74 rows=852 width=8) (actual time=0.022..0.037 rows=23 loops=1)
               Index Cond: ((qname_id = '242'::bigint) AND (string_value = 'E292432'::text))
               Heap Fetches: 0
               Buffers: shared hit=20
   ->  Index Only Scan using alf_node_aspects_pkey on alf_node_aspects aspect  (cost=0.57..38.80 rows=14 width=8) (actual time=0.007..0.007 rows=1 loops=22)
         Index Cond: ((node_id = node.id) AND (qname_id = 245))
         Heap Fetches: 22
         Buffers: shared hit=111
 Planning time: 0.621 ms
 Execution time: 10029.799 ms
(29 rows)


Instance 2: 

 Nested Loop Semi Join  (cost=6471.94..173560891.23 rows=2 width=8) (actual time=0.162..0.470 rows=17 loops=1)
   Buffers: shared hit=257
   ->  Nested Loop  (cost=6471.37..173560734.51 rows=45 width=16) (actual time=0.154..0.392 rows=17 loops=1)
         Buffers: shared hit=172
         ->  HashAggregate  (cost=3508.15..3516.80 rows=865 width=8) (actual time=0.039..0.043 rows=18 loops=1)
               Group Key: prop.node_id
               Buffers: shared hit=23
               ->  Index Only Scan using idx_alf_nprop_s on alf_node_properties prop  (cost=0.70..3505.99 rows=866 width=8) (actual time=0.019..0.033 rows=18 loops=1)
                     Index Cond: ((qname_id = '242'::bigint) AND (string_value = 'E292432'::text))
                     Heap Fetches: 18
                     Buffers: shared hit=23
         ->  Index Scan using alf_node_pkey on alf_node node  (cost=2963.22..200644.17 rows=1 width=8) (actual time=0.019..0.019 rows=1 loops=18)
               Index Cond: (id = prop.node_id)
               Filter: ((type_qname_id <> 145) AND (store_id = 6) AND (type_qname_id = 240) AND (NOT (SubPlan 1)))
               Rows Removed by Filter: 0
               Buffers: shared hit=149
               SubPlan 1
                 ->  Materialize  (cost=2962.65..397913.00 rows=158204 width=8) (actual time=0.002..0.009 rows=85 loops=17)
                       Buffers: shared hit=59
                       ->  Bitmap Heap Scan on alf_node_aspects aspect_1  (cost=2962.65..396503.98 rows=158204 width=8) (actual time=0.025..0.085 rows=85 loops=1)
                             Recheck Cond: (qname_id = 251)
                             Heap Blocks: exact=55
                             Buffers: shared hit=59
                             ->  Bitmap Index Scan on fk_alf_nasp_qn  (cost=0.00..2923.10 rows=158204 width=0) (actual time=0.019..0.019 rows=87 loops=1)
                                   Index Cond: (qname_id = 251)
                                   Buffers: shared hit=4
   ->  Index Only Scan using alf_node_aspects_pkey on alf_node_aspects aspect  (cost=0.57..34.32 rows=12 width=8) (actual time=0.004..0.004 rows=1 loops=17)
         Index Cond: ((node_id = node.id) AND (qname_id = 245))
         Heap Fetches: 17
         Buffers: shared hit=85
 Planning time: 0.642 ms
 Execution time: 0.546 ms
(32 rows)
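(For readers without the earlier messages in the thread: judging from the plan nodes above, the query is roughly of this shape. This is a reconstruction from the plans, not the exact SQL:)

SELECT node.id
FROM alf_node node
WHERE node.store_id = 6
  AND node.type_qname_id = 240
  AND node.type_qname_id <> 145          -- present in instance 2's filter
  AND node.id NOT IN (SELECT node_id     -- SubPlan 1 in both plans
                      FROM alf_node_aspects
                      WHERE qname_id = 251)
  AND EXISTS (SELECT 1
              FROM alf_node_properties prop
              WHERE prop.node_id = node.id
                AND prop.qname_id = 242
                AND prop.string_value = 'E292432')
  AND EXISTS (SELECT 1
              FROM alf_node_aspects aspect
              WHERE aspect.node_id = node.id
                AND aspect.qname_id = 245);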

The vacuum and reindex on those tables managed to reduce the time from 30s to 10s, but I still don't get the same performance as instance 2.
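(The maintenance was along these lines: the same VACUUM as shown above for each table, plus a reindex:)

-- note: REINDEX blocks writes to the table while it runs
REINDEX TABLE alf_node;
REINDEX TABLE alf_node_properties;
REINDEX TABLE alf_node_aspects;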

work_mem was previously set to 4MB on both instances; on instance 1 it is now set to 160MB, while instance 2 is still using 4MB.

shared_buffers is set to 8GB on both.

Both are on PostgreSQL version 9.6.9.
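(For anyone comparing the two instances, the relevant settings can be checked, and work_mem changed without a restart, along these lines:)

-- compare the relevant settings on each instance
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('work_mem', 'shared_buffers', 'random_page_cost', 'effective_cache_size');

-- raise work_mem cluster-wide (instance 1 was bumped to 160MB)
ALTER SYSTEM SET work_mem = '160MB';
SELECT pg_reload_conf();

-- or per session, to test a single query
SET work_mem = '160MB';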

Thanks in advance.

Fahiz
On 10 Dec 2019, 20:15 +0000, Michael Lewis <mlewis@entrata.com> wrote:
On Mon, Dec 9, 2019 at 3:39 PM Fahiz Mohamed <fahiz@netwidz.com> wrote:
I ran "explain analyse" on this query and I got following result. (We have 2 identical DB instances and they consist of same data. Instane 1 took 20+ second to process and instance 2 took less than a second)

Instance 1: (This is used by regular users - more than 600,000 requests a day.) The result is the same even when there are no users on the server.
 
     -> Bitmap Heap Scan on alf_node node (cost=995009.97..3303978.85 rows=4565737 width=8) (actual time=3304.419..20465.551 rows=41109751 loops=1)
Recheck Cond: ((store_id = 6) AND (type_qname_id = 240))
Rows Removed by Index Recheck: 54239131
Filter: (NOT (hashed SubPlan 1))
Rows Removed by Filter: 2816
Heap Blocks: exact=24301 lossy=1875383

Planning time: 0.639 ms
Execution time: 22946.036 ms


That seems like a lot of lossy blocks. As I understand it, that means the system didn't have enough work_mem to fit all the references to the individual rows, which perhaps isn't surprising when it estimates it needs 4.5 million rows and ends up with 41 million.

Do both DB instances have the same data? I ask because the two plans are rather different, which makes me think that the statistics about the data are not very similar. Are both configured the same, particularly for shared_buffers and work_mem, as well as the various planning cost parameters like random_page_cost? Can you provide these plans again with explain( analyze, buffers ) this time? Did you check the last time autovacuum ran in pg_stat_user_tables?

