Hi PostgreSQL community,
I have a system that was running version 9.6.17 running on a system with
48gb of memory and spinning disks front-ed by a HW RAID controller with
NVRAM cache. We moved to a new box running version 12.3 on a system with
64gb of memory and NVME SSD drives. Here are the system config options:
OLD:
shared_buffers = 2048MB # min 128kB
work_mem = 128MB # min 64kB
maintenance_work_mem = 1024MB # min 1MB
effective_io_concurrency = 8 # 1-1000; 0 disables prefetching
max_parallel_workers_per_gather = 0 # taken from max_worker_processes
effective_cache_size = 24GB
default_statistics_target = 500 # range 1-10000
from_collapse_limit = 30
join_collapse_limit = 30 # 1 disables collapsing of explicit
seq_page_cost = 1.0 # measured on an arbitrary scale
random_page_cost = 4.0 # same scale as above
NEW:
shared_buffers = 12GB # min 128kB
work_mem = 128MB # min 64kB
maintenance_work_mem = 2GB # min 1MB
effective_io_concurrency = 200 # 1-1000; 0 disables prefetching
max_worker_processes = 24 # (change requires restart)
max_parallel_workers_per_gather = 4 # taken from max_parallel_workers
max_parallel_workers = 24 # maximum number of max_worker_processes that
seq_page_cost = 1.0 # measured on an arbitrary scale
random_page_cost = 1.1 # same scale as above for SSDs
effective_cache_size = 36GB
default_statistics_target = 500 # range 1-10000
from_collapse_limit = 30
join_collapse_limit = 30 # 1 disables collapsing of explicit
As far as the schema goes, it uses an id field populated by a sequence as
the primary key for everything. Here are the definitions for the tables
involved in the query:
Table "public.users"
Column | Type | Modifiers
---------------------+-----------------------------+------------------------------------------------------------
id | integer | not null default nextval(('users_id_seq'::text)::regclass)
name | character varying(200) | not null
password | character varying(256) |
comments | text |
signature | text |
emailaddress | character varying(120) |
freeformcontactinfo | text |
organization | character varying(200) |
realname | character varying(120) |
nickname | character varying(16) |
lang | character varying(16) |
gecos | character varying(16) |
homephone | character varying(30) |
workphone | character varying(30) |
mobilephone | character varying(30) |
pagerphone | character varying(30) |
address1 | character varying(200) |
address2 | character varying(200) |
city | character varying(100) |
state | character varying(100) |
zip | character varying(16) |
country | character varying(50) |
timezone | character varying(50) |
creator | integer | not null default 0
created | timestamp without time zone |
lastupdatedby | integer | not null default 0
lastupdated | timestamp without time zone |
authtoken | character varying(16) |
smimecertificate | text |
Indexes:
"users_pkey" PRIMARY KEY, btree (id) CLUSTER
"users1" UNIQUE, btree (lower(name::text))
"users2" btree (lower(emailaddress::text))
"users_email_trgm" gin (emailaddress gin_trgm_ops)
Table "public.principals"
Column | Type | Modifiers
---------------+-----------------------+-----------------------------------------------------------------
id | integer | not null default nextval(('principals_id_seq'::text)::regclass)
principaltype | character varying(16) | not null
disabled | smallint | not null default 0
Indexes:
"principals_pkey" PRIMARY KEY, btree (id) CLUSTER
Table "public.cachedgroupmembers"
Column | Type | Modifiers
-------------------+----------+-------------------------------------------------------------------------
id | integer | not null default nextval(('cachedgroupmembers_id_seq'::text)::regclass)
groupid | integer |
memberid | integer |
via | integer |
immediateparentid | integer |
disabled | smallint | not null default 0
Indexes:
"cachedgroupmembers_pkey" PRIMARY KEY, btree (id)
"cachedgroupmembers1" btree (memberid, immediateparentid)
"cachedgroupmembers4" btree (memberid, groupid, disabled)
"disgroumem" btree (groupid, memberid, disabled)
"shredder_cgm2" btree (immediateparentid, memberid)
"shredder_cgm3" btree (via, id)
Table "public.acl"
Column | Type | Modifiers
---------------+-----------------------------+----------------------------------------------------------
id | integer | not null default nextval(('acl_id_seq'::text)::regclass)
principaltype | character varying(25) | not null
principalid | integer | not null
rightname | character varying(25) | not null
objecttype | character varying(25) | not null
objectid | integer | not null default 0
creator | integer | not null default 0
created | timestamp without time zone |
lastupdatedby | integer | not null default 0
lastupdated | timestamp without time zone |
Indexes:
"acl_pkey" PRIMARY KEY, btree (id)
"acl1" btree (rightname, objecttype, objectid, principaltype, principalid) CLUSTER
All of the tables have been analyzed and frozen. It looks like a problem with using
a nested loop based on poor estimates. If I disable nested loops, the query only takes
2s and not the 69s with them enabled. Of course, both of those are a far cry from the
0.025s on the old system. I know that the old system is chosing the plan based on
statistics but at least the times were okay. Is there anyway to provide the system
with the statistics to make a better choice on the new system? Here are the EXPLAIN
ANALYZE results for the old system and two for the new system, one with and one
without nested loops:
OLD:
EXPLAIN (ANALYZE, BUFFERS) SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_3 JOIN Principals Principals_1 ON
(Principals_1.id = main.id ) JOIN CachedGroupMembers CachedGroupMembers_2 ON ( CachedGroupMembers_2.MemberId =
Principals_1.id) JOIN CachedGroupMembers CachedGroupMembers_4 ON ( CachedGroupMembers_4.MemberId = Principals_1.id )
WHERE((ACL_3.ObjectType = 'RT::Ticket' AND ACL_3.ObjectId = 950423) OR (ACL_3.ObjectType = 'RT::Queue' AND
ACL_3.ObjectId = 1) OR (ACL_3.ObjectType = 'RT::System' AND ACL_3.ObjectId = 1)) AND (ACL_3.PrincipalId =
CachedGroupMembers_4.GroupId)AND (ACL_3.PrincipalType = 'Group') AND (ACL_3.RightName = 'OwnTicket') AND
(CachedGroupMembers_2.Disabled= '0') AND (CachedGroupMembers_2.GroupId = '4') AND (CachedGroupMembers_4.Disabled = '0')
AND(Principals_1.Disabled = '0') AND (Principals_1.PrincipalType = 'User') AND (Principals_1.id != '1') ORDER BY
main.NameASC;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------
Unique (cost=4190.59..4190.66 rows=1 width=1268) (actual time=18.279..18.864 rows=324 loops=1)
Buffers: shared hit=5389
-> Sort (cost=4190.59..4190.59 rows=1 width=1268) (actual time=18.279..18.354 rows=560 loops=1)
Sort Key: main.name, main.id, main.password, main.comments, main.signature, main.emailaddress,
main.freeformcontactinfo,main.organization, main.realname, main.nickname, main.lang, main.gecos, main.home
phone, main.workphone, main.mobilephone, main.pagerphone, main.address1, main.address2, main.city, main.state,
main.zip,main.country, main.timezone, main.creator, main.created, main.lastupdatedby, main.lastupda
ted, main.authtoken, main.smimecertificate
Sort Method: quicksort Memory: 238kB
Buffers: shared hit=5373
-> Nested Loop (cost=3889.42..4190.58 rows=1 width=1268) (actual time=7.653..15.122 rows=560 loops=1)
Join Filter: (main.id = principals_1.id)
Buffers: shared hit=5329
-> Hash Join (cost=3888.99..4175.39 rows=31 width=1276) (actual time=7.643..9.681 rows=560 loops=1)
Hash Cond: (cachedgroupmembers_4.memberid = main.id)
Buffers: shared hit=3086
-> Nested Loop (cost=0.72..117.66 rows=45103 width=4) (actual time=0.102..1.693 rows=674
loops=1)
Buffers: shared hit=615
-> Index Only Scan using acl1 on acl acl_3 (cost=0.29..53.93 rows=14 width=4) (actual
time=0.054..0.427rows=3 loops=1)
Index Cond: ((rightname = 'OwnTicket'::text) AND (principaltype = 'Group'::text))
Filter: ((((objecttype)::text = 'RT::Ticket'::text) AND (objectid = 950423)) OR
(((objecttype)::text= 'RT::Queue'::text) AND (objectid = 1)) OR (((objecttype)::text = 'RT::Syste
m'::text) AND (objectid = 1)))
Rows Removed by Filter: 487
Heap Fetches: 126
Buffers: shared hit=37
-> Index Only Scan using disgroumem on cachedgroupmembers cachedgroupmembers_4
(cost=0.43..4.51rows=4 width=8) (actual time=0.024..0.382 rows=225 loops=3)
Index Cond: ((groupid = acl_3.principalid) AND (disabled = '0'::smallint))
Heap Fetches: 446
Buffers: shared hit=578
-> Hash (cost=3885.58..3885.58 rows=216 width=1272) (actual time=7.526..7.526 rows=520 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 144kB
Buffers: shared hit=2471
-> Nested Loop (cost=0.85..3885.58 rows=216 width=1272) (actual time=0.041..6.940 rows=520
loops=1)
Buffers: shared hit=2471
-> Index Only Scan using disgroumem on cachedgroupmembers cachedgroupmembers_2
(cost=0.43..23.49rows=553 width=4) (actual time=0.018..1.164 rows=522 loops=1)
Index Cond: ((groupid = 4) AND (disabled = '0'::smallint))
Heap Fetches: 276
Buffers: shared hit=383
-> Index Scan using users_pkey on users main (cost=0.42..6.97 rows=1 width=1268)
(actualtime=0.010..0.010 rows=1 loops=522)
Index Cond: (id = cachedgroupmembers_2.memberid)
Buffers: shared hit=2088
-> Index Scan using principals_pkey on principals principals_1 (cost=0.43..0.48 rows=1 width=4)
(actualtime=0.009..0.009 rows=1 loops=560)
Index Cond: (id = cachedgroupmembers_4.memberid)
Filter: ((id <> 1) AND (disabled = '0'::smallint) AND ((principaltype)::text = 'User'::text))
Buffers: shared hit=2243
Planning time: 5.409 ms
Execution time: 19.080 ms
(42 rows)
NEW:
EXPLAIN (ANALYZE, BUFFERS) SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_3 JOIN Principals Principals_1 ON
(Principals_1.id = main.id ) JOIN CachedGroupMembers CachedGroupMembers_2 ON ( CachedGroupMembers_2.MemberId =
Principals_1.id) JOIN CachedGroupMembers CachedGroupMembers_4 ON ( CachedGroupMembers_4.MemberId = Principals_1.id )
WHERE((ACL_3.ObjectType = 'RT::Ticket' AND ACL_3.ObjectId = 950423) OR (ACL_3.ObjectType = 'RT::Queue' AND
ACL_3.ObjectId = 1) OR (ACL_3.ObjectType = 'RT::System' AND ACL_3.ObjectId = 1)) AND (ACL_3.PrincipalId =
CachedGroupMembers_4.GroupId)AND (ACL_3.PrincipalType = 'Group') AND (ACL_3.RightName = 'OwnTicket') AND
(CachedGroupMembers_2.Disabled= '0') AND (CachedGroupMembers_2.GroupId = '4') AND (CachedGroupMembers_4.Disabled = '0')
AND(Principals_1.Disabled = '0') AND (Principals_1.PrincipalType = 'User') AND (Principals_1.id != '1') ORDER BY
main.NameASC;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------
Unique (cost=1276.71..1276.78 rows=1 width=1298) (actual time=69483.412..69483.990 rows=324 loops=1)
Buffers: shared hit=5327437 dirtied=2
-> Sort (cost=1276.71..1276.71 rows=1 width=1298) (actual time=69483.409..69483.449 rows=560 loops=1)
Sort Key: main.name, main.id, main.password, main.comments, main.signature, main.emailaddress,
main.freeformcontactinfo,main.organization, main.realname, main.nickname, main.lang, main.gecos, main.home
phone, main.workphone, main.mobilephone, main.pagerphone, main.address1, main.address2, main.city, main.state,
main.zip,main.country, main.timezone, main.creator, main.created, main.lastupdatedby, main.lastupda
ted, main.authtoken, main.smimecertificate
Sort Method: quicksort Memory: 238kB
Buffers: shared hit=5327421 dirtied=2
-> Nested Loop (cost=2.00..1276.70 rows=1 width=1298) (actual time=0.458..69480.206 rows=560 loops=1)
Buffers: shared hit=5327405 dirtied=2
-> Nested Loop (cost=1.71..1263.36 rows=2 width=1302) (actual time=0.075..413.525 rows=886318
loops=1)
Buffers: shared hit=9496 dirtied=2
-> Nested Loop (cost=1.28..1262.07 rows=1 width=1306) (actual time=0.053..10.123 rows=519
loops=1)
Buffers: shared hit=4179
-> Nested Loop (cost=0.85..1108.38 rows=208 width=1302) (actual time=0.043..5.135 rows=520
loops=1)
Buffers: shared hit=2099
-> Index Only Scan using disgroumem on cachedgroupmembers cachedgroupmembers_2
(cost=0.43..15.43rows=530 width=4) (actual time=0.020..0.258 rows=522 loops=1)
Index Cond: ((groupid = 4) AND (disabled = '0'::smallint))
Heap Fetches: 7
Buffers: shared hit=13
-> Index Scan using users_pkey on users main (cost=0.42..2.06 rows=1 width=1298)
(actualtime=0.008..0.008 rows=1 loops=522)
Index Cond: (id = cachedgroupmembers_2.memberid)
Buffers: shared hit=2086
-> Index Scan using principals_pkey on principals principals_1 (cost=0.43..0.74 rows=1
width=4)(actual time=0.009..0.009 rows=1 loops=520)
Index Cond: (id = main.id)
Filter: ((id <> 1) AND (disabled = '0'::smallint) AND ((principaltype)::text =
'User'::text))
Rows Removed by Filter: 0
Buffers: shared hit=2080
-> Index Only Scan using cachedgroupmembers4 on cachedgroupmembers cachedgroupmembers_4
(cost=0.43..1.08rows=21 width=8) (actual time=0.010..0.384 rows=1708 loops=519)
Index Cond: ((memberid = principals_1.id) AND (disabled = '0'::smallint))
Heap Fetches: 2309
Buffers: shared hit=5317 dirtied=2
-> Index Only Scan using acl1 on acl acl_3 (cost=0.29..6.66 rows=1 width=4) (actual time=0.078..0.078
rows=0loops=886318)
Index Cond: ((rightname = 'OwnTicket'::text) AND (principaltype = 'Group'::text) AND (principalid
=cachedgroupmembers_4.groupid))
Filter: ((((objecttype)::text = 'RT::Ticket'::text) AND (objectid = 950423)) OR
(((objecttype)::text= 'RT::Queue'::text) AND (objectid = 1)) OR (((objecttype)::text = 'RT::System'::text) AN
D (objectid = 1)))
Rows Removed by Filter: 0
Heap Fetches: 0
Buffers: shared hit=5317909
Planning Time: 3.099 ms
Execution Time: 69484.104 ms
(38 rows)
Time: 69488.511 ms (01:09.489)
NEW (no nested):
EXPLAIN (ANALYZE, BUFFERS) SELECT DISTINCT main.* FROM Users main CROSS JOIN ACL ACL_3 JOIN Principals Principals_1 ON
(Principals_1.id = main.id ) JOIN CachedGroupMembers CachedGroupMembers_2 ON ( CachedGroupMembers_2.MemberId =
Principals_1.id) JOIN CachedGroupMembers CachedGroupMembers_4 ON ( CachedGroupMembers_4.MemberId = Principals_1.id )
WHERE((ACL_3.ObjectType = 'RT::Ticket' AND ACL_3.ObjectId = 950423) OR (ACL_3.ObjectType = 'RT::Queue' AND
ACL_3.ObjectId = 1) OR (ACL_3.ObjectType = 'RT::System' AND ACL_3.ObjectId = 1)) AND (ACL_3.PrincipalId =
CachedGroupMembers_4.GroupId)AND (ACL_3.PrincipalType = 'Group') AND (ACL_3.RightName = 'OwnTicket') AND
(CachedGroupMembers_2.Disabled= '0') AND (CachedGroupMembers_2.GroupId = '4') AND (CachedGroupMembers_4.Disabled = '0')
AND(Principals_1.Disabled = '0') AND (Principals_1.PrincipalType = 'User') AND (Principals_1.id != '1') ORDER BY
main.NameASC;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
Unique (cost=117008.39..117008.47 rows=1 width=1298) (actual time=2334.366..2334.913 rows=324 loops=1)
Buffers: shared hit=66381 dirtied=3
-> Sort (cost=117008.39..117008.39 rows=1 width=1298) (actual time=2334.364..2334.398 rows=560 loops=1)
Sort Key: main.name, main.id, main.password, main.comments, main.signature, main.emailaddress,
main.freeformcontactinfo,main.organization, main.realname, main.nickname, main.lang, main.gecos, main.home
phone, main.workphone, main.mobilephone, main.pagerphone, main.address1, main.address2, main.city, main.state,
main.zip,main.country, main.timezone, main.creator, main.created, main.lastupdatedby, main.lastupda
ted, main.authtoken, main.smimecertificate
Sort Method: quicksort Memory: 238kB
Buffers: shared hit=66365 dirtied=3
-> Hash Join (cost=113207.91..117008.38 rows=1 width=1298) (actual time=1943.567..2331.572 rows=560
loops=1)
Hash Cond: (principals_1.id = cachedgroupmembers_2.memberid)
Buffers: shared hit=66319 dirtied=3
-> Gather (cost=113185.86..116953.73 rows=49 width=1306) (actual time=1903.765..2323.358 rows=564
loops=1)
Workers Planned: 1
Workers Launched: 1
Buffers: shared hit=66306 dirtied=3
-> Hash Join (cost=112185.86..115948.83 rows=29 width=1306) (actual time=1894.728..2274.198
rows=282loops=2)
Hash Cond: (cachedgroupmembers_4.groupid = acl_3.principalid)
Buffers: shared hit=66306 dirtied=3
-> Parallel Hash Join (cost=112165.88..115850.62 rows=3897 width=1310) (actual
time=1879.642..2158.184rows=1294258 loops=2)
Hash Cond: (main.id = principals_1.id)
Buffers: shared hit=66262 dirtied=3
-> Parallel Seq Scan on users main (cost=0.00..3399.00 rows=73600 width=1298)
(actualtime=0.014..8.917 rows=62564 loops=2)
Buffers: shared hit=2663
-> Parallel Hash (cost=111510.39..111510.39 rows=52439 width=12) (actual
time=1878.946..1878.946rows=1294262 loops=2)
Buckets: 4194304 (originally 262144) Batches: 1 (originally 1) Memory Usage:
184960kB
Buffers: shared hit=63599 dirtied=3
-> Parallel Hash Join (cost=44295.31..111510.39 rows=52439 width=12) (actual
time=232.801..1399.686rows=1294262 loops=2)
Hash Cond: (cachedgroupmembers_4.memberid = principals_1.id)
Buffers: shared hit=63599 dirtied=3
-> Parallel Seq Scan on cachedgroupmembers cachedgroupmembers_4
(cost=0.00..62869.68rows=1655392 width=8) (actual time=0.023..557.151 rows=3309488 loops=2)
Filter: (disabled = '0'::smallint)
Rows Removed by Filter: 26
Buffers: shared hit=42177
-> Parallel Hash (cost=43789.21..43789.21 rows=40488 width=4) (actual
time=231.914..231.914rows=61984 loops=2)
Buckets: 131072 Batches: 1 Memory Usage: 5920kB
Buffers: shared hit=21422 dirtied=3
-> Parallel Seq Scan on principals principals_1
(cost=0.00..43789.21rows=40488 width=4) (actual time=0.021..212.001 rows=61984 loops=2)
Filter: ((id <> 1) AND (disabled = '0'::smallint) AND
((principaltype)::text= 'User'::text))
Rows Removed by Filter: 1919263
Buffers: shared hit=21422 dirtied=3
-> Hash (cost=19.80..19.80 rows=14 width=4) (actual time=14.786..14.786 rows=3 loops=2)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=17
-> Bitmap Heap Scan on acl acl_3 (cost=4.41..19.80 rows=14 width=4) (actual
time=14.766..14.778rows=3 loops=2)
Recheck Cond: ((((rightname)::text = 'OwnTicket'::text) AND ((objecttype)::text
='RT::Ticket'::text) AND (objectid = 950423) AND ((principaltype)::text = 'Group'::text)) O
R (((rightname)::text = 'OwnTicket'::text) AND ((objecttype)::text = 'RT::Queue'::text) AND (objectid = 1) AND
((principaltype)::text= 'Group'::text)) OR (((rightname)::text = 'OwnTicket'::text) AND ((objecttyp
e)::text = 'RT::System'::text) AND (objectid = 1) AND ((principaltype)::text = 'Group'::text)))
Heap Blocks: exact=2
Buffers: shared hit=17
-> BitmapOr (cost=4.41..4.41 rows=14 width=0) (actual time=0.072..0.072 rows=0
loops=2)
Buffers: shared hit=13
-> Bitmap Index Scan on acl1 (cost=0.00..1.40 rows=1 width=0) (actual
time=0.044..0.044rows=0 loops=2)
Index Cond: (((rightname)::text = 'OwnTicket'::text) AND
((objecttype)::text= 'RT::Ticket'::text) AND (objectid = 950423) AND ((principaltype)::text = 'Group':
:text))
Buffers: shared hit=5
-> Bitmap Index Scan on acl1 (cost=0.00..1.59 rows=14 width=0) (actual
time=0.016..0.016rows=2 loops=2)
Index Cond: (((rightname)::text = 'OwnTicket'::text) AND
((objecttype)::text= 'RT::Queue'::text) AND (objectid = 1) AND ((principaltype)::text = 'Group'::text)
)
Buffers: shared hit=4
-> Bitmap Index Scan on acl1 (cost=0.00..1.40 rows=1 width=0) (actual
time=0.009..0.010rows=1 loops=2)
Index Cond: (((rightname)::text = 'OwnTicket'::text) AND
((objecttype)::text= 'RT::System'::text) AND (objectid = 1) AND ((principaltype)::text = 'Group'::text
))
Buffers: shared hit=4
-> Hash (cost=15.43..15.43 rows=530 width=4) (actual time=39.769..39.769 rows=522 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 27kB
Buffers: shared hit=13
-> Index Only Scan using disgroumem on cachedgroupmembers cachedgroupmembers_2 (cost=0.43..15.43
rows=530width=4) (actual time=39.504..39.670 rows=522 loops=1)
Index Cond: ((groupid = 4) AND (disabled = '0'::smallint))
Heap Fetches: 7
Buffers: shared hit=13
Planning Time: 5.112 ms
JIT:
Functions: 74
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 10.510 ms, Inlining 0.000 ms, Optimization 2.889 ms, Emission 65.088 ms, Total 78.487 ms
Execution Time: 2383.883 ms
(69 rows)
Time: 2391.552 ms (00:02.392)
Any suggestions? I have a workaround to avoid the problem query but it loses some functionality.
Regards,
Ken