Thread: BUG #15187: When use huge page, there may be a lot of hanged connections with status startup or authentication
From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15187
Logged by:          Chen huajun
Email address:      chjischj@163.com
PostgreSQL version: 10.2
Operating system:   CentOS release 6.5

Description:

◆ Phenomenon
When huge pages are in use, a lot of connections may hang in the "startup"
or "authentication" state when multiple concurrent query operations are
performed.

◆ Environment
hardware:   Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz, 16C / 128G / 300GB SSD
OS:         CentOS release 6.5 x64
PostgreSQL: 10.2

◆ How to reproduce
1) Set huge_pages to on in postgresql.conf.
2) Run the following script:

    count=0
    while [ $count -le 2000 ]; do
        psql -c 'select count(*) from test20180502;' &
        count=$((count + 1))
    done

Table definition:

    CREATE TABLE test20180502 (
        c1 int not null,
        c2 character(10000) DEFAULT 'AAAAA',
        PRIMARY KEY(c1)
    );

◆ Investigation of the cause
1. A large number of stuck connections appear in the startup,
authentication, or SELECT state:

    # ps -eaf | grep postgres | grep SELECT | wc
          2      24     204
    # ps -eaf | grep postgres | grep authen | wc
        410    4920   45097
    # ps -eaf | grep postgres | grep startup | wc
         78     936    8034

2. From the stack traces (*), PGSemaphoreLock stays in a waiting state,
apparently because of incorrect semaphore state, while performing the
following operation:

    errStatus = sem_wait(PG_SEM_REF(sema));

3. Looking at the system view pg_stat_activity, the wait_event_type and
wait_event of some query operations are in the following state:
postgres=# select pid,wait_event_type,wait_event,state,backend_xid,query from pg_stat_activity;
 pid  | wait_event_type |     wait_event      | state  | backend_xid |                   query
------+-----------------+---------------------+--------+-------------+--------------------------------------------
 6333 | Activity        | AutoVacuumMain      |        |             |
 6337 | Activity        | LogicalLauncherMain |        |             |
 6408 | LWLock          | buffer_mapping      | active |             | select count(*) from test20180502;
 6414 | LWLock          | buffer_mapping      | active |             | select count(*) from test20180502;
 7509 |                 |                     | active |             | select pid,wait_event_type,wait_event,state,backend_xid,query from pg_stat_activity;
 6331 | Activity        | BgWriterMain        |        |             |
 6330 | Activity        | CheckpointerMain    |        |             |
 6332 | Activity        | WalWriterMain       |        |             |

postgres=# select locktype,relation::regclass as table,a.pid,mode,granted,substring(query,1,50)
           from pg_locks a, pg_stat_activity b where a.pid = b.pid;
  locktype  |       table       | pid  |      mode       | granted |             substring
------------+-------------------+------+-----------------+---------+------------------------------------
 relation   | test20180502_pkey | 6408 | AccessShareLock | t       | select count(*) from test20180502;
 relation   | test20180502      | 6408 | AccessShareLock | t       | select count(*) from test20180502;
 virtualxid |                   | 6408 | ExclusiveLock   | t       | select count(*) from test20180502;
 relation   | test20180502_pkey | 6414 | AccessShareLock | t       | select count(*) from test20180502;
 relation   | test20180502      | 6414 | AccessShareLock | t       | select count(*) from test20180502;
 virtualxid |                   | 6414 | ExclusiveLock   | t       | select count(*) from test20180502;

* When there is a large number of stuck connections in the startup,
authentication, or SELECT state, the stack traces are as follows:

◇ authentication state
#0  0x000000321dc0d930 in sem_wait () from /lib64/libpthread.so.0
#1  0x00000000006a5240 in PGSemaphoreLock ()
#2  0x00000000007152e5 in LWLockAcquire ()
#3  0x000000000070fc8f in LockRelease ()
#4  0x000000000070c323 in UnlockRelationId ()
#5  0x00000000004ad2d2 in relation_close ()
#6  0x000000000080ed21 in CatalogCacheInitializeCache ()
#7  0x0000000000810ee0 in SearchCatCache ()
#8  0x000000000081dc08 in GetSysCacheOid ()
#9  0x00000000007391a7 in get_role_oid ()
#10 0x000000000062d102 in hba_getauthmethod ()
#11 0x000000000062b729 in ClientAuthentication ()
#12 0x0000000000833fb3 in InitPostgres ()
#13 0x0000000000724b52 in PostgresMain ()
#14 0x00000000006b8c8a in PostmasterMain ()
#15 0x00000000006395e0 in main ()

◇ startup state
#0  0x000000321dc0d930 in sem_wait () from /lib64/libpthread.so.0
#1  0x00000000006a5240 in PGSemaphoreLock ()
#2  0x00000000007152e5 in LWLockAcquire ()
#3  0x0000000000710508 in LockAcquireExtended ()
#4  0x000000000070cce8 in LockRelationOid ()
#5  0x00000000004ad405 in relation_open ()
#6  0x00000000004ad436 in heap_open ()
#7  0x000000000080ecce in CatalogCacheInitializeCache ()
#8  0x0000000000810ee0 in SearchCatCache ()
#9  0x0000000000845efd in superuser_arg ()
#10 0x0000000000833b17 in InitPostgres ()
#11 0x0000000000724b52 in PostgresMain ()
#12 0x00000000006b8c8a in PostmasterMain ()
#13 0x00000000006395e0 in main ()

◇ SELECT state
#0  0x000000321dc0d930 in sem_wait () from /lib64/libpthread.so.0
#1  0x00000000006a5240 in PGSemaphoreLock ()
#2  0x00000000007152e5 in LWLockAcquire ()
#3  0x0000000000710508 in LockAcquireExtended ()
#4  0x000000000070cce8 in LockRelationOid ()
#5  0x00000000004ad405 in relation_open ()
#6  0x00000000004ad436 in heap_open ()
#7  0x000000000080ecce in CatalogCacheInitializeCache ()
#8  0x0000000000810ee0 in SearchCatCache ()
#9  0x000000000081d4e8 in get_tablespace ()
#10 0x000000000081d5c1 in get_tablespace_page_costs ()
#11 0x000000000066ba15 in cost_seqscan ()
#12 0x000000000069cc23 in create_seqscan_path ()
#13 0x0000000000665423 in set_rel_pathlist ()
#14 0x0000000000666716 in make_one_rel ()
#15 0x0000000000682e91 in query_planner ()
#16 0x0000000000687055 in grouping_planner ()
#17 0x00000000006887c9 in subquery_planner ()
#18 0x00000000006895bd in standard_planner ()
#19 0x00000000007238ca in pg_plan_query ()
#20 0x000000000072399e in pg_plan_queries ()
#21 0x000000000072406a in exec_simple_query ()
#22 0x0000000000724f09 in PostgresMain ()
#23 0x00000000006b8c8a in PostmasterMain ()
#24 0x00000000006395e0 in main ()

◆ Additional occurrence conditions

a) Depends on the number of table records. The problem does not happen when
the table contains few records:

    tablename      record    result
    test20180502     10000   not happen
                     50000   not happen
                     57500   happen
                     65000   happen
                     80000   happen
                    100000   happen
                   1000000   happen
    test20180503     10000   not happen
                     50000   not happen
                     57500   not happen
                     65000   not happen
                     80000   not happen
                    100000   happen
                   1000000   happen

Table definitions:

    CREATE TABLE test20180502 (
        c1 int not null,
        c2 character(10000) DEFAULT 'AAAAA',
        PRIMARY KEY(c1)
    );

    CREATE TABLE test20180503 (
        c1 int not null,
        PRIMARY KEY(c1)
    );

b) Depends on the OS:
    CentOS release 6.5 (Final), 2.6.32-431.el6.x86_64:           happen
    CentOS Linux release 7.3.1611 (Core), 3.10.0-514.el7.x86_64: not happen

c) Depends on huge pages:
    huge_pages=on:  happen (no matter whether transparent_hugepage is [always] or [never])
    huge_pages=off: not happen

d) Depends on the PostgreSQL version:
    PostgreSQL 10.2: happen
    PostgreSQL 9.6:  not happen
Re: BUG #15187: When use huge page, there may be a lot of hanged connections with status startup or authentication
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> Operating system: CentOS release 6.5

> When use huge page, there may be a lot of hanged connections with status
> startup or authentication state when multiple concurrent query operations
> are performed.

So don't do that.  RHEL6 kernels are pretty ancient at this point,
and are known to have performance issues with huge pages.

If you can replicate the problem on a modern kernel, somebody might be
motivated to look into it; but not on that platform.  In any case it's
very unlikely that this is a Postgres problem rather than a kernel
problem.

			regards, tom lane
Re: BUG #15187: When use huge page, there may be a lot of hanged connections with status startup or authentication
From
Andres Freund
Date:
On 2018-05-04 12:46:40 -0400, Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
> > Operating system: CentOS release 6.5
>
> > When use huge page, there may be a lot of hanged connections with status
> > startup or authentication state when multiple concurrent query operations
> > are performed.
>
> So don't do that.  RHEL6 kernels are pretty ancient at this point,
> and are known to have performance issues with huge pages.

They're known to have perf issues with *transparent* huge pages, not
huge pages, right?

Chen, have you disabled transparent hugepages and zone reclaim?

Greetings,

Andres Freund
Re:Re: BUG #15187: When use huge page, there may be a lot of hanged connections with status startup or authentication
From
chenhj
Date:
At 2018-05-04 23:54:34, "Andres Freund" <andres@anarazel.de> wrote:
> On 2018-05-04 12:46:40 -0400, Tom Lane wrote:
>> PG Bug reporting form <noreply@postgresql.org> writes:
>>> Operating system: CentOS release 6.5
>>
>>> When use huge page, there may be a lot of hanged connections with status
>>> startup or authentication state when multiple concurrent query operations
>>> are performed.
>>
>> So don't do that.  RHEL6 kernels are pretty ancient at this point,
>> and are known to have performance issues with huge pages.
>
> They're known to have perf issues with *transparent* huge pages, not
> huge pages, right?
>
> Chen, have you disabled transparent hugepages and zone reclaim?
>
> Greetings,
>
> Andres Freund

As reported under "c) Depend on huge page":
> huge_pages=on, happen (no matter whether transparent_hugepage is [always] or [never])
> huge_pages=off, not happen

So the problem also occurs when transparent hugepages are disabled.
About zone reclaim, I will check it later.

What puzzles me is that this problem does not occur on PostgreSQL 9.6.2
(I tested 10.2 and 9.6.2 on the same machine):

> d) Depend on PostgreSQL Version
> PostgreSQL 10.2 happen
> PostgreSQL 9.6 not happen

Chen Huajun
Re:Re:Re: BUG #15187: When use huge page, there may be a lot of hanged connections with status startup or authentication
From
chenhj
Date:
>> Chen, have you disabled transparent hugepages and zone reclaim?
>>
>> Greetings,
>>
>> Andres Freund
>
> So the problem also occurs when transparent hugepages are disabled.
> About zone reclaim, I will check it later.
> What puzzles me is that this problem does not occur on PostgreSQL 9.6.2
> (I tested 10.2 and 9.6.2 on the same machine).

The problem occurs whether vm.zone_reclaim_mode is set to 0 or 1.

In addition, what needs to be corrected is that even huge_pages=off is
problematic. With huge_pages=on, SQL execution is very slow, and there are
hanged connections in the startup and authentication states. This is the
stack while the SQL is executing:

$ pstack 31175
#0  0x00000036d76e15c3 in __select_nocancel () from /lib64/libc.so.6
#1  0x0000000000862cca in pg_usleep ()
#2  0x0000000000715b19 in perform_spin_delay ()
#3  0x0000000000715c18 in s_lock ()
#4  0x000000000085b157 in GetOldSnapshotThresholdTimestamp ()
#5  0x00000000006f577e in TestForOldSnapshot_impl ()
#6  0x00000000004b4b2d in heap_getnext ()
#7  0x0000000000616e71 in SeqNext ()
#8  0x00000000005fc677 in ExecScan ()
#9  0x000000000060454e in fetch_input_tuple ()
#10 0x00000000006062e9 in ExecAgg ()
#11 0x00000000005f9a9f in standard_ExecutorRun ()
#12 0x00007f8fcff5d6f8 in pgss_ExecutorRun () from /usr/pgsql-10/lib/pg_stat_statements.so
#13 0x000000000072673b in PortalRunSelect ()
#14 0x00000000007278d1 in PortalRun ()
#15 0x0000000000723f51 in exec_simple_query ()
#16 0x0000000000724f09 in PostgresMain ()
#17 0x00000000006b8c8a in PostmasterMain ()
#18 0x00000000006395e0 in main ()

This is the execution time of the SQL:

postgres=# select * from pg_stat_statements where query ~ 'test20180503';
-[ RECORD 1 ]-------+----------------------------------
userid              | 16388
dbid                | 13451
queryid             | 1902655528
query               | select count(*) from test20180503
calls               | 608
total_time          | 217341245.157586992
min_time            | 291100.481972999987
max_time            | 378225.054139000014
mean_time           | 357469.153219715168
stddev_time         | 24984.333161207207
rows                | 608
shared_blks_hit     | 2689946
shared_blks_read    | 2278
shared_blks_dirtied | 4425
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 33.5047190000000015
blk_write_time      | 0

With huge_pages=off there are no connections hanged in the startup or
authentication states, but SQL execution is also very slow (several
hundred seconds), during which the CPU utilization is 100%. This is the
SQL execution time with huge_pages=off:

postgres=# select * from pg_stat_statements where query ~ 'test20180503';
-[ RECORD 1 ]-------+----------------------------------
userid              | 16388
dbid                | 13451
queryid             | 1902655528
query               | select count(*) from test20180503
calls               | 1001
total_time          | 650132764.661145091
min_time            | 600196.529144000029
max_time            | 672296.767421000055
mean_time           | 649483.281379765598
stddev_time         | 19423.3311969555834
rows                | 1001
shared_blks_hit     | 4432428
shared_blks_read    | 0
shared_blks_dirtied | 4425
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 0
blk_write_time      | 0

Meanwhile, on PostgreSQL 9.6 @ CentOS 6.5 and PostgreSQL 10.2 @ CentOS 7.3,
executing the SQL the same way takes only about 2 seconds:

postgres=# select * from pg_stat_statements where query ~ 'test20180503';
-[ RECORD 1 ]-------+-----------------------------------
userid              | 16389
dbid                | 13269
queryid             | 3703798644
query               | select count(*) from test20180503;
calls               | 1001
total_time          | 2751692.53700000001
min_time            | 1108.32999999999993
max_time            | 4031.02199999999993
mean_time           | 2748.94359340659184
stddev_time         | 642.984323244542566
rows                | 1001
shared_blks_hit     | 4428003
shared_blks_read    | 4425
shared_blks_dirtied | 0
shared_blks_written | 0
local_blks_hit      | 0
local_blks_read     | 0
local_blks_dirtied  | 0
local_blks_written  | 0
temp_blks_read      | 0
temp_blks_written   | 0
blk_read_time       | 90.8639999999999759
blk_write_time      | 0

Regards,
Chen Huajun
Re: BUG #15187: When use huge page, there may be a lot of hanged connections with status startup or authentication
From
Andres Freund
Date:
On 2018-05-06 23:45:17 +0800, chenhj wrote:
> The problem occurs whether vm.zone_reclaim_mode is set to 0 or 1.
>
> In addition, what needs to be corrected is that even huge_pages=off is
> problematic.
>
> With huge_pages=on, SQL execution is very slow, and there are hanged
> connections in the startup and authentication states.

You'd probably need to provide a few perf profiles to get further
insight.

Greetings,

Andres Freund
Re:Re: BUG #15187: When use huge page, there may be a lot of hanged connections with status startup or authentication
From
chenhj
Date:
At 2018-05-07 02:57:12, "Andres Freund" <andres@anarazel.de> wrote:
> You'd probably need to provide a few perf profiles to get further
> insight.
>
> Greetings,
>
> Andres Freund

According to testing, this problem is related to commit
ecb0d20a9d2e09b7112d3b192047f711f9ff7e59, which switched Linux from using
SysV semaphores to using POSIX semaphores:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=ecb0d20a9d2e09b7112d3b192047f711f9ff7e59

    Use unnamed POSIX semaphores, if available, on Linux and FreeBSD.

    We've had support for using unnamed POSIX semaphores instead of System V
    semaphores for quite some time, but it was not used by default on any
    platform.  Since many systems have rather small limits on the number of
    SysV semaphores allowed, it seems desirable to switch to POSIX semaphores
    where they're available and don't create performance or kernel resource
    problems.  Experimentation by me shows that unnamed POSIX semaphores
    are at least as good as SysV semaphores on Linux, and we previously had
    a report from Maksym Sobolyev that FreeBSD is significantly worse with
    SysV semaphores than POSIX ones.  So adjust those two platforms to use
    unnamed POSIX semaphores, if configure can find the necessary library
    functions.  If this goes well, we may switch other platforms as well,
    but it would be advisable to test them individually first.

    It's not currently contemplated that we'd encourage users to select
    a semaphore API for themselves, but anyone who wants to experiment
    can add PREFERRED_SEMAPHORES=UNNAMED_POSIX (or NAMED_POSIX, or SYSV)
    to their configure command line to do so.  I also tweaked configure
    to report which API it's selected, mainly so that we can tell that
    from buildfarm reports.

    I did not touch the user documentation's discussion about semaphores;
    that will need some adjustment once the dust settles.

    Discussion: <8536.1475704230@sss.pgh.pa.us>

This is why the problem does not occur on 9.6.2 but does occur on 10.2.
As to why, perhaps this is a bug in the Linux kernel; however, it is not
clear in which version of the Linux kernel this was "fixed". The problem
still occurs after upgrading the CentOS 6.5 kernel from 2.6.32-431 to
2.6.32-504.23.4. To avoid this problem, the only way may be to upgrade
CentOS to a higher version (such as CentOS 7.3).

Regards,
Chen Huajun
Re:Re:Re: BUG #15187: When use huge page, there may be a lot of hanged connections with status startup or authentication
From
chenhj
Date:
At 2018-05-15 06:16:39, "chenhj" <chjischj@163.com> wrote:
> According to testing, this problem is related to commit
> ecb0d20a9d2e09b7112d3b192047f711f9ff7e59, which switched Linux from using
> SysV semaphores to using POSIX semaphores.
> [...]
> As to why, perhaps this is a bug in the Linux kernel; however, it is not
> clear in which version of the Linux kernel this was "fixed".

We have confirmed this to be a known Linux kernel bug, fixed by the
following commit. Thanks for all the help.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?id=13d60f4b6ab5b702dc8d2ee20999f98a93728aec
futex: Take hugepages into account when generating futex_key
Regards,
Chen Huajun