BUG #17743: Bad RIP VALUE - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #17743: Bad RIP VALUE
Msg-id 17743-7a91f7932af59965@postgresql.org
Responses Re: BUG #17743: Bad RIP VALUE  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17743
Logged by:          Giulio Ferrari
Email address:      giulio.ferrari@cambieri.it
PostgreSQL version: 12.9
Operating system:   Ubuntu 20.04.4 LTS
Description:

On the installation described above, I am seeing this error in the kernel log:

[Wed Jan  4 13:53:20 2023] INFO: task postgres:1796 blocked for more than 241 seconds.
[Wed Jan  4 13:53:20 2023]       Tainted: G           OE 5.4.0-107-generic #121-Ubuntu
[Wed Jan  4 13:53:20 2023] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jan  4 13:53:20 2023] postgres        D    0  1796    787 0x00004000
[Wed Jan  4 13:53:20 2023] Call Trace:
[Wed Jan  4 13:53:20 2023]  __schedule+0x2e3/0x740
[Wed Jan  4 13:53:20 2023]  ? __wake_up_common_lock+0x8a/0xc0
[Wed Jan  4 13:53:20 2023]  schedule+0x42/0xb0
[Wed Jan  4 13:53:20 2023]  jbd2_log_wait_commit+0xaf/0x120
[Wed Jan  4 13:53:20 2023]  ? __wake_up_pollfree+0x40/0x40
[Wed Jan  4 13:53:20 2023]  jbd2_complete_transaction+0x5c/0x90
[Wed Jan  4 13:53:20 2023]  ext4_sync_file+0x358/0x3b0
[Wed Jan  4 13:53:20 2023]  vfs_fsync_range+0x49/0x80
[Wed Jan  4 13:53:20 2023]  do_fsync+0x3d/0x70
[Wed Jan  4 13:53:20 2023]  __x64_sys_fsync+0x14/0x20
[Wed Jan  4 13:53:20 2023]  do_syscall_64+0x57/0x190
[Wed Jan  4 13:53:20 2023]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Wed Jan  4 13:53:20 2023] RIP: 0033:0x7f22551c3917
[Wed Jan  4 13:53:20 2023] Code: Bad RIP value.
[Wed Jan  4 13:53:20 2023] RSP: 002b:00007fff0998f278 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[Wed Jan  4 13:53:20 2023] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f22551c3917
[Wed Jan  4 13:53:20 2023] RDX: 0000000000000002 RSI: 000055a0a97af640 RDI: 000000000000015b
[Wed Jan  4 13:53:20 2023] RBP: 00007fff0998f2c0 R08: 0000000000000001 R09: 0000000000000001
[Wed Jan  4 13:53:20 2023] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000015b
[Wed Jan  4 13:53:20 2023] R13: 000055a0a97af640 R14: 0000000000000000 R15: 0000000000000016
[Wed Jan  4 23:10:02 2023] session_init(service_process,507875): OK. kdev=8:5, bs=4096.
[Wed Jan  4 23:10:02 2023] register_make_request(service_process,507875): OK. kdev=8:5, mq=0.
[Thu Jan  5 23:10:26 2023] session_init(service_process,2838404): OK. kdev=8:5, bs=4096.
[Thu Jan  5 23:10:26 2023] register_make_request(service_process,2838404): OK. kdev=8:5, mq=0.
[Fri Jan  6 23:09:59 2023] session_init(service_process,649733): OK. kdev=8:5, bs=4096.
[Fri Jan  6 23:09:59 2023] register_make_request(service_process,649733): OK. kdev=8:5, mq=0.

We have no way to reproduce the error consistently. After it happens, the
database stops responding to connection requests and even closes the
existing connections.

After 15-20 minutes the problem clears up and the database works again on
its own.
Is there anything we can do to solve this?
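
The call trace above shows the backend blocked inside fsync() (ORIG_RAX 0x4a is syscall 74, fsync, on x86-64) while waiting for an ext4 journal commit. A minimal sketch for checking whether fsync latency on that filesystem is unusually high, assuming Python 3 is available on the host; the function name time_fsync and the fsync_probe_ file prefix are illustrative and not part of PostgreSQL:

```python
import os
import tempfile
import time


def time_fsync(directory, writes=5, payload=b"x" * 8192):
    """Return the duration in seconds of each fsync() on a scratch file in `directory`."""
    durations = []
    # Create a throwaway probe file on the filesystem under test.
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync_probe_")
    try:
        for _ in range(writes):
            os.write(fd, payload)
            start = time.monotonic()
            os.fsync(fd)  # the same system call the blocked backend was waiting in
            durations.append(time.monotonic() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return durations


if __name__ == "__main__":
    # Assumption: replace this with a directory on the same filesystem
    # as the PostgreSQL data directory; the temp dir is only a placeholder.
    target = tempfile.gettempdir()
    for i, seconds in enumerate(time_fsync(target)):
        print(f"fsync {i}: {seconds * 1000:.1f} ms")
```

Running this periodically and logging the results could show whether the stalls coincide with slow storage rather than with PostgreSQL itself. PostgreSQL also ships pg_test_fsync, which measures sync performance more thoroughly.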

