Another update and fun fact. After a few days of database servers being idle, I no longer see these errors.
For the sake of additional testing, I rebooted one of the machines in the cluster and immediately started seeing errors again.
This looks more like an OS issue than a problem in PostgreSQL itself, but I have no idea where to look. The weird thing is that the fault always occurs in libkrb5.so.3.3.
I'm completely lost here.
I can confirm that the underlying hardware (on the hypervisor) is OK.
Any additional ideas would be helpful. Thanks,
Ivan Kalafatić
On Tue, 30 Jan 2024 at 21:22, Ivan Kalafatić <ikalafat@gmail.com> wrote:
Additional info: in the meantime, I've tried adding require_auth=none and gssencmode=disable connection string parameters (separately, not together at once) for the repmgr/client connections that are "trusted" through pg_hba.conf, but the problem persisted.
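For reference, a minimal sketch of how the two connection strings described above could be assembled, one parameter at a time. The host name and the build_conninfo helper are hypothetical and used only for illustration; gssencmode and require_auth are the libpq parameters named above.

```python
def build_conninfo(params):
    """Join key/value pairs into a libpq keyword/value connection string."""
    parts = []
    for key, value in params.items():
        value = str(value)
        # libpq wants single quotes around empty values and values with spaces;
        # backslashes and single quotes inside a value are backslash-escaped.
        if value == "" or any(c in value for c in " '\\"):
            value = "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
        parts.append(f"{key}={value}")
    return " ".join(parts)


# Hypothetical base settings; only the last parameter differs between variants.
BASE = {"host": "node1.example.internal", "user": "repmgr", "dbname": "repmgr"}

# Variant 1: disable GSSAPI encryption negotiation.
conninfo_gssenc = build_conninfo({**BASE, "gssencmode": "disable"})

# Variant 2: require that the server performs no authentication exchange.
conninfo_noauth = build_conninfo({**BASE, "require_auth": "none"})

print(conninfo_gssenc)
print(conninfo_noauth)
```

Either string can then be passed to psql or used as the conninfo value in the repmgr configuration.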
I am intermittently getting the following errors (taken from /var/log/syslog) on pg_basebackup, repmgrd and psql (so far):
repmgr[1147164] general protection fault ip:7fb9950981b5 sp:7ffecf4418f8 error:0 in libkrb5.so.3.3[7fb99506a000+5f000]
also this:
pg_basebackup[4191295] general protection fault ip:7f344e9771b5 sp:7ffe8c625c28 error:0 in libkrb5.so.3.3[7f344e949000+5f000]
and this
psql[4045720] general protection fault ip:7f89130931b5 sp:7fffd40a0268 error:0 in libkrb5.so.3.3[7f8913065000+5f000]
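One thing that can be read off the three fault lines above: subtracting the mapping base (the address inside the square brackets) from the instruction pointer gives the same offset, 0x2e1b5, in all three processes, which suggests they all fault at the same instruction in libkrb5. A minimal sketch of that arithmetic follows. The fault_offset helper is hypothetical, and the offset is relative to the start of the mapping reported by the kernel, which may not be the same as an offset into the library file.

```python
import re

# Kernel fault lines copied verbatim from the syslog excerpts above.
LOG_LINES = [
    "repmgr[1147164] general protection fault ip:7fb9950981b5 sp:7ffecf4418f8 error:0 in libkrb5.so.3.3[7fb99506a000+5f000]",
    "pg_basebackup[4191295] general protection fault ip:7f344e9771b5 sp:7ffe8c625c28 error:0 in libkrb5.so.3.3[7f344e949000+5f000]",
    "psql[4045720] general protection fault ip:7f89130931b5 sp:7fffd40a0268 error:0 in libkrb5.so.3.3[7f8913065000+5f000]",
]

FAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\].*?ip:(?P<ip>[0-9a-f]+)"
    r".*?in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (program, library, ip - mapping base) for one fault line."""
    m = FAULT_RE.search(line)
    if m is None:
        raise ValueError(f"not a recognised fault line: {line!r}")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return m["prog"], m["lib"], ip - base


for line in LOG_LINES:
    prog, lib, offset = fault_offset(line)
    print(f"{prog}: {lib}+{offset:#x}")
```

An identical offset across different programs and PIDs points at one code path in the library rather than random memory corruption, although it does not say why that path faults.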
It looks like the error is related to GSSAPI/Kerberos auth? I have tried replacing the "host" type in pg_hba.conf with "hostnogssenc", but the problem persists.
Running Debian 12 with all updates, with the repmgr and timescaledb extensions (fresh install). PostgreSQL 16.1 (Debian 16.1-1.pgdg120+1). It looks like the server itself is not affected (it isn't crashing, etc.).
Is there anything that I could do to disable GSSAPI/Kerberos and/or circumvent this issue? Also, how can I provide more info to help get this issue resolved?