Chapter 34. Troubleshooting

When the crash_info configuration parameter is enabled, Postgres Pro can dump the state of a backend process, which is useful for diagnostics and debugging. To set it up, follow these steps:

  • On each cluster node, create a directory that the operating system user running Postgres Pro Shardman (usually postgres) can access. Error reports will be written to this directory.

    install -d -o postgres -g postgres -m 700  /var/lib/postgresql/crashinfo
    

  • Set the crash_info_location value.

    Note

    This will cause the DBMS to restart.

    shardmanctl --store-endpoints http://etcdserver:2379 set -y  crash_info_location=/var/lib/postgresql/crashinfo
    

  • To verify that the changes have been applied, send a signal that makes a backend fail, which creates a core dump and restarts the instance.

    Note

    Do this in a test environment only.

Connect to your DBMS and find the PID of the backend serving the current session:

postgres=# select pg_backend_pid();
 pg_backend_pid
----------------
          23770

Then send the SIGSEGV signal to the process with that PID:

kill -11 23770

This crashes the backend, and a log file with the time, backtrace, and cause of the error is written to /var/lib/postgresql/crashinfo:

# Signal
Program received signal: 11 (SIGSEGV)
Signal    UTC date time: 25.10.2024 08:37:02


# Program
                        pid: 23770
                        ppid: 17506
    program_invocation_name: postgres: postgres postgres 10.42.42.10(34202) idle
program_invocation_short_name: tgres 10.42.42.10(34202) idle
                    exe_path: /opt/pgpro/sdm-17/bin/postgres
                        exe: postgres

# Backtrace
1   postgres + 0x5b55c0              0x55c5ba8459b7  0x00007ffcbef19070  bt_crash_handler + 0x3f7
2   libc.so.6 + 0x4251f              0x7f01c2caa520  0x00007ffcbef19140  __sigaction + 0x50
unknown  ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
3   libc.so.6 + 0x125f80             0x7f01c2d8df9a  0x00007ffcbef195b8  epoll_wait + 0x1a
epoll_wait  ../sysdeps/unix/sysv/linux/epoll_wait.c:30
4   postgres + 0x433870              0x55c5ba6c39bb  0x00007ffcbef195c0  WaitEventSetWait + 0x14b
5   postgres + 0x320de0              0x55c5ba5b0e74  0x00007ffcbef19650  secure_read + 0x94
6   postgres + 0x327d20              0x55c5ba5b7dae  0x00007ffcbef196a0  pq_recvbuf + 0x8e
7   postgres + 0x328980              0x55c5ba5b8995  0x00007ffcbef196c0  pq_getbyte + 0x15
8   postgres + 0x457da0              0x55c5ba6e909c  0x00007ffcbef196d0  PostgresMain + 0x12fc
9   postgres + 0x3ce210              0x55c5ba65ef86  0x00007ffcbef19a60  ServerLoop + 0xd76
10  postgres + 0x3cf240              0x55c5ba65fe18  0x00007ffcbef1a040  PostmasterMain + 0xbd8
11  postgres + 0x14ecc0              0x55c5ba3df182  0x00007ffcbef1a0c0  main + 0x4c2
12  libc.so.6 + 0x29d10              0x7f01c2c91d90  0x00007ffcbef1a0f0  __libc_init_first + 0x90
__libc_start_call_main  ../sysdeps/nptl/libc_start_call_main.h:58
13  libc.so.6 + 0x29dc0              0x7f01c2c91e40  0x00007ffcbef1a190  __libc_start_main + 0x80
call_init  ../csu/libc-start.c:128
__libc_start_main_impl  ../csu/libc-start.c:379
14  postgres + 0x14f200              0x55c5ba3df225  0x00007ffcbef1a1e0  _start + 0x25
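
To confirm from the shell that a report was actually written, a small helper can pick the most recently modified file in the report directory and print the signal line from it. This is only a sketch: the function names latest_crash_report and crash_signal are hypothetical, and it assumes that reports are regular files placed directly in the directory and that they contain a "Program received signal" line as in the sample above. It relies on the -nt file test, which bash and dash provide.

```shell
# Print the path of the most recently modified regular file in a directory.
# Assumption: crash reports are plain files written directly into it.
latest_crash_report() {
    dir=$1
    newest=""
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    # Fail quietly when the directory holds no regular files.
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Print the first "Program received signal" line of a report file.
crash_signal() {
    grep -m1 '^Program received signal' "$1"
}
```

For example, after sourcing these functions, running report=$(latest_crash_report /var/lib/postgresql/crashinfo) && crash_signal "$report" prints the signal line of the newest report, if one exists.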

The state dump file can be generated in one of the following ways:

  • By sending signal 40 (also known as the diagnostic dump signal) to the backend:

    kill -40 backend_pid
    
  • Using the pg_diagdump() function:

    SELECT pg_diagdump(backend_pid);
    

Here backend_pid is the process ID of the backend process to dump.

As a result, Postgres Pro writes the state dump to a file in the $PGDATA/crash_info directory by default, or in the directory specified by the crash_info_location configuration parameter. The file name follows the pattern crash_file_id_pidpid.state, where file_id and pid are placeholders, for example crash_1722943138419104_pid23111.state. Use the crash_info_dump configuration parameter to choose which data sources provide data for a crash dump.
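
When many state dumps accumulate, it can help to split a file name back into its parts. The sketch below reads the naming pattern as crash_, then a file ID, then _pid, then the PID, then .state, which matches the example file name crash_1722943138419104_pid23111.state used in this chapter. The function name parse_state_dump_name is hypothetical, and treating the file ID as purely numeric is an assumption based on that single example.

```shell
# Split a state dump file name into "file_id pid".
# Assumption: both parts consist of digits only.
parse_state_dump_name() {
    name=${1##*/}                 # strip any leading directory
    case "$name" in
        crash_*_pid*.state) ;;
        *) return 1 ;;
    esac
    rest=${name#crash_}           # file_id, "_pid", pid, ".state"
    file_id=${rest%%_pid*}        # everything before the first "_pid"
    pid=${rest##*_pid}            # everything after the last "_pid"
    pid=${pid%.state}
    # Reject empty or non-numeric parts.
    for part in "$file_id" "$pid"; do
        case "$part" in
            ''|*[!0-9]*) return 1 ;;
        esac
    done
    printf '%s %s\n' "$file_id" "$pid"
}
```

Calling parse_state_dump_name crash_1722943138419104_pid23111.state prints 1722943138419104 23111. Names that do not fit the pattern make the function return a non-zero status.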

The following example shows how to generate and inspect the state dump file for the backend with PID 23111:

-- Generate the state dump file
SELECT pg_diagdump(23111);

-- List the contents of the crash_info directory
SELECT pg_ls_dir('crash_info');

-- Read the contents of the state dump file
SELECT pg_read_file('crash_info/crash_1722943138419104_pid23111.state');
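
If you have shell access to the node rather than an SQL session, the same files can be found on disk. The following sketch lists the state dumps for a single backend PID. The function name state_dumps_for_pid is hypothetical, and the fallback directory assumes that PGDATA is set in the environment and that the default $PGDATA/crash_info location is in use.

```shell
# List state dump files for one backend PID, one path per line.
# The second argument is the dump directory; it falls back to the
# default location $PGDATA/crash_info (assumes PGDATA is set).
state_dumps_for_pid() {
    pid=$1
    dir=${2:-"$PGDATA/crash_info"}
    for f in "$dir"/crash_*_pid"$pid".state; do
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    return 0
}
```

For example, state_dumps_for_pid 23111 /var/lib/postgresql/crashinfo prints every dump written for that backend, and prints nothing if there are none.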