Re: "PANIC: could not open critical system index 2662" - twice - Mailing list pgsql-general
From | Evgeny Morozov |
---|---|
Subject | Re: "PANIC: could not open critical system index 2662" - twice |
Date | |
Msg-id | 01020187f6fa8f05-d1bd9975-48ec-4d8d-9ab7-75478400100d-000000@eu-west-1.amazonses.com |
In response to | Re: "PANIC: could not open critical system index 2662" - twice (Thomas Munro <thomas.munro@gmail.com>) |
Responses | Re: "PANIC: could not open critical system index 2662" - twice; Re: "PANIC: could not open critical system index 2662" - twice |
List | pgsql-general |
On 6/05/2023 11:13 pm, Thomas Munro wrote:
> Did you previously run this same workload on versions < 15 and never
> see any problem?

Yes, kind of. We have a test suite that creates one test DB and runs a bunch of tests on it. Two of these tests, however, create another DB each (also by cloning the same template DB) in order to test copying data between DBs. It's only these "extra" DBs that were corrupted, at least on this occasion. (Hard to say about the last time, because that time it all went south and the whole server crashed, and we may have had some residual corruption from bad disks then - who knows.) I'm not sure whether the tests that created the extra DBs existed before we upgraded to PG 15, but we definitely have not seen such problems on PG 13 or 14.

> It seems like you have some kind of high frequency testing workload that creates and tests databases all day long, and just occasionally detects this corruption.

Maybe 10-30 times per day normally, depending on the day. However, I have tried to repro this by running those two specific tests thousands of times in one day, without success.

> Would you like to try requesting FILE_COPY for a while and see if it eventually happens like that too?

Sure, we can try that.
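A minimal sketch of what requesting FILE_COPY could look like when a test harness clones a template database, assuming PostgreSQL 15's STRATEGY option for CREATE DATABASE. The database names and both helper functions are hypothetical and are not taken from the actual test suite; the sketch only builds the statement text and does not connect to a server.

```python
# Strategies accepted by CREATE DATABASE ... STRATEGY in PostgreSQL 15.
VALID_STRATEGIES = {"WAL_LOG", "FILE_COPY"}


def quote_ident(name: str) -> str:
    """Quote an SQL identifier: wrap in double quotes, double any embedded ones."""
    return '"' + name.replace('"', '""') + '"'


def clone_database_sql(new_db: str, template_db: str,
                       strategy: str = "FILE_COPY") -> str:
    """Build a CREATE DATABASE statement that clones template_db.

    Hypothetical helper: the real test suite's cloning code is not shown
    in this thread, so this is only an illustration of the syntax.
    """
    strategy = strategy.upper()
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    return (
        f"CREATE DATABASE {quote_ident(new_db)} "
        f"TEMPLATE {quote_ident(template_db)} "
        f"STRATEGY = {strategy}"
    )


if __name__ == "__main__":
    # Placeholder names, chosen only for illustration.
    print(clone_database_sql("test_copy_target", "test_template"))
```

Passing "WAL_LOG" instead yields the PG 15 default behaviour, so a harness could switch between the two strategies to compare runs.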
On 7/05/2023 12:30 pm, Thomas Munro wrote:
> your "zfs get all /path/to/pgdata"

PROPERTY              VALUE                 SOURCE
type                  filesystem            -
creation              Mon Mar 6 17:07 2023  -
used                  166G                  -
available             2.34T                 -
referenced            166G                  -
compressratio         2.40x                 -
mounted               yes                   -
quota                 none                  default
reservation           none                  default
recordsize            16K                   local
mountpoint            /default
sharenfs              off                   default
checksum              on                    default
compression           lz4                   received
atime                 off                   inherited from pgdata
devices               on                    default
exec                  off                   inherited from pgdata
setuid                off                   inherited from pgdata
readonly              off                   default
zoned                 off                   default
snapdir               hidden                default
aclinherit            restricted            default
createtxg             90                    -
canmount              on                    received
xattr                 on                    default
copies                1                     default
version               5                     -
utf8only              off                   -
normalization         none                  -
casesensitivity       sensitive             -
vscan                 off                   default
nbmand                off                   default
sharesmb              off                   default
refquota              none                  default
refreservation        none                  default
primarycache          all                   default
secondarycache        all                   default
usedbysnapshots       199M                  -
usedbydataset         166G                  -
usedbychildren        0B                    -
usedbyrefreservation  0B                    -
logbias               latency               default
dedup                 off                   default
mlslabel              none                  default
sync                  standard              default
dnodesize             legacy                default
refcompressratio      2.40x                 -
written               64.9M                 -
logicalused           397G                  -
logicalreferenced     397G                  -
volmode               default               default
filesystem_limit      none                  default
snapshot_limit        none                  default
filesystem_count      none                  default
snapshot_count        none                  default
snapdev               hidden                default
acltype               off                   default
context               none                  default
fscontext             none                  default
defcontext            none                  default
rootcontext           none                  default
relatime              off                   default
redundant_metadata    all                   default
overlay               off                   default

> your postgresql.conf?

We have a bunch of config files, so I tried to get the resulting config using "select name, setting from pg_settings where source = 'configuration file'" - hopefully that gives what you wanted.

            name            |                        setting
----------------------------+-------------------------------------------------------
 archive_command            | pgbackrest --stanza="behavior-pg15" archive-push "%p"
 archive_mode               | on
 archive_timeout            | 900
 cluster_name               | 15/behavior
 DateStyle                  | ISO, MDY
 default_text_search_config | pg_catalog.english
 dynamic_shared_memory_type | posix
 external_pid_file          | /var/run/postgresql/15-behavior.pid
 full_page_writes           | off
 lc_messages                | C
 lc_monetary                | C
 lc_numeric                 | C
 lc_time                    | C
 listen_addresses           | *
 log_checkpoints            | on
 log_connections            | on
 log_disconnections         | on
 log_file_mode              | 0640
 log_line_prefix            | %m [%p] %q%u@%d
 log_lock_waits             | on
 log_min_duration_statement | 1000
 log_temp_files             | 0
 log_timezone               | Etc/UTC
 maintenance_work_mem       | 1048576
 max_connections            | 100
 max_slot_wal_keep_size     | 30000
 max_wal_size               | 1024
 min_wal_size               | 80
 port                       | 5434
 shared_buffers             | 4194304
 ssl                        | on
 ssl_cert_file              | (redacted)
 ssl_ciphers                | TLSv1.2:TLSv1.3:!aNULL
 ssl_dh_params_file         | (redacted)
 ssl_key_file               | (redacted)
 ssl_min_protocol_version   | TLSv1.2
 temp_buffers               | 10240
 TimeZone                   | Etc/UTC
 unix_socket_directories    | /var/run/postgresql
 wal_compression            | pglz
 wal_init_zero              | off
 wal_level                  | replica
 wal_recycle                | off
 work_mem                   | 262144

> And your exact Ubuntu kernel version and ZFS package versions?

Ubuntu 18.04.6
Kernel 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
zfsutils-linux package version 0.7.5-1ubuntu16.12 amd64