Re: Server crash on RHEL 9/s390x platform against PG16 - Mailing list pgsql-hackers

From Suraj Kharage
Subject Re: Server crash on RHEL 9/s390x platform against PG16
Date
Msg-id CAF1DzPU_QUXO4S_jAcRJs+1O1GzNVKDe6KWHiv2Bz7HSHiz-vA@mail.gmail.com
In response to Re: Server crash on RHEL 9/s390x platform against PG16  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers


On Sat, Oct 21, 2023 at 5:17 AM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2023-09-12 15:27:21 +0530, Suraj Kharage wrote:
> [edb@9428da9d2137 postgres]$ cat /etc/redhat-release
> AlmaLinux release 9.2 (Turquoise Kodkod)
> [edb@9428da9d2137 postgres]$ lscpu
> Architecture:         s390x
>   CPU op-mode(s):     32-bit, 64-bit
>   Address sizes:      39 bits

Can you provide the rest of the lscpu output?  There have been issues with Z14
vs Z15:
https://github.com/llvm/llvm-project/issues/53009

You're apparently not hitting that, but given that fact, you either are on a
slightly older CPU, or you have applied a patch to work around it. Because
otherwise your build instructions below would hit that problem, I think.


> physical, 48 bits virtual
>   Byte Order:         Big Endian
> *Configure command:*
> ./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd --with-llvm
> --with-perl --with-python --with-tcl --with-openssl --enable-nls
> --with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu
> --enable-debug --enable-cassert --with-pgport=5414

Hm, based on "--with-libcurl" this isn't upstream postgres, correct? Have you
verified the issue reproduces on upstream postgres?

Yes, I can reproduce this on both upstream postgres master and the v16 branch.

Here are details:

./configure --prefix=/home/edb/postgres/ --with-zstd --with-llvm --with-perl --with-python --with-tcl --with-openssl --enable-nls --with-libxml --with-libxslt --with-systemd --without-icu --enable-debug --enable-cassert --with-pgport=5414 CFLAGS="-g -O0"



[edb@9428da9d2137 postgres]$ cat /etc/redhat-release
AlmaLinux release 9.2 (Turquoise Kodkod)


[edb@9428da9d2137 edbas]$ lscpu
Architecture:           s390x
  CPU op-mode(s):       32-bit, 64-bit
  Address sizes:        39 bits physical, 48 bits virtual
  Byte Order:           Big Endian
CPU(s):                 9
  On-line CPU(s) list:  0-8
Vendor ID:              GenuineIntel
  Model name:           Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
    CPU family:         6
    Model:              158
    Thread(s) per core: 1
    Core(s) per socket: 1
    Socket(s):          9
    Stepping:           10
    BogoMIPS:           5200.00
    Flags:              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 avx2 bmi2 erms xsaveopt arat
Caches (sum of all):
  L1d:                  288 KiB (9 instances)
  L1i:                  288 KiB (9 instances)
  L2:                   2.3 MiB (9 instances)
  L3:                   108 MiB (9 instances)
Vulnerabilities:
  Itlb multihit:        KVM: Mitigation: VMX unsupported
  L1tf:                 Mitigation; PTE Inversion
  Mds:                  Vulnerable; SMT Host state unknown
  Meltdown:             Vulnerable
  Mmio stale data:      Vulnerable
  Spec store bypass:    Vulnerable
  Spectre v1:           Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
  Spectre v2:           Vulnerable, STIBP: disabled
  Srbds:                Unknown: Dependent on hypervisor status
  Tsx async abort:      Not affected


[edb@9428da9d2137 postgres]$ clang --version
clang version 15.0.7 (Red Hat 15.0.7-2.el9)
Target: s390x-ibm-linux-gnu
Thread model: posix
InstalledDir: /usr/bin


[edb@9428da9d2137 postgres]$ rpm -qa | grep llvm
llvm-libs-15.0.7-1.el9.s390x
llvm-15.0.7-1.el9.s390x
llvm-test-15.0.7-1.el9.s390x
llvm-static-15.0.7-1.el9.s390x
llvm-devel-15.0.7-1.el9.s390x

 
Please let me know if any further information is required.
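
For anyone who wants to gather the same details on their own machine in one go, here is a minimal sketch that runs the commands shown above and labels each section. The function names run_section and collect_env are hypothetical helpers made up for this example, and the script assumes an RPM-based system like the one above; a command that is not installed is reported rather than treated as an error.

```shell
#!/bin/sh
# Sketch only: collects the environment details discussed in this thread.
# run_section and collect_env are hypothetical helper names, not existing tools.

run_section() {
    # $1 is a section label; the remaining arguments are the command to run.
    label=$1
    shift
    printf '== %s ==\n' "$label"
    if command -v "$1" >/dev/null 2>&1; then
        # Capture stderr too, and never abort the report on a failing command.
        "$@" 2>&1 || printf '(no output or command failed)\n'
    else
        printf '(%s not found)\n' "$1"
    fi
    return 0
}

collect_env() {
    run_section "OS release" cat /etc/redhat-release
    run_section "CPU" lscpu
    run_section "clang" clang --version
    # Assumes an RPM-based distribution, as in the report above.
    run_section "LLVM packages" sh -c 'rpm -qa | grep llvm'
}
```

Calling collect_env > env-report.txt would write everything to a single file that can be attached to a reply.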


>
> *Test case:*
> CREATE TABLE rm32044_t1
> (
>     pkey   integer,
>     val  text
> );
> CREATE TABLE rm32044_t2
> (
>     pkey   integer,
>     label  text,
>     hidden boolean
> );
> CREATE TABLE rm32044_t3
> (
>         pkey integer,
>         val integer
> );
> CREATE TABLE rm32044_t4
> (
>         pkey integer
> );
> insert into rm32044_t1 values ( 1 , 'row1');
> insert into rm32044_t1 values ( 2 , 'row2');
> insert into rm32044_t2 values ( 1 , 'hidden', true);
> insert into rm32044_t2 values ( 2 , 'visible', false);
> insert into rm32044_t3 values (1 , 1);
> insert into rm32044_t3 values (2 , 1);
>
> postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey
> = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
> rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;

> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> The connection to the server was lost. Attempting reset: Failed.

I tried this on both master and 16, without hitting this issue.

If you can reproduce the issue on upstream postgres, can you share more about
your configuration?

Greetings,

Andres Freund


--

Thanks & Regards,
Suraj Kharage
