Re: [HACKERS] Deadlock in XLogInsert at AIX - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: [HACKERS] Deadlock in XLogInsert at AIX
Date
Msg-id 58924463.8020505@postgrespro.ru
Whole thread Raw
In response to Re: [HACKERS] Deadlock in XLogInsert at AIX  ("REIX, Tony" <tony.reix@atos.net>)
Responses Re: [HACKERS] Deadlock in XLogInsert at AIX
Re: [HACKERS] Deadlock in XLogInsert at AIX
List pgsql-hackers
On 02/01/2017 08:30 PM, REIX, Tony wrote:

Hi Konstantin,

Please run: /opt/IBM/xlc/13.1.3/bin/xlc -qversion  so that I know your exact XLC v13 version.

IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)

I'm building on Power7 and not giving any architecture flag to XLC.

I'm not using -qalign=natural . Thus, by default, XLC use -qalign=power, which is close to natural, as explained at:
         https://www.ibm.com/support/knowledgecenter/SSGH2K_13.1.0/com.ibm.xlc131.aix.doc/compiler_ref/opt_align.html
Why are you using this flag ?


Because otherwise double type is aligned on 4 bytes.

Thanks for info about pgbench. PostgreSQL web-site contains a lot of old information...

If you could share scripts or instructions about the tests you are doing with pgbench, I would reproduce here.


You do not need any script.
Just two simple commands.
One to initialize database:

pgbench -i -s 1000

And another to run benchmark itself:

pgbench -c 100 -j 20 -P 1 -T 1000000000


I have no "real" application. My job consists in porting OpenSource packages on AIX. Many packages. Erlang, Go, these days. I just want to make PostgreSQL RPMs as good as possible... within the limited amount of time I can give to this package, before moving to another one.

About the zombie issue, I've discussed with my colleagues. Looks like the process keeps zombie till the father looks at its status. However, though I did that several times, I  do not remember well the details. And that should be not specific to AIX. I'll discuss with another colleague, tomorrow, who should understand this better than me.


1. Process is not in zomby state (according to ps). It is in <exiting> state... It is something AIX specific, I have not see processes in this state at Linux.
2. I have implemented simple test - forkbomb. It creates 1000 children and then wait for them. It is about ten times slower than at Intel/Linux, but still much faster than 100 seconds. So there is some difference between postgress backend and dummy process doing nothing - just immediately terminating after return from fork()

Patch for Large Files: When building PostgreSQL, I found required to use the following patch so that PostgreSQL works with large files. I do not remember the details. Do you agree with such a patch ? 1rst version (new-...) shows the exact places where   define _LARGE_FILES 1  is required.  2nd version (new2-...) is simpler.

I'm now experimenting with your patch for dead lock. However, that should be invisible with the  "check-world" tests I guess.

Regards,

Tony

Le 01/02/2017 à 16:59, Konstantin Knizhnik a écrit :
Hi Tony,

On 01.02.2017 18:42, REIX, Tony wrote:

Hi Konstantin

XLC.

I'm on AIX 7.1 for now.

I'm using this version of XLC v13:

# xlc -qversion
IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07)
Version: 13.01.0003.0003

With this version, I have (at least, since I tested with "check" and not "check-world" at that time) 2 failing tests: create_aggregate , aggregates .


With the following XLC v12 version, I have NO test failure:

# /usr/vac/bin/xlc -qversion
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0016


So maybe you are not using XLC v13.1.3.3, rather another sub-version. Unless you are using more options for the configure ?


Configure.

What are the options that you give to the configure ?


export CC="/opt/IBM/xlc/13.1.3/bin/xlc"
export CFLAGS="-qarch=pwr8 -qtune=pwr8 -O2 -qalign=natural -q64 "
export LDFLAGS="-Wl,-bbigtoc,-b64"
export AR="/usr/bin/ar -X64"
export LD="/usr/bin/ld -b64 "
export NM="/usr/bin/nm -X64"
./configure --prefix="/opt/postgresql/xlc-debug/9.6"


Hard load & 64 cores ? OK. That clearly explains why I do not see this issue.


pgbench ? I wanted to run it. However, I'm still looking where to get it plus a guide for using it for testing.


pgbench is part of Postgres distributive (src/bin/pgbench)


I would add such tests when building my PostgreSQL RPMs on AIX. So any help is welcome !


Performance.

- Also, I'd like to compare PostgreSQL performance on AIX vs Linux/PPC64. Any idea how I should proceed ? Any PostgreSQL performance benchmark that I could find and use ? pgbench ?

pgbench is most widely used tool simulating OLTP workload. Certainly it is quite primitive and its results are rather artificial. TPC-C seems to be better choice.
But the best case is to implement your own benchmark simulating actual workload of your real application.

- I'm interested in any information for improving the performance & quality of my PostgreSQM RPMs on AIX. (As I already said, BullFreeware RPMs for AIX are free and can be used by anyone, like Perzl RPMs are. My company (ATOS/Bull) sells IBM Power machines under the Escala brand since ages (25 years this year)).


How to help ?

How could I help for improving the quality and performance of PostgreSQL on AIX ?


We still have one open issue at AIX: see https://www.mail-archive.com/pgsql-hackers@postgresql.org/msg303094.html
It will be great if you can somehow help to fix this problem.



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgsql-hackers by date:

Previous
From: Konstantin Knizhnik
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Next
From: Konstantin Knizhnik
Date:
Subject: Re: [HACKERS] Deadlock in XLogInsert at AIX