Hmmm... why does pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this? - Mailing list pgsql-performance

From Graeme B. Bell
Subject Hmmm... why does pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
Date
Msg-id A1B7E390-BE47-4753-8557-D244D0202C25@skogoglandskap.no
Whole thread Raw
Responses Re: Hmmm... why does pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?  (Marc Mamin <M.Mamin@intershop.de>)
Re: Hmmm... why does pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-performance
Hi everyone,

I've written a new open source tool for easily parallelising SQL scripts in postgres.   [obligatory plug:
https://github.com/gbb/par_psql  ] 

Using it, I'm seeing a problem that I've also seen in other postgres projects involving high degrees of parallelisation
inthe last 12 months. 

Basically:

- I have machines here with up to 16 CPU cores and 128GB memory, very fast SSDs and controller etc, carefully
configuredkernel/postgresql.conf for high performance. 

- Ordinary queries parallelise nearly perfectly (e.g. SELECT some_stuff ...), e.g. almost up to 16x performance
improvement.

- Non-DB stuff like GDAL, python etc. parallelise nearly perfectly.

- HOWEVER calls to CPU-intensive user-defined pl/pgsql functions (e.g. SELECT myfunction(some_stuff)) do not
parallelisewell, even when they are independently defined functions, or accessing tables in a read-only way. They hit a
limitof 2.5x performance improvement relative to single-CPU performance (pg9.4) and merely 2x performance (pg9.3)
regardlessof how many CPU cores I throw at them. This is about 6 times slower than I'm expecting.  


I can't see what would be locking. It seems like it's the pl/pgsql environment itself that is somehow locking or
incurringsome huge frictional costs. Whether I use independently defined functions, independent source tables,
independentoutput tables, makes no difference whatsoever, so it doesn't feel 'lock-related'. It also doesn't seem to be
WAL/synchronisationrelated, as the machines I'm using can hit absurdly high pgbench rates, and I'm using unlogged
tablesfor output.  

Take a quick peek here: https://github.com/gbb/par_psql/blob/master/BENCHMARKS.md

I'm wondering what I'm missing here. Any ideas?

Graeme.

pgsql-performance by date:

Previous
From: Scott Marlowe
Date:
Subject: Re: Sudden connection and load average spikes with postgresql 9.3
Next
From: Marc Mamin
Date:
Subject: Re: Hmmm... why does pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?