Re: Linux/PostgreSQL scalability issue - problem with 8 cores - Mailing list pgsql-performance

From Jakub Ouhrabka
Subject Re: Linux/PostgreSQL scalability issue - problem with 8 cores
Msg-id flj7ba$q7o$1@news.hub.org
In response to Re: Linux/PostgreSQL scalability issue - problem with 8 cores  (Sven Geisler <sgeisler@aeccom.com>)
Responses Re: Linux/PostgreSQL scalability issue - problem with 8 cores
List pgsql-performance
Hi Sven,

 > I guess all backends do listen to the same notification.

Unfortunately no. The backends are listening to different notifications
in different databases. Usually there are only a few listens per database,
with one exception: one database has many (hundreds of) listens, but all
of them are for different notifications.

 > Can you change your implementation?
 > - split your problem - create multiple notifications if possible

Yes, it is like this.

 > - do an UNLISTEN if possible

Yes, we're issuing unlistens when appropriate.

 > - use another signalling technique

We're planning to reduce the number of databases/backends/listens, but
we'd still like to run the system on 8 cores, given that it runs without
any problems on 2 cores...

Thanks for the suggestions!

Kuba

Sven Geisler wrote:
> Hi Jakub,
>
> I do have a similar server (from DELL), which performs well with our
> PostgreSQL application. I guess the peak in context switches is the only
> thing you can see.
>
> Anyhow, I think it is your LISTEN/NOTIFY approach which causes that
> behaviour. I guess all backends do listen to the same notification.
> I don't know the exact implementation, but I can imagine that all
> backends access the same section in the shared memory, which causes
> the increase in context switches. More cores means more accesses at the
> same time.
>
> Can you change your implementation?
> - split your problem - create multiple notifications if possible
> - do an UNLISTEN if possible
> - use another signalling technique
>
> Regards
> Sven
>
>
> Jakub Ouhrabka wrote:
>> Hi all,
>>
>> we have a dedicated PostgreSQL Linux server with 8 cores (2xX5355). We
>> came across a strange issue: when running with all 8 cores enabled,
>> approximately once a minute (the period varies) the system is very busy
>> for a few seconds (~5-10s) and we don't know why. The issue doesn't show
>> up when we tell Linux to use only 2 cores; with 4 cores the problem is
>> present but milder than with 8 cores. This is all on the same machine,
>> with the same config and the same workload. We don't see any apparent
>> reason for these peaks. We'd like to investigate further but we don't
>> know what to try next. Any suggestions? Any tuning tips for
>> Linux+PostgreSQL on an 8-way system? Can this be connected with our heavy
>> use of listen/notify and hundreds of backends in listen mode?
>>
>> More details are below.
>>
>> Thanks,
>>
>> Kuba
>>
>> System: HP DL360 2x5355, 8G RAM, P600+MSA50 - internal 2x72GB RAID 10
>> for OS, 10x72G disks RAID 10 for PostgreSQL data and wal
>> OS: Linux 2.6 64bit (kernel 2.6.21, 22, 23 makes little difference)
>> PostgreSQL: 8.2.4 (64bit), shared buffers 1G
>>
>> Nothing other than PostgreSQL is running on the server. About 800
>> concurrent backends. The majority of backends are in LISTEN, doing
>> nothing. The client interface for most backends is ecpg+libpq.
>>
>> Problem description:
>>
>> The system is usually 80-95% idle. Approximately once a minute,
>> for about 5-10s, there is a peak in activity which looks like this:
>>
>> vmstat (and top or atop) reports 0% idle, 100% in user mode, very low
>> iowait, low IO activity, a higher number of context switches than usual
>> but not exceedingly high (2000-4000cs/s, usually 1500cs/s), and a few
>> hundred waiting processes per second (usually 0-1/s). From looking at top
>> and the running processes we can't see any obvious reason for the peak.
>> According to the PostgreSQL log, the long-running commands from these
>> moments are e.g. begin transaction lasting several seconds.
>>
>> When only 2 cores are enabled (kernel command line), everything
>> runs smoothly. 4 cores exhibit slightly better behavior than 8 cores
>> but worse than 2 cores - the peaks are still visible.
>>
>> We've tried kernel versions 2.6.21-23 (latest revisions as of the
>> beginning of December from kernel.org). The pattern changed slightly,
>> but it may also be that the workload changed slightly.
>>
>> pgbench or any other stress testing runs smoothly on the server.
>>
>> The only strange thing about our usage pattern I can think of is heavy
>> use of LISTEN/NOTIFY, especially hundreds of backends in listen mode.
>>
>> When restarting our connected clients, the peaks are not there from time
>> 0; they become visible after a while. It seems something gets
>> synchronized and then causes trouble.
>>
>> Since the server is dedicated to PostgreSQL and none of our client
>> applications run on it - and there is a difference between 2 and 8
>> cores enabled - we think that the peaks are not caused by our client
>> applications.
>>
>> How can we diagnose what is happening during the peaks?
>>
>
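
On the question of how to diagnose the peaks: one possible starting point is to log the same counters vmstat shows (idle and user percentage, context switches per second, runnable processes) with a timestamp, so that a peak can be lined up against the PostgreSQL log afterwards. The following is only a minimal sketch under assumptions, not something from the thread: it reads Linux's /proc/stat, and the function names (parse_proc_stat, summarize, watch) and the 5% idle threshold are made up for illustration.

```python
import time


def parse_proc_stat(text):
    """Pull aggregate CPU jiffies, context switches and run queue out of /proc/stat text."""
    sample = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "cpu":
            # Aggregate line: user nice system idle iowait irq softirq steal ...
            sample["cpu"] = [int(v) for v in fields[1:]]
        elif fields[0] in ("ctxt", "procs_running", "procs_blocked"):
            sample[fields[0]] = int(fields[1])
    return sample


def summarize(prev, cur, seconds):
    """Turn two samples into percentages and per-second rates for the interval."""
    # Only the first 8 columns are summed; later guest columns are already
    # counted inside user/nice on kernels that report them.
    deltas = [c - p for p, c in zip(prev["cpu"][:8], cur["cpu"][:8])]
    total = sum(deltas) or 1
    return {
        "user_pct": 100.0 * (deltas[0] + deltas[1]) / total,
        "idle_pct": 100.0 * deltas[3] / total,
        "iowait_pct": 100.0 * deltas[4] / total if len(deltas) > 4 else 0.0,
        "cs_per_s": (cur["ctxt"] - prev["ctxt"]) / seconds,
        "running": cur.get("procs_running", 0),
    }


def watch(idle_threshold=5.0, interval=1.0, samples=60, path="/proc/stat"):
    """Print a timestamped line for each interval where idle time drops to the threshold."""
    with open(path) as f:
        prev = parse_proc_stat(f.read())
    for _ in range(samples):
        time.sleep(interval)
        with open(path) as f:
            cur = parse_proc_stat(f.read())
        stats = summarize(prev, cur, interval)
        if stats["idle_pct"] <= idle_threshold:
            print(time.strftime("%H:%M:%S"), stats)
        prev = cur
```

Running watch(samples=600) on the database host would cover about ten minutes at one-second resolution. The timestamps of the printed lines can then be compared with the long-running statements in the PostgreSQL log. This only confirms when a peak happens and what it looks like at the OS level; it does not by itself say which backend activity causes it.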
