Re: [HACKERS] MIT benchmarks pgsql multicore (up to 48)performance - Mailing list pgsql-performance

From Jon Nelson
Subject Re: [HACKERS] MIT benchmarks pgsql multicore (up to 48)performance
Date
Msg-id AANLkTimpueKcj3VXR30Ecse5sc55qk8b_vsc_4xgxuG-@mail.gmail.com
In response to Re: [HACKERS] MIT benchmarks pgsql multicore (up to 48)performance  (Ivan Voras <ivoras@freebsd.org>)
List pgsql-performance
On Wed, Oct 6, 2010 at 5:31 PM, Ivan Voras <ivoras@freebsd.org> wrote:
> On 10/04/10 20:49, Josh Berkus wrote:
>
>>> The other major bottleneck they ran into was a kernel one: reading from
>>> the heap file requires a couple lseek operations, and Linux acquires a
>>> mutex on the inode to do that. The proper place to fix this is
>>> certainly in the kernel but it may be possible to work around in
>>> Postgres.
>>
>> Or we could complain to Kernel.org.  They've been fairly responsive in
>> the past.  Too bad this didn't get posted earlier; I just got back from
>> LinuxCon.
>>
>> So you know someone who can speak technically to this issue? I can put
>> them in touch with the Linux geeks in charge of that part of the kernel
>> code.
>
> Hmmm... lseek? As in "lseek() then read() or write()" idiom? It AFAIK
> cannot be fixed since you're modifying the global "stream position"
> variable and something has got to lock that.
>
> OTOH, pread() / pwrite() don't have to do that.

While lseek is very "cheap", it is like any other system call: when
you multiply "cheap" by "a jillion" you end up with "notable" or even
"lots". I've personally seen notable performance improvements from
switching to pread/pwrite instead of lseek+{read,write}. For
platforms that don't implement pread or pwrite, wrapper calls are
trivial to produce. Going from two system calls to one is, in this
case, 50% fewer.
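
To make the comparison concrete, here is a rough sketch of the two idioms
and of the kind of wrapper I mean. It is in Python only because the os
module exposes the same system calls directly; the function names are
made up for illustration and none of this is taken from the Postgres
source.

```python
import os
import tempfile


def read_at_with_lseek(fd, offset, length):
    """Two system calls: move the shared file offset, then read from it."""
    os.lseek(fd, offset, os.SEEK_SET)
    return os.read(fd, length)


def read_at_with_pread(fd, offset, length):
    """One system call: read at an explicit offset.

    The file offset of the descriptor is left where it was.
    """
    return os.pread(fd, length, offset)


def pread_fallback(fd, length, offset):
    """Hypothetical wrapper for a platform without pread.

    Saves the current offset, seeks, reads, and restores the offset.
    Unlike a real pread this is not atomic, so concurrent users of the
    same descriptor would need their own locking.
    """
    saved = os.lseek(fd, 0, os.SEEK_CUR)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        return os.read(fd, length)
    finally:
        os.lseek(fd, saved, os.SEEK_SET)


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        f.write(b"0123456789abcdef")
        f.flush()
        fd = f.fileno()
        # Both idioms return the same bytes; only the syscall count differs.
        print(read_at_with_lseek(fd, 10, 4))
        print(read_at_with_pread(fd, 10, 4))
```

Note that the fallback costs three or four system calls per read, so it
only buys portability; the saving comes from the native pread path.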


--
Jon
